Version 2 (modified by anonymous, 19 years ago)
Extractors
This API is still under discussion, which is why I shipped the older TextExtractor implementations to DFKI.
The purpose of an Extractor is to extract all information (full text and other) from an InputStream of a specific document. Extractors are therefore MIME-type-specific.
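As a starting point for the discussion, the idea could be sketched in Java roughly as follows. Since the API is not final, every name here (ExtractionResult, the extract method, PlainTextExtractor, ExtractorRegistry, the "characterCount" property) is a hypothetical placeholder and not the actual interface. The sketch also assumes Java 9 or later (for InputStream.readAllBytes) and UTF-8 encoded plain text.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical result holder: the full text plus any other extracted properties.
final class ExtractionResult {
    final String fullText;
    final Map<String, String> properties;

    ExtractionResult(String fullText, Map<String, String> properties) {
        this.fullText = fullText;
        this.properties = properties;
    }
}

// Hypothetical interface; the real API is still under discussion.
interface Extractor {
    // Reads the whole document from the stream and returns what was extracted.
    ExtractionResult extract(InputStream stream) throws IOException;
}

// Illustrative extractor for text/plain documents, assuming UTF-8 content.
final class PlainTextExtractor implements Extractor {
    @Override
    public ExtractionResult extract(InputStream stream) throws IOException {
        String text = new String(stream.readAllBytes(), StandardCharsets.UTF_8);
        Map<String, String> props = new HashMap<>();
        // "characterCount" is a made-up example of non-text information.
        props.put("characterCount", String.valueOf(text.length()));
        return new ExtractionResult(text, props);
    }
}

// Hypothetical registry that selects an extractor by MIME type.
final class ExtractorRegistry {
    private final Map<String, Extractor> byMimeType = new HashMap<>();

    void register(String mimeType, Extractor extractor) {
        byMimeType.put(mimeType, extractor);
    }

    // Returns null when no extractor is registered for the MIME type.
    Extractor get(String mimeType) {
        return byMimeType.get(mimeType);
    }
}
```

A caller would register one extractor per MIME type, look up the one matching a document's type, and pass it the document's InputStream. Keeping the lookup in a separate registry is only one possible design; a factory per MIME type would serve the same purpose.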
Todo: describe and discuss final API