Version 46 (modified by sauermann, 19 years ago)
Aperture: Semantic Data Access by Aduna & DFKI
Goals:
- To extract individual data objects (e.g. documents, emails, ...) from various data sources
- To extract all possible information from the binary content of these data objects (e.g. full text, titles, authors, ...)
- To deliver a storage back-end in which this information can be stored and made queryable
- To deliver an architecture that can easily be extended by others, e.g. with new document formats, data source types, ...
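The four goals above can be read as a small pipeline: a data source yields data objects, extractors turn their binary content into metadata, and a store keeps the result queryable. The following is a minimal sketch of that reading, not Aperture's actual API. Every type and method name in it (`DataObjectSketch`, `DataSourceSketch`, `ExtractorSketch`, `MetadataStoreSketch`, and so on) is hypothetical and chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// All names below are hypothetical; they are NOT part of the Aperture API.

/** A single item found in a data source, e.g. a document or an email. */
final class DataObjectSketch {
    final String id;
    final byte[] content;

    DataObjectSketch(String id, byte[] content) {
        this.id = id;
        this.content = content;
    }
}

/** Goal 1: a data source yields individual data objects. */
interface DataSourceSketch {
    List<DataObjectSketch> listObjects();
}

/** Goal 2: an extractor derives metadata from an object's binary content. */
interface ExtractorSketch {
    boolean supports(DataObjectSketch object);

    Map<String, String> extract(DataObjectSketch object);
}

/** Goal 3: a store in which extracted metadata is kept and can be queried. */
final class MetadataStoreSketch {
    private final Map<String, Map<String, String>> records = new LinkedHashMap<>();

    void put(String id, Map<String, String> metadata) {
        records.put(id, metadata);
    }

    /** Returns the ids of all objects whose metadata has the given key/value pair. */
    List<String> findByValue(String key, String value) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : records.entrySet()) {
            if (value.equals(entry.getValue().get(key))) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }
}

/** Example extractor: treats ".txt" objects as UTF-8 text, first line as title. */
final class PlainTextExtractorSketch implements ExtractorSketch {
    public boolean supports(DataObjectSketch object) {
        return object.id.endsWith(".txt");
    }

    public Map<String, String> extract(DataObjectSketch object) {
        String text = new String(object.content, StandardCharsets.UTF_8);
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("title", text.split("\n", 2)[0]);
        metadata.put("fullText", text);
        return metadata;
    }
}

final class ApertureGoalsSketch {
    /** Goal 4: new formats are supported by adding extractors to the list. */
    static MetadataStoreSketch run(DataSourceSketch source, List<ExtractorSketch> extractors) {
        MetadataStoreSketch store = new MetadataStoreSketch();
        for (DataObjectSketch object : source.listObjects()) {
            for (ExtractorSketch extractor : extractors) {
                if (extractor.supports(object)) {
                    store.put(object.id, extractor.extract(object));
                    break; // the first matching extractor wins
                }
            }
        }
        return store;
    }

    public static void main(String[] args) {
        DataSourceSketch source = () -> Arrays.asList(
                new DataObjectSketch("notes.txt", "Hello\nworld".getBytes(StandardCharsets.UTF_8)));
        List<ExtractorSketch> extractors = Arrays.asList(new PlainTextExtractorSketch());
        System.out.println(run(source, extractors).findByValue("title", "Hello"));
    }
}
```

In this sketch, extensibility comes from the two interfaces: a new data source type or document format would be another implementation passed to `run`, without changes to the pipeline itself.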
The software distribution package will contain everything needed to get started with a full-text and metadata extraction framework, including all relevant information about semantic data extraction. Our intent is that developers can download a single distribution file containing a fully working environment, including all available adapter and extractor implementations.
General
Architecture
- DataSource Architecture
- Extractors
- Archives
- Email Interpretation
- Opening Documents
- The Use of RDF
- The Use of OSGi
API Development
- ApertureDataSource
- ApertureDataObject
- ApertureDataCrawler
- ApertureDataCrawlerListener
- ApertureDataAccessor
- ApertureAccessData
- ApertureDataOpener - suggested!
- ApertureScanReport
- ApertureArchiveExtractor - suggested!
- ApertureExtractor - suggested!
- ApertureRDFMap
- ApertureSimpleDataCrawler - The other extreme: a simple data crawler that leaves the detection of changes to the outside.
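To illustrate what "leaving the detection of changes to the outside" could look like, here is a minimal sketch. It assumes a crawler that keeps no state between runs and simply reports every object it encounters, while the caller remembers what it has seen and classifies each object itself. The names `FoundListenerSketch`, `SimpleCrawlerSketch` and `ExternalChangeTrackerSketch` are hypothetical and do not reflect the interfaces described on the pages listed above.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// All names below are hypothetical; they are NOT part of the Aperture API.

/** Callback that receives every object the crawler encounters. */
interface FoundListenerSketch {
    void objectFound(String id, long lastModified);
}

/** A stateless crawler: it remembers nothing about previous crawls. */
final class SimpleCrawlerSketch {
    // Maps object id to last-modified timestamp; stands in for a real data source.
    private final Map<String, Long> source;

    SimpleCrawlerSketch(Map<String, Long> source) {
        this.source = source;
    }

    /** Reports every object, whether or not it changed since the last crawl. */
    void crawl(FoundListenerSketch listener) {
        for (Map.Entry<String, Long> entry : source.entrySet()) {
            listener.objectFound(entry.getKey(), entry.getValue());
        }
    }
}

/** Change detection lives here, outside the crawler. */
final class ExternalChangeTrackerSketch implements FoundListenerSketch {
    private final Map<String, Long> seen = new HashMap<>();

    /** Status of each object as of the most recent report for it. */
    final Map<String, String> lastReport = new LinkedHashMap<>();

    public void objectFound(String id, long lastModified) {
        Long previous = seen.put(id, lastModified);
        String status;
        if (previous == null) {
            status = "new";
        } else if (previous.longValue() == lastModified) {
            status = "unchanged";
        } else {
            status = "changed";
        }
        lastReport.put(id, status);
    }
}

final class SimpleCrawlerDemo {
    public static void main(String[] args) {
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("a.txt", 1L);
        files.put("b.txt", 1L);

        ExternalChangeTrackerSketch tracker = new ExternalChangeTrackerSketch();
        new SimpleCrawlerSketch(files).crawl(tracker); // first crawl
        files.put("b.txt", 2L);                        // b.txt is modified
        new SimpleCrawlerSketch(files).crawl(tracker); // second crawl
        System.out.println(tracker.lastReport);
    }
}
```

The trade-off in this sketch is that the crawler stays trivial, while every caller has to carry its own bookkeeping. Detecting deleted objects is not covered here; it would require the caller to compare the set of ids seen in consecutive crawls.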
Attachments (2)
- aperture_overview.ppt (35.0 KB) - added by sauermann 19 years ago. Rough-cut overview of the framework
- API changes (20051114).txt (9.5 KB) - added by chris 19 years ago.