== Aperture: Semantic Data Access by Aduna & DFKI ==

Goals:
 * To extract individual data objects (e.g. documents, emails, ...) from various data sources
 * To extract all possible information from the binary content of these data objects (e.g. full text, titles, authors, ...)
 * To deliver a storage back-end in which this information can be stored and made queryable
 * To deliver an architecture that can easily be extended by others, e.g. with new document formats, data source types, ...

The software distribution package will contain everything needed to get started with a full-text and metadata extraction framework, including all relevant information about semantic data extraction. Our intent is that developers can download a single distribution file containing a fully working environment that also includes all available adapter and extractor implementations.

== General ==
 * [wiki:ApertureOverview Project Overview]
 * [wiki:ApertureLicense License]
 * [wiki:ApertureCredits Credits]

== Architecture ==
 * [wiki:ApertureArchitecture DataSource Architecture]
 * [wiki:ApertureExtractor Extractors]
 * [wiki:ApertureArchives Archives]
 * [wiki:ApertureEmailInterpretation Email Interpretation]
 * [wiki:ApertureOpeningDocuments Opening Documents]
 * [wiki:ApertureRDF The Use of RDF]
 * [wiki:ApertureOSGi The Use of OSGi]

== API Development ==
 * [wiki:ApertureDataSource DataSource]
 * [wiki:ApertureDataObject DataObject]
 * [wiki:ApertureDataCrawler DataCrawler]
 * [wiki:ApertureDataAccessor DataAccessor]
 * [wiki:ApertureDataOpener DataOpener] - suggested!
 * [wiki:ApertureScanReport ScanReport]
 * [wiki:ArchiveExtractor ArchiveExtractor] - suggested!