Changes between Version 9 and Version 10 of ApertureSimpleDataCrawler


Timestamp:
10/17/05 14:31:21 (19 years ago)
Author:
anonymous
Comment:

--

  • ApertureSimpleDataCrawler

    v9 v10  
    73 73 Here's a new idea that in my opinion merges this idea with our own architecture. Create a super interface of !DataObject (Resource? - has a strong RDF association. Entity? - has other associations here at Aduna). !DataObject then gets a sibling named Folder. Crawlers do not only produce !DataObject instances, they produce instances of its supertype. This way, crawlers that crawl data sources with an intrinsic hierarchy can return Folder instances, which contain all metadata of the Folder, similar to how !DataObjects contain metadata of that object. Similarly, we can introduce other !DataObject siblings for capturing table- or graph-related metadata that is not specific to a single !DataObject. Crawler-using applications that have no interest in this information can simply ignore these events. Also, the crawler interface itself does not need to specify folder-/graph-/table-specific information. 
    74 74 
       75 In our use case this also facilitates metadata indexing, because currently our !MetadataFetcher (the class transforming the information inside a !DataObject to RDF statements) interprets the document URIs and "reinvents" the folder hierarchy, modeling it as Resources with a partOf relation. This would then no longer be necessary: the Folder instance would already contain all necessary information. 
       76 
    75 77 Leo: ok, the idea of having a kind of "Sub-Class" sounds good. I would make ApertureDataObject the parent of ApertureDataFolderObject, so that ApertureDataFolderObject has all properties of DataObject and more. 
    76 78 
    77    In our use case this also facilitates metadata indexing, because currently our !MetadataFetcher (the class transforming the information inside a !DataObject to RDF statements) interprets the document URIs and "reinvents" the folder hierarchy, modeling it as Resources with a partOf relation. This would then no longer be necessary: the Folder instance would already contain all necessary information. 
       79 Chris: I don't think so, as a DataObject also has properties that a Folder does not have. For example, a Folder has no InputStream, no byte size, etc. I think DataSources and Folders are really something different. They surely do share characteristics (they both have a URI and metadata) but this should be expressed by their super type. One should not inherit from the other.
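
The sibling design argued for above could look roughly like the following Java sketch. It is only an illustration under assumptions, not Aperture's actual API: the names CrawledEntity, CrawlHandler, BaseEntity, SimpleFolder, SimpleDataObject and DataObjectOnlyHandler are hypothetical, the DataObject interface here is a simplified stand-in for the real one, and a plain string map stands in for RDF metadata. The super type carries only the URI and metadata, DataObject adds the content stream and byte size, and Folder is a sibling rather than a subclass.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical super type: holds only what Folders and DataObjects share.
interface CrawledEntity {
    String getUri();
    Map<String, String> getMetadata(); // assumption: stand-in for RDF metadata
}

// Simplified stand-in for DataObject: adds content properties a Folder lacks.
interface DataObject extends CrawledEntity {
    InputStream getContent();
    long getByteSize();
}

// A Folder is a sibling of DataObject; it adds nothing in this sketch.
interface Folder extends CrawledEntity {
}

// Hypothetical crawler callback: crawlers report instances of the super type.
interface CrawlHandler {
    void entityFound(CrawledEntity entity);
}

// Shared implementation of the URI and metadata accessors.
abstract class BaseEntity implements CrawledEntity {
    private final String uri;
    private final Map<String, String> metadata = new LinkedHashMap<>();

    BaseEntity(String uri) { this.uri = uri; }

    @Override public String getUri() { return uri; }
    @Override public Map<String, String> getMetadata() { return metadata; }
}

final class SimpleFolder extends BaseEntity implements Folder {
    SimpleFolder(String uri) { super(uri); }
}

final class SimpleDataObject extends BaseEntity implements DataObject {
    private final byte[] content;

    SimpleDataObject(String uri, byte[] content) {
        super(uri);
        this.content = content.clone();
    }

    @Override public InputStream getContent() { return new ByteArrayInputStream(content); }
    @Override public long getByteSize() { return content.length; }
}

// An application with no interest in folders simply ignores those events.
final class DataObjectOnlyHandler implements CrawlHandler {
    final List<String> seenUris = new ArrayList<>();

    @Override
    public void entityFound(CrawledEntity entity) {
        if (entity instanceof DataObject) {
            seenUris.add(entity.getUri());
        }
    }
}

final class HierarchySketch {
    public static void main(String[] args) {
        SimpleFolder folder = new SimpleFolder("file:///docs/");
        SimpleDataObject file =
                new SimpleDataObject("file:///docs/a.txt", new byte[] {1, 2, 3});
        // The crawler records the hierarchy, so no later step has to
        // reconstruct it from the document URIs.
        file.getMetadata().put("partOf", folder.getUri());

        DataObjectOnlyHandler handler = new DataObjectOnlyHandler();
        handler.entityFound(folder); // ignored by this handler
        handler.entityFound(file);   // recorded
        System.out.println(handler.seenUris);
    }
}
```

Because neither type inherits from the other, a Folder can never be asked for an InputStream or a byte size, and a handler that only cares about documents can skip folder events with a single type check.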