Changes between Version 6 and Version 7 of ApertureDataAccessor


Timestamp:
10/18/05 20:52:54 (19 years ago)
Author:
sauermann

  • ApertureDataAccessor

    v6 v7  
    2727 * The !CrawlData interface allows us in the future to get rid of the !CrawlDataBase implementation class, which has its own storage format, and create an adapter that works on top of the Sesame Repository that also contains all extracted metadata. This way all known metadata of a resource is stored in a single place, ensuring consistency, lower resource consumption, improved caching behaviour, etc. 
    2828 
     29 * Leo: to simplify and separate '''get (=get it now!)''' and '''getCrawl (=check if changed, get if changed)''', I suggest defining two methods: one for fetching a resource unconditionally and one for the crawling scenario. The getCrawl method would be the existing one; the get method would be a simpler one. 
     30 
    2931 
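A minimal sketch of how the two-method split suggested above could look. Everything here is an assumption made for illustration: `InMemoryCrawlData`, `SimpleDataObject` and `MapBackedAccessor` are hypothetical stand-ins, not the Aperture types, and the "resources" are just map entries holding a last-modified stamp.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the get / getCrawl split; none of these types are the real Aperture API. */
final class TwoMethodAccessorSketch {

    /** Stand-in for CrawlData: remembers the last-modified stamp seen per URL. */
    static final class InMemoryCrawlData {
        private final Map<String, Long> lastSeen = new HashMap<>();
        Long getLastModified(String url) { return lastSeen.get(url); }
        void setLastModified(String url, long stamp) { lastSeen.put(url, stamp); }
    }

    /** Stand-in for DataObject: a passive container of information. */
    static final class SimpleDataObject {
        final String url;
        final long lastModified;
        SimpleDataObject(String url, long lastModified) {
            this.url = url;
            this.lastModified = lastModified;
        }
    }

    /** Stand-in accessor whose "physical resources" are map entries (url -> last-modified stamp). */
    static final class MapBackedAccessor {
        private final Map<String, Long> resources;
        MapBackedAccessor(Map<String, Long> resources) { this.resources = resources; }

        /** get: fetch the resource now, regardless of any earlier access. */
        SimpleDataObject get(String url) {
            Long stamp = resources.get(url);
            if (stamp == null) {
                throw new IllegalArgumentException("unknown url: " + url);
            }
            return new SimpleDataObject(url, stamp);
        }

        /** getCrawl: return null if unchanged since the last crawl, else fetch and record the new stamp. */
        SimpleDataObject getCrawl(String url, InMemoryCrawlData crawlData) {
            SimpleDataObject current = get(url);
            Long previous = crawlData.getLastModified(url);
            if (previous != null && previous.longValue() == current.lastModified) {
                return null;
            }
            crawlData.setLastModified(url, current.lastModified);
            return current;
        }
    }

    public static void main(String[] args) {
        Map<String, Long> resources = new HashMap<>();
        String url = "file:/tmp/a.txt"; // hypothetical URL, only used as a map key
        resources.put(url, 100L);
        MapBackedAccessor accessor = new MapBackedAccessor(resources);
        InMemoryCrawlData crawlData = new InMemoryCrawlData();

        System.out.println(accessor.getCrawl(url, crawlData) != null); // first crawl: true
        System.out.println(accessor.getCrawl(url, crawlData) != null); // unchanged: false
        resources.put(url, 200L);                                      // resource changes
        System.out.println(accessor.getCrawl(url, crawlData) != null); // changed: true
    }
}
```

In this sketch getCrawl is built on top of get, so the change check lives in one place and callers who just want the resource never have to pass crawl data.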
    3032== Java Interface == 
     
    3638  * A DataAccessor provides access to physical resources by creating DataObjects 
    3739  * representing the resource, based on a url and optionally data about a previous access 
    38   * and other parameters. 
     40  * and other parameters.  
     41  * The main task of a DataAccessor is to find the resource identified by the URL String 
     42  * and create a DataObject that represents the resource. When crawling, the DataAccessor 
     43  * additionally uses the passed CrawlData interface to check and update information about 
     44  * the last crawl.  
     45  * About the returned DataObject: in most cases, the DataObject is just a passive container 
     46  * of information that the DataAccessor has filled in. However, the DataAccessor may 
     47  * also return a dedicated DataObject implementation that determines some things 
     48  * dynamically; that is up to the DataAccessor to decide. 
    3949  */ 
    4050public interface DataAccessor { 
     
    5767         * @param uri         The uri used to address the resource. 
    5868         * @param dataSource  The source that will be registered as the source of the DataObject. 
    59          * @param accessData  Optional database containing information about previous accesses. 
     69         * @param crawlData   Optional database containing information about previous accesses. 
    6070         * @param params      Optional additional parameters needed to access the physical resource. 
     71         *                  Parameters may also be passed that determine how the metadata should be 
     72         *                  extracted or which level of detail of metadata is needed. 
     73         *                  Applications may pass params through the whole chain. 
    6174         * @return A DataObject for the specified URI, or null when an AccessData instance has been 
    6275         * specified and the binary resource has not been modified since the last access. 
     
    6578         */ 
    6679        public DataObject get(URI uri, DataSource source, 
    67             AccessData accessData, Map<?,?> params) throws UriNotFoundException, IOException; 
     80            CrawlData crawlData, Map<?,?> params) throws UriNotFoundException, IOException; 
    6881} 
    6982}}}
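To illustrate the contract described in the Javadoc (return null when crawl data is supplied and the resource has not been modified since the last access), here is a small file-based sketch. It is not the Aperture implementation: the nested `CrawlData`, `InMemoryCrawlData` and `DataObject` types are hypothetical stand-ins, the data source is reduced to a plain name, `FileNotFoundException` is used in place of `UriNotFoundException`, and `params` is accepted but ignored.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical file-based sketch of the get(uri, source, crawlData, params) contract. */
final class FileAccessorSketch {

    /** Stand-in for CrawlData: information about previous accesses. */
    interface CrawlData {
        Long getLastModified(URI uri);
        void putLastModified(URI uri, long millis);
    }

    /** Simple in-memory CrawlData; an adapter over a repository could replace it. */
    static final class InMemoryCrawlData implements CrawlData {
        private final Map<URI, Long> stamps = new HashMap<>();
        public Long getLastModified(URI uri) { return stamps.get(uri); }
        public void putLastModified(URI uri, long millis) { stamps.put(uri, millis); }
    }

    /** Stand-in for DataObject: a passive container filled in by the accessor. */
    static final class DataObject {
        final URI uri;
        final String sourceName;
        final long lastModified;
        DataObject(URI uri, String sourceName, long lastModified) {
            this.uri = uri;
            this.sourceName = sourceName;
            this.lastModified = lastModified;
        }
    }

    /**
     * Returns a DataObject for a file: URI, or null when crawlData is given and the
     * file's modification time equals the one recorded at the previous access.
     * The params argument is ignored in this sketch.
     */
    static DataObject get(URI uri, String sourceName, CrawlData crawlData, Map<?, ?> params)
            throws IOException {
        Path path = Paths.get(uri);
        if (!Files.exists(path)) {
            throw new FileNotFoundException(uri.toString());
        }
        long modified = Files.getLastModifiedTime(path).toMillis();
        if (crawlData != null) {
            Long previous = crawlData.getLastModified(uri);
            if (previous != null && previous.longValue() == modified) {
                return null; // unchanged since the last crawl
            }
            crawlData.putLastModified(uri, modified);
        }
        return new DataObject(uri, sourceName, modified);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("accessor-sketch", ".txt");
        try {
            CrawlData crawlData = new InMemoryCrawlData();
            URI uri = tmp.toUri();
            System.out.println(get(uri, "demo", crawlData, null) != null); // first access
            System.out.println(get(uri, "demo", crawlData, null) != null); // same crawl data again
            System.out.println(get(uri, "demo", null, null) != null);      // no crawl data
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Passing null for the crawl data makes the call behave like an unconditional fetch, which is one way a single method can serve both the "get it now" and the crawling scenario.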