DataAccessors
Java Interface
import java.io.IOException;
import java.util.Map;

/**
* A DataAccessor provides access to physical resources by creating DataObjects
* representing the resource, based on a url and optionally data about a previous access
* and other parameters.
*/
public interface DataAccessor {
/**
* Get a DataObject for the specified url. The resulting DataObject's ID may differ
* from the specified url due to normalization schemes, following of redirected URLs, etc.
*
* An AccessData instance can optionally be specified, which the DataAccessor can use to store
* and retrieve information about previous accesses to resources. This is mostly useful
* for DataCrawlers that need to scan a DataSource incrementally.
* When an AccessData instance is specified, the resulting DataObject can be null,
* indicating that the binary resource has not been modified since the last access.
*
* When an AccessData instance is specified, a DataAccessor is always required to store
* something in it when a url is accessed, so that afterwards AccessData.isKnownId will return true.
*
* Specific DataAccessor implementations may accept additional parameters through the params Map.
*
* @param url The url used to address the resource.
* @param source The source that will be registered as the source of the DataObject.
* @param accessData Optional database containing information about previous accesses.
* @param params Optional additional parameters needed to access the physical resource.
* @return A DataObject for the specified url, or null when an AccessData instance has been
* specified and the binary resource has not been modified since the last access.
* @throws UrlNotFoundException When the binary resource could not be found.
* @throws IOException When any other kind of I/O error occurs.
*/
public DataObject get(String url, DataSource source,
AccessData accessData, Map<?,?> params) throws UrlNotFoundException, IOException;
}
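
Example (illustrative sketch)

The contract described in the Javadoc above can be illustrated with a small in-memory accessor. This is only a sketch, not the framework's implementation: DataSource, DataObject, UrlNotFoundException and AccessData are reduced to hypothetical stand-ins, the AccessData get/put methods and the "date" key are assumptions made for this example (only isKnownId is named in the documentation above), and the url file:/docs/a.txt is invented. It shows the three outcomes a caller should expect: a DataObject for a new or changed resource, null for an unchanged one when AccessData is supplied, and UrlNotFoundException for a missing one.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class DataAccessorSketch {

    // Hypothetical, heavily reduced stand-ins for the framework types.
    static class DataSource {}

    static class DataObject {
        final String id;
        final DataSource source;
        DataObject(String id, DataSource source) { this.id = id; this.source = source; }
    }

    static class UrlNotFoundException extends Exception {
        UrlNotFoundException(String url) { super("Not found: " + url); }
    }

    // Only isKnownId is named in the Javadoc; get/put are assumed for this sketch.
    interface AccessData {
        boolean isKnownId(String id);
        String get(String id, String key);
        void put(String id, String key, String value);
    }

    interface DataAccessor {
        DataObject get(String url, DataSource source, AccessData accessData, Map<?, ?> params)
                throws UrlNotFoundException, IOException;
    }

    static class InMemoryAccessData implements AccessData {
        private final Map<String, Map<String, String>> entries = new HashMap<>();
        public boolean isKnownId(String id) { return entries.containsKey(id); }
        public String get(String id, String key) {
            Map<String, String> entry = entries.get(id);
            return entry == null ? null : entry.get(key);
        }
        public void put(String id, String key, String value) {
            entries.computeIfAbsent(id, k -> new HashMap<>()).put(key, value);
        }
    }

    // "Resources" are just url -> last-modified timestamps held in a map.
    static class InMemoryAccessor implements DataAccessor {
        private final Map<String, Long> lastModified;

        InMemoryAccessor(Map<String, Long> lastModified) { this.lastModified = lastModified; }

        public DataObject get(String url, DataSource source, AccessData accessData, Map<?, ?> params)
                throws UrlNotFoundException {
            Long modified = lastModified.get(url);
            if (modified == null) {
                throw new UrlNotFoundException(url);
            }
            if (accessData != null) {
                String current = Long.toString(modified);
                String previous = accessData.get(url, "date");
                // Always record the access, so isKnownId(url) is true afterwards.
                accessData.put(url, "date", current);
                if (current.equals(previous)) {
                    return null; // not modified since the last access
                }
            }
            return new DataObject(url, source);
        }
    }

    // Accesses the same url three times, then a missing one, and summarizes the outcomes.
    static String demo() {
        String url = "file:/docs/a.txt"; // invented example url
        Map<String, Long> resources = new HashMap<>();
        resources.put(url, 100L);
        DataAccessor accessor = new InMemoryAccessor(resources);
        AccessData accessData = new InMemoryAccessData();
        DataSource source = new DataSource();
        StringBuilder log = new StringBuilder();
        try {
            log.append(accessor.get(url, source, accessData, null) != null ? "new" : "unchanged");
            log.append(',').append(accessor.get(url, source, accessData, null) != null ? "new" : "unchanged");
            resources.put(url, 200L); // simulate a modification of the resource
            log.append(',').append(accessor.get(url, source, accessData, null) != null ? "new" : "unchanged");
            log.append(",known=").append(accessData.isKnownId(url));
            accessor.get("file:/docs/missing.txt", source, accessData, null);
        } catch (UrlNotFoundException e) {
            log.append(",not-found");
        } catch (IOException e) {
            log.append(",io-error");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "new,unchanged,new,known=true,not-found"
    }
}
```

A caller that passes null for accessData always receives a DataObject for an existing resource, because without AccessData there is no record of a previous access to compare against.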