wiki:ApertureSimpleDataCrawler

Version 1 (modified by sauermann, 19 years ago)

--

The ApertureSimpleDataCrawler provides simple access to structured data sources.

Implementations of this interface would be classes such as FileDataSource, IMAPDataSource, and OutlookDataSource.

import java.util.Iterator;
import java.util.Map;

public interface SimpleDataCrawler {


 /**
  * Initialize the data source, passing parameters such as name, base path, passwords, server hostname, etc.
  */
 public void init(Map parameters);

 /**
  * Open the passed data object so that the user can view or edit it. For a file,
  * this would mean that the operating system opens the file; for an address book entry,
  * the address book application would have to start.
  */
 public void openObject(String uri);

 /**
  * Get the root URI of this data source.
  */
 public String getRootUri();

 /**
  * Get the detailed data of one object, including plain text and metadata.
  * This is costly.
  * Implementations may internally make heavy use of Extractors.
  */
 public Map getDataOfObject(String uri);

 /**
  * List sub-folders. The Iterator contains folder URIs as Strings.
  * It may also return the URIs of objects if those objects can contain sub-objects
  * (e.g. IMAP attachments), but this is discouraged because detecting sub-objects of emails is costly.
  * The first call of this method would pass the value of getRootUri().
  */
 public Iterator listSubFolders(String uri);

 /**
  * List objects inside the passed folder. The Iterator contains the URIs of the objects as Strings.
  * The first call of this method would pass the value of getRootUri().
  */
 public Iterator listSubObjects(String uri);

 /**
  * Get a map of metadata about the passed object,
  * enough so that changes can be detected.
  * If any value in this map has changed compared to the map returned in the previous scan,
  * then getDataOfObject is called to get the current data.
  */
 public Map getChangeDataOfObject(String uri);

}
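
The comments above imply a scan loop: start at getRootUri(), walk folders with listSubFolders, list objects with listSubObjects, and call the costly getDataOfObject only when getChangeDataOfObject differs from the previous scan. The following is a minimal sketch of that loop, not part of the proposed interface. InMemoryCrawler, IncrementalScanner, the "mem:/" URIs, and the map keys "length", "checksum", and "plaintext" are all hypothetical names chosen for illustration. The interface is repeated (without the public modifier) only so that the sketch compiles on its own.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Repeated from the page so that this sketch is self-contained.
interface SimpleDataCrawler {
    void init(Map parameters);
    void openObject(String uri);
    String getRootUri();
    Map getDataOfObject(String uri);
    Iterator listSubFolders(String uri);
    Iterator listSubObjects(String uri);
    Map getChangeDataOfObject(String uri);
}

// Hypothetical in-memory data source: a root folder, one sub-folder, two objects.
class InMemoryCrawler implements SimpleDataCrawler {
    final Map<String, List<String>> folders = new HashMap<>();
    final Map<String, List<String>> objects = new HashMap<>();
    final Map<String, String> content = new HashMap<>();

    InMemoryCrawler() {
        folders.put("mem:/", Arrays.asList("mem:/docs/"));
        objects.put("mem:/", Arrays.asList("mem:/a.txt"));
        objects.put("mem:/docs/", Arrays.asList("mem:/docs/b.txt"));
        content.put("mem:/a.txt", "alpha");
        content.put("mem:/docs/b.txt", "beta");
    }

    public void init(Map parameters) { /* nothing to configure in this sketch */ }
    public void openObject(String uri) { /* a real source would launch a viewer */ }
    public String getRootUri() { return "mem:/"; }

    public Map getDataOfObject(String uri) {
        Map<String, Object> data = new HashMap<>();
        data.put("plaintext", content.get(uri)); // assumed key name
        return data;
    }

    public Iterator listSubFolders(String uri) {
        return folders.getOrDefault(uri, Collections.<String>emptyList()).iterator();
    }

    public Iterator listSubObjects(String uri) {
        return objects.getOrDefault(uri, Collections.<String>emptyList()).iterator();
    }

    public Map getChangeDataOfObject(String uri) {
        // Cheap metadata only; the key names are assumptions.
        Map<String, Object> change = new HashMap<>();
        change.put("length", content.get(uri).length());
        change.put("checksum", content.get(uri).hashCode());
        return change;
    }
}

// Walks the folder tree and reports objects whose change data differs from the last scan.
class IncrementalScanner {
    private final Map<String, Map<?, ?>> lastChangeData = new HashMap<>();

    List<String> scan(SimpleDataCrawler crawler) {
        List<String> changed = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(crawler.getRootUri());
        while (!pending.isEmpty()) {
            String folder = pending.pop();
            for (Iterator<?> it = crawler.listSubFolders(folder); it.hasNext();) {
                pending.push((String) it.next());
            }
            for (Iterator<?> it = crawler.listSubObjects(folder); it.hasNext();) {
                String uri = (String) it.next();
                Map<?, ?> current = new HashMap<>(crawler.getChangeDataOfObject(uri));
                // put() returns the previous map, or null for an object not seen before.
                if (!current.equals(lastChangeData.put(uri, current))) {
                    crawler.getDataOfObject(uri); // costly call; a real scanner would index the result
                    changed.add(uri);
                }
            }
        }
        return changed;
    }
}
```

In this sketch the first scan reports every object, because nothing has been recorded yet, and later scans report only objects whose change data differs. Comparing whole maps with equals matches the rule stated for getChangeDataOfObject, where a change in any single value triggers a re-fetch.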