| 40 | * First, decide what configuration options your datasource will take. Have a look at [http://aperture.sourceforge.net/ontology/source.rdfs the Aperture datasource schema] for a selection. In my case I use source:rootURI to specify which RSS feed to crawl. Use the Aperture utility method to read it: |
| 41 | {{{ |
| 42 | RDFContainer config=source.getConfiguration(); |
| 43 | String root=ConfigurationUtil.getRootUrl(config); |
| 44 | }}} |
| 45 | You are of course free to make up any config properties you want, but then the ConfigurationUtil class might not help you. |
| 46 | * Implementing the crawlObjects method is clearly quite datasource dependent. In my case I add the ROME and JDOM jars and copy an example of how to read a feed. Some other hints: |
| 47 | ** Gnowsis uses java.util.logging for logging, so to get useful debugging messages, add this at the top of your class: |
| 48 | {{{ |
| 49 | Logger log=Logger.getLogger(RSSCrawler.class.getName()); |
| 50 | }}} |
| 51 | ** The return value of crawlObjects is an ExitCode, which has predefined values for you to choose from. |
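To make the two hints above concrete, here is a minimal sketch of how logging and the return value might fit together in a crawl method. It deliberately does not depend on Aperture: `CrawlSketch`, `crawl` and the `Outcome` enum (with its constant names) are hypothetical stand-ins for your crawler class, `crawlObjects` and Aperture's `ExitCode`, so check the real `ExitCode` class for the values it actually offers.

```java
import java.util.logging.Logger;

public class CrawlSketch {
    // Hypothetical stand-in for Aperture's ExitCode, so this compiles on its own.
    enum Outcome { COMPLETED, FATAL_ERROR }

    // Same pattern as the Logger line above: one logger per class, named after it.
    private static final Logger log = Logger.getLogger(CrawlSketch.class.getName());

    // Mimics the shape of crawlObjects: log what happens, return a predefined value.
    static Outcome crawl(String rootUrl) {
        if (rootUrl == null || rootUrl.isEmpty()) {
            log.severe("No root URL configured, giving up");
            return Outcome.FATAL_ERROR;
        }
        log.info("Crawling feed at " + rootUrl);
        // ... fetch and parse the feed here (datasource dependent) ...
        log.fine("Finished crawling " + rootUrl);
        return Outcome.COMPLETED;
    }

    public static void main(String[] args) {
        System.out.println(crawl("http://example.org/feed.rss"));
        System.out.println(crawl(null));
    }
}
```

With the default java.util.logging configuration only messages at INFO and above reach the console, so the `fine` message stays hidden until you lower the level on the logger and its handler.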