This will be implemented by Daniel Burkhart in ticket:92.

At the moment, we have three major features and key factors in gnowsis:
 1. fast storage of large amounts of RDF in a quad-store (context-aware triplestore), queryable via SPARQL
 2. fulltext search
 3. publishing this RDF store through an API that is easy to program against

At the moment, all three goals are realised using Jena and home-grown software. The first feature is implemented using Jena and SPARQL2SQL. SPARQL2SQL is not maintained anymore and has serious performance problems on insert, especially when we activate our bad hack for MySQL. The second feature is currently implemented as a MySQL hack. We also had to implement client and server APIs to support the third feature.

So at the moment we have:
 1. Jena and SPARQL2SQL
 2. a MySQL hack that crashes from time to time
 3. an API that is expensive to maintain and to use: the gnowsis Repository and CentralHub API

So, by switching to Sesame2 we hope to have a more durable solution in the future, because:
 1. Sesame supports a triple store and a quad store in Sesame2
 2. Sesame supports Lucene SAILs (or will support them soon) to enable fulltext search
 3. Sesame has a well-known API that can be used from many applications
 4. Sesame does not depend on any third-party apps like MySQL - their native SAIL is said to be performant and scalable

Especially the last point will spare us some trouble. We will still need something for the CentralHub API, but the repository and ontManager APIs can be replaced or enhanced completely using Sesame2.

Another reason to switch to Sesame2 is our dedication to Aperture.

BEFORE we do this, Sesame2 has to be stress-tested regarding its SPARQL capabilities etc.:
 * SPARQL
 * QUADS
 * Fulltext-search ( or with catwiesel )
 * fast inserting of data

= Steps to do - Sesame2 =
 1. learning Sesame2
 1. We need an architecture overview (graphical)
 1. which parts in gnowsis have to be replaced?
    (which functions are important?)
 1. which parts of gnowsis can be deleted?
 1. do everything from scratch
 1. APIs have to be implemented

== learning Sesame2 (4 hours -> 19.01.2006) ==
 * get it from Sourceforge-CVS, project: openRDF, [http://cvs.sourceforge.net/viewcvs.py/sesame/openrdf/]
 * using Sesame Server or Sesame Library?
 * Useful documentation: [http://www.openrdf.org/doc/sesame/users/userguide.html#chapter-api], especially chapter 7 seems to be interesting
 * writing some basic examples like:
   1. creating/accessing a repository
   1. adding RDF data to repository
   1. querying a repository

== graphical architecture overview (4 hours - 18.01.2006) ==

== which parts of gnowsis have to be replaced? (2 hours - 12.01.2006) ==
Currently gnowsis uses Jena as storage, and SPARQL to query the Jena store held in the MySQL database. In which packages is this implemented? What is still needed?

== which parts of gnowsis can be deleted? (2 hours - 12.01.2006) ==
Most code inside this package can be removed: source:trunk/gnowsis/src/org/gnowsis/repository

== do everything from scratch (? hours - Gnowsis weekend) ==
 * how to start?
 * implement the APIs
 * integration into Gnowsis/GUI

== APIs have to be implemented (? hours - Gnowsis weekend) ==
 * Central Repository
 * Ontology Manager
 * Gnowsis Search
 * Central Hub

= Steps to do - Aperture =
 1. learning aperture
 1. Aperture will displace some packages from Gnowsis
 1. start, stop, crawl?
 1. integration with/to sesame?
 1. Outlook adapter

== learning aperture (4 hours) ==
project page: [http://aperture.sourceforge.net/]

== Aperture will displace some packages from Gnowsis ==
These will be the packages source:trunk/gnowsis/src/org/gnowsis/adapters and source:trunk/gnowsis/src/org/gnowsis/data

== start, stop, crawl ==

== integration with/to sesame ==

== outlook adapter ==

= Result =
Everything goes to the Gnowsis Server project
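
As a reminder of what the quad-store (context-aware triplestore) requirement from the evaluation list means in practice, here is a minimal sketch in plain Java. It is not the Sesame2 API and not gnowsis code: the class `QuadStoreSketch`, its methods, and the `ex:`/`ctx:` identifiers are all hypothetical and only illustrate that every statement carries the context it came from, so that one data source can be queried or cleared on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory quad store: a triple plus the context it was asserted in. */
final class QuadStoreSketch {

    /** One statement: subject, predicate, object, and its context. */
    static final class Quad {
        final String subject, predicate, object, context;

        Quad(String subject, String predicate, String object, String context) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.context = context;
        }
    }

    private final List<Quad> quads = new ArrayList<>();

    /** Stores one statement together with its context. */
    void add(String s, String p, String o, String c) {
        quads.add(new Quad(s, p, o, c));
    }

    /** Returns all quads matching the pattern; a null argument acts as a wildcard. */
    List<Quad> match(String s, String p, String o, String c) {
        List<Quad> hits = new ArrayList<>();
        for (Quad q : quads) {
            if (fits(s, q.subject) && fits(p, q.predicate)
                    && fits(o, q.object) && fits(c, q.context)) {
                hits.add(q);
            }
        }
        return hits;
    }

    /** Removes every quad of one context and returns how many were removed. */
    int clearContext(String c) {
        int before = quads.size();
        quads.removeIf(q -> q.context.equals(c));
        return before - quads.size();
    }

    private static boolean fits(String pattern, String value) {
        return pattern == null || pattern.equals(value);
    }

    public static void main(String[] args) {
        QuadStoreSketch store = new QuadStoreSketch();
        // Assumed example data: two sources, each with its own context.
        store.add("ex:doc1", "dc:title", "Report", "ctx:filesystem");
        store.add("ex:doc1", "dc:creator", "ex:alice", "ctx:filesystem");
        store.add("ex:mail7", "dc:title", "Re: Report", "ctx:outlook");

        System.out.println(store.match(null, "dc:title", null, null).size());
        System.out.println(store.clearContext("ctx:filesystem"));
        System.out.println(store.match(null, null, null, null).size());
    }
}
```

The linear scan is only for illustration; the stress test listed above (SPARQL, quads, fulltext search, fast inserts) is about how a real store such as Sesame2 handles the same operations at scale.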