
Changes between Version 13 and Version 14 of PimoService

Timestamp: 07/21/06 14:06:09
Author: sauermann
Comment: added idea of transformation server

PimoService

-                      v13
+                      v14
 The rules work only when the language constructs and the upper ontology are part of the model that is validated. For example, validating Paul’s PIMO is only possible when PIMO-Basic and PIMO-Upper are available to the inference engine; otherwise the definitions of the basic classes and properties are missing. The validation can be used to restrict updates to the data model so that only valid data can be stored in the database. Alternatively, the model can be validated on a regular basis after changes have been made. In the gnowsis prototype, validation was activated during automatic tests of the system to verify that the software generates valid data in different situations. Ontologies are also validated during import into the ontology store. Before a new ontology is validated, its import declarations have to be satisfied. The test begins by building a temporary ontology model, to which first the ontology under test and then all imported ontologies are added. If an import cannot be satisfied because the required ontology is not already part of the system, either the missing part can be fetched from the internet using the ontology identifier as URL, or the user can be prompted to import the missing part first. When all imports are satisfied, the new ontology under test is validated and added to the system. A common mistake at this point is to omit the PIMO-Basic and PIMO-Upper import declarations. With this strict testing of ontologies, conceptual errors show up at an early stage. Strict usage of import declarations makes dependencies between ontologies explicit, whereas current best practice in the RDF/S-based semantic web community has many implicit imports that are often not leveraged.
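The import check described above can be sketched as follows. This is a minimal illustration, not the gnowsis implementation: triples are plain tuples, the store is a dictionary keyed by ontology URI, the predicate `urn:example:imports` is a hypothetical stand-in for whatever import declaration the ontology language uses, and `is_valid` is a placeholder for the rule-based validation.

```python
from typing import Callable, Dict, Set, Tuple

Triple = Tuple[str, str, str]
Ontology = Set[Triple]

# Hypothetical predicate that marks an import declaration (an assumption).
IMPORTS = "urn:example:imports"


class UnsatisfiedImport(Exception):
    """Raised when a declared import is not available in the store."""


def build_temporary_model(uri: str, ontology: Ontology,
                          store: Dict[str, Ontology]) -> Ontology:
    """Add the ontology under test first, then every imported ontology."""
    model = set(ontology)
    seen = {uri}
    pending = [o for (_, p, o) in ontology if p == IMPORTS]
    while pending:
        imported = pending.pop()
        if imported in seen:
            continue
        if imported not in store:
            # The caller may fetch the ontology by URL or prompt the user.
            raise UnsatisfiedImport(imported)
        seen.add(imported)
        model |= store[imported]
        pending.extend(o for (_, p, o) in store[imported] if p == IMPORTS)
    return model


def import_ontology(uri: str, ontology: Ontology, store: Dict[str, Ontology],
                    is_valid: Callable[[Ontology], bool]) -> bool:
    """Validate the ontology together with its imports, then store it."""
    model = build_temporary_model(uri, ontology, store)
    if not is_valid(model):
        return False
    store[uri] = set(ontology)
    return True
```

Only the ontology under test is written to the store; the temporary model exists solely so that the validator sees the definitions the ontology depends on.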
+== A vision to publish and download valid PIMO ontologies ==
+This is an idea that should be implemented on top of Nepomuk to keep our ontologies valid.
+Users want to import as much RDF as possible into their semantic desktop. For example, they send each other RDF via e-mail, download descriptions of projects (DOAP) from websites, or download FOAF files from websites. We do not want to forbid the import of any RDF, but invalid RDF breaks the inferencer and other parts of the system (for example, invalid files cannot be removed easily). So Leo recommends using a '''quarantine for imported RDF''', a contamination barrier in front of the desktop store that keeps invalid RDF out. Users put imported RDF into the quarantine and let a program run on it until the new RDF is valid.
+The approach to import ontologies from outside sources would be:
+ * Test if the new RDF is valid PIMO RDF. If yes, allow the import of the RDF, using the provenance information of the graph as the named graph in the PimoStore.
+ * If the new file is invalid, start a semi-automatic import assistant. This assistant tries to fix as many things as possible using the following heuristics:
+  * check which ontology language the new RDF uses: OWL or RDFS. If one is detected, use RDFS- or OWL-specific transformation scripts as preprocessing
+  * for any invalidity, see if there is a default way to fix it
+  * determine the imported ontologies by looking at the namespaces
+  * download imported ontologies by the namespaces
+ * Present the user a status of the import assistant, saying which steps were taken to make the RDF valid.
+ * If the RDF is still invalid, offer actions to make the RDF valid. Such actions can be
+  * write an import script that fixes the errors
+  * search for import scripts that fixed these bugs before
+  * express graph transformations using a SPARQL CONSTRUCT-like language. This should allow replacing, deleting, or adding triples.
+ * At the end, save all actions taken into an import script that summarizes the actions that need to be taken to import RDF from the given source.
+ * The valid RDF is imported.
+As the user is assisted through all these steps, and intelligent default values are entered beforehand, many RDF graphs may be made valid through this import assistant. We hope that this approach keeps invalid RDF out of the store and, on the other hand, lets users import as many external RDF sources as possible. Similar to piggy-bank, scripts are needed for this task.
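The quarantine loop from the steps above could look roughly like this. It is a sketch under stated assumptions: `quarantine_import`, `ImportReport`, and the `(name, fix)` pairs are hypothetical names, the validator is passed in as a function, and the recorded step names stand in for the import script that would be saved at the end.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]
Fix = Callable[[Graph], Graph]


@dataclass
class ImportReport:
    valid: bool                 # True if the graph may enter the store
    graph: Graph                # the graph after all applied fixes
    steps: List[str] = field(default_factory=list)  # names of applied fixes


def quarantine_import(graph: Graph,
                      is_valid: Callable[[Graph], bool],
                      fixes: Sequence[Tuple[str, Fix]]) -> ImportReport:
    """Apply fixes in order until the graph validates or no fixes remain."""
    current = set(graph)
    steps: List[str] = []
    if is_valid(current):
        return ImportReport(True, current, steps)
    for name, fix in fixes:
        current = fix(current)
        steps.append(name)
        if is_valid(current):
            return ImportReport(True, current, steps)
    # Still invalid: the assistant would now offer manual actions to the user.
    return ImportReport(False, current, steps)
```

The `steps` list is what the assistant would show to the user as its status, and replaying it against new RDF from the same source corresponds to running the saved import script.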
+The interesting part is that this import assistant can be realised as a centralised online (web 2.0-like) application. Users can let the online service, the "pimo transformation server", run its magic on any RDF they find on the net. If one user writes a useful import script for, say, FOAF, then the script is stored at the transformation server. The transformation scripts are thus "user generated content", shared within the Gnowsis/Nepomuk community. Core elements of this approach are tools like the "Exobot" (source:branches/gnowsis0.9/gnowsis-server/src/java/org/gnowsis/exobot/Exobot.java) or Haystack's Adenine programming language. The scripts that transform RDF from one state to another can be written using a combination of SPARQL, inference rules, and other operations. There is no need to install a runtime for this language; one installation of the transformation engine at the transformation server is a good start.
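To make the idea of a transformation script concrete, here is a small sketch of replace, delete, and add operations on a set of triples. It does not reproduce Exobot, Adenine, or SPARQL; the operation names and the script format (a list of function and argument pairs) are assumptions chosen for illustration.

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]
# A pattern is a triple in which None acts as a wildcard.
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def matches(triple: Triple, pattern: Pattern) -> bool:
    return all(p is None or p == t for t, p in zip(triple, pattern))


def delete(graph: Graph, pattern: Pattern) -> Graph:
    """Remove every triple that matches the pattern."""
    return {t for t in graph if not matches(t, pattern)}


def add(graph: Graph, triples: Iterable[Triple]) -> Graph:
    """Add the given triples to the graph."""
    return graph | set(triples)


def replace_predicate(graph: Graph, old: str, new: str) -> Graph:
    """Rewrite every use of predicate `old` to predicate `new`."""
    return {(s, new if p == old else p, o) for (s, p, o) in graph}


def run_script(graph: Graph, script: List[tuple]) -> Graph:
    """Apply each (operation, arguments) step of a script in order."""
    result = set(graph)
    for operation, args in script:
        result = operation(result, *args)
    return result
```

A script is then just an ordered list such as `[(replace_predicate, ("old:name", "new:name")), (delete, ((None, "junk", None),))]`, which is the kind of artifact a transformation server could store and share.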
+The goal here is that users can import as much RDF as possible: if one user has found out how to transform data into valid PIMO, this "how-to" information is stored in a script that can be used by the next user. Based on the paths of source files or the classes found inside the new RDF, it is easy to program a case-based reasoning machine that suggests which script may be used to make a file valid.
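A very simple form of such a suggestion machine is sketched below. The `Case` record, the scoring (Jaccard overlap of class names plus a fixed bonus for the same host), and the bonus value of 0.5 are all assumptions made for this example, not a description of an existing component.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set
from urllib.parse import urlparse


@dataclass(frozen=True)
class Case:
    script: str              # name of a stored import script
    source_url: str          # where the RDF it fixed came from
    classes: FrozenSet[str]  # classes that occurred in that RDF


def similarity(case: Case, source_url: str, classes: Set[str]) -> float:
    """Score a past case against new RDF by class overlap and source host."""
    union = case.classes | classes
    overlap = len(case.classes & classes) / len(union) if union else 0.0
    host = urlparse(source_url).netloc
    same_host = bool(host) and host == urlparse(case.source_url).netloc
    return overlap + (0.5 if same_host else 0.0)


def suggest_scripts(cases: Iterable[Case], source_url: str,
                    classes: Set[str], limit: int = 3) -> List[str]:
    """Return the best-matching script names, most similar first."""
    scored = [(similarity(c, source_url, classes), c.script) for c in cases]
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]
```

Each successful import would add a new `Case`, so the suggestions improve as more users contribute scripts.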
+The question is: does this approach bring us to our goal?