| 54 | == A vision to publish and download valid PIMO ontologies == |
| 55 | |
| 56 | This is an idea that should be implemented on top of Nepomuk to keep our ontologies valid. |
| 57 | |
| 58 | Users want to import as much RDF as possible into their semantic desktop. For example, they send each other RDF via e-mail, download descriptions of projects (DOAP) from websites, or download FOAF files. We do not want to forbid the import of any RDF, but invalid RDF breaks the inferencer and other parts of the system (for example, invalid files cannot easily be removed again). Leo therefore recommends a '''quarantine for imported RDF''': a contamination barrier in front of the desktop store that keeps invalid RDF out. Users put imported RDF into the quarantine and let a program work on it until the new RDF is valid. |
| 59 | |
| 60 | The approach for importing ontologies from outside sources would be: |
| 61 | * Test if the new RDF is valid PIMO RDF. If yes, allow the import of the RDF, using the provenance information of the graph as the named graph in the PimoStore. |
| 62 | * If the new file is invalid, start a semi-automatic import assistant. This assistant tries to fix as many things as possible using the following heuristics: |
| 63 | * check which ontology language the new RDF uses: OWL or RDFS. If one is detected, run RDFS- or OWL-specific transformation scripts as preprocessing |
| 64 | * for any invalidity, see if there is a default way to fix it |
| 65 | * determine the imported ontologies by looking at the namespaces |
| 66 | * download imported ontologies by the namespaces |
| 67 | * Present the user with a status report from the import assistant, saying what steps were taken to make the RDF valid. |
| 68 | * If the RDF is still invalid, offer actions to make the RDF valid. Such actions can be: |
| 69 | * write an import script that fixes the errors |
| 70 | * search for import scripts that fixed these bugs before |
| 71 | * express graph transformations using a SPARQL CONSTRUCT-like language. This should allow triples to be replaced, deleted, or added. |
| 72 | * at the end, save all actions taken into an import script that summarizes the actions that need to be taken to import RDF from the given source. |
| 73 | * the valid RDF is imported. |
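The steps above can be sketched as a small loop: check the graph, try each default fix in turn, and record which fixes actually changed something so the record can later be saved as an import script. This is only a minimal sketch under stated assumptions. Triples are plain string tuples, `find_errors` is a hypothetical stand-in for real PIMO validation (here it only demands an `rdf:type` for every subject), and `add_default_type` is a made-up example heuristic; none of these names come from Gnowsis or Nepomuk.

```python
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]
Fix = Callable[[Graph], Graph]

RDF_TYPE = "rdf:type"


def find_errors(graph: Graph) -> List[str]:
    """Hypothetical validity check: every subject needs an rdf:type.

    Real PIMO validation would test far more than this.
    """
    typed = {s for s, p, _ in graph if p == RDF_TYPE}
    return sorted({f"untyped subject: {s}" for s, _, _ in graph if s not in typed})


def add_default_type(graph: Graph) -> Graph:
    """Example default fix: type every untyped subject as rdfs:Resource."""
    typed = {s for s, p, _ in graph if p == RDF_TYPE}
    return graph | {(s, RDF_TYPE, "rdfs:Resource") for s, _, _ in graph if s not in typed}


def quarantine_import(graph: Graph, fixes: List[Fix]) -> Tuple[Graph, List[str], bool]:
    """Apply fixes until the graph is valid or no fixes are left.

    Returns the (possibly repaired) graph, the names of the fixes that
    changed it, and whether the result passes the validity check.
    """
    current = set(graph)
    log: List[str] = []
    for fix in fixes:
        if not find_errors(current):
            break  # already valid, nothing more to do
        fixed = fix(current)
        if fixed != current:
            log.append(fix.__name__)  # remember the step for the import script
            current = fixed
    return current, log, not find_errors(current)
```

If the returned flag is still false, the remaining messages from `find_errors` are what the assistant would show to the user before offering manual actions.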
| 74 | |
| 75 | Because the user is assisted through all these steps and sensible default values are filled in beforehand, many RDF graphs can probably be made valid through this import assistant. We hope that this approach keeps invalid RDF out of the store while still letting users import as many external RDF sources as possible. As in Piggy Bank, scripts are needed for this task. |
| 76 | |
| 77 | The interesting part is that this import assistant can be realised as a centralised online (Web 2.0-like) application. Users can let the online service, the "pimo transformation server", run its magic on any RDF they find on the net. If one user writes a useful import script for, say, FOAF, the script is stored at the transformation server. The transformation scripts are thus "user generated content" shared within the Gnowsis/Nepomuk community. Core elements of this approach are tools like the "Exobot" (source:branches/gnowsis0.9/gnowsis-server/src/java/org/gnowsis/exobot/Exobot.java) or Haystack's Adenine programming language. The scripts that transform RDF from one state to another can be written using a combination of SPARQL, inference rules, and other operations. Users do not need to install a runtime for this language; a single installation of the transformation engine at the transformation server is a good start. |
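To make the idea of a stored, shareable transformation script more concrete, here is a rough sketch that treats a script as plain data: an ordered list of `add`, `delete`, and `replace` steps over triples. It is not the Exobot or Adenine format. The step encoding, the function names, and the `old:`/`new:`/`ex:` identifiers are all assumptions made for illustration, and a real engine would more likely build on SPARQL.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]
# A pattern is a triple in which None acts as a wildcard.
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def matches(pattern: Pattern, triple: Triple) -> bool:
    """True if every non-wildcard position of the pattern equals the triple."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


def fill(template: Pattern, triple: Triple) -> Triple:
    """Build a new triple: None in the template keeps the original value."""
    s, p, o = (v if v is not None else t for v, t in zip(template, triple))
    return (s, p, o)


def run_script(graph: Set[Triple], script: list) -> Set[Triple]:
    """Apply the steps of a script in order and return the new graph."""
    current = set(graph)
    for step in script:
        kind = step[0]
        if kind == "add":
            current.add(step[1])
        elif kind == "delete":
            current = {t for t in current if not matches(step[1], t)}
        elif kind == "replace":
            pattern, template = step[1], step[2]
            current = {fill(template, t) if matches(pattern, t) else t for t in current}
        else:
            raise ValueError(f"unknown step: {kind}")
    return current


# Hypothetical script: rename a property, drop junk, assert a type.
example_script = [
    ("replace", (None, "old:knows", None), (None, "new:knows", None)),
    ("delete", (None, "old:junk", None)),
    ("add", ("ex:leo", "rdf:type", "ex:Person")),
]
```

Because such a script is only a list of tuples, it could be serialised and stored at the transformation server, then replayed on the next file from the same source.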
| 78 | |
| 79 | The goal is that users can import as much RDF as possible. Once one user has found out how to transform some data into valid PIMO, this "how-to" information is stored in a script that the next user can reuse. Based on the paths of source files or the classes found inside the new RDF, it should be straightforward to program a case-based reasoning component that suggests which script may make a file valid. |
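The retrieval step of such a case-based suggestion could look roughly like the sketch below. It assumes each stored script is described by the set of namespace prefixes and classes seen in the files it fixed before, and it ranks scripts by Jaccard overlap with the new file. The feature encoding, the similarity measure, and the script names are illustrative assumptions, not an existing Gnowsis or Nepomuk component.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def features(graph: Set[Triple]) -> Set[str]:
    """Collect predicate namespace prefixes and the classes used via rdf:type."""
    feats: Set[str] = set()
    for _, p, o in graph:
        feats.add("ns:" + p.split(":", 1)[0])
        if p == "rdf:type":
            feats.add("class:" + o)
    return feats


def suggest(graph: Set[Triple], cases: Dict[str, Set[str]]) -> List[str]:
    """Return script names ordered by Jaccard similarity, best match first.

    Scripts that share no feature with the new graph are left out.
    """
    feats = features(graph)
    scored = []
    for name, case_feats in cases.items():
        union = feats | case_feats
        score = len(feats & case_feats) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties are broken alphabetically by script name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


# Hypothetical case base: script name -> features of files it fixed before.
known_cases = {
    "foaf-import": {"ns:foaf", "ns:rdf", "class:foaf:Person"},
    "doap-import": {"ns:doap", "ns:rdf", "class:doap:Project"},
}
```

For a new file that contains `foaf:Person` resources, `suggest` would rank the hypothetical `foaf-import` script ahead of `doap-import`. Source paths could be added as further features in the same way.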
| 80 | |
| 81 | The question is: does this approach bring us to our goal? |