If you are familiar with TopBraid Composer and are starting to use TopBraid EDG, one of your questions may be “How do I bring ontologies I have been developing in TBC into EDG?”
Since EDG assigns a specific meaning to the term ontology and TBC does not, moving your content from TBC to EDG requires a few steps. We describe these steps below, but first let’s examine in more detail the differences between ontologies in EDG and ontologies in TBC.
TopBraid EDG:
- Distinguishes between different types of graphs depending on what information they contain:
- Each graph (or asset collection) type has a name.
- Ontology is one of the supported types.
- The ontology collection type is special in that ontologies in EDG must contain data model/schema definitions: classes, properties, shapes and rules. Other collection types contain only instance data.
- In general, each asset collection type presumes certain data, meaning data based on certain ontologies. Some types are intended specifically for data based on ontologies shipped with EDG, e.g., taxonomies, data asset collections or corpora. Others are based on a user-defined ontology and will ask you for the ontology to use when you create them.
- Thus, an ontology in EDG is an asset collection that is managed by EDG and has type equal to EDG Ontology.
- Puts all information managed by EDG into an RDF database, also known as the EDG Repository.
- EDG is a multi-user environment and does not let users modify information in files.
- EDG will coordinate changes made by multiple users. All changes are immediately posted to a repository.
- Requires the use of the W3C data modeling standard SHACL in order to work with instance data, i.e., graphs that are not ontologies.
- Tightly manages the identity of each asset collection in EDG.
- It automatically assigns each asset collection a URI using an internal algorithm.
- Users can’t change the identity of an asset collection.
TopBraid Composer:
- Only distinguishes two types of graphs: those that contain models and/or data and those that contain application assets.
- For example, since TopBraid Composer is used for application development, it supports creation of SWP and SPARQLMotion scripts. These scripts are saved as RDF. When you create a new file in TBC, you can tell it that the file is expected to be a script. TBC will then add all necessary imports (e.g., SWP and SPARQLMotion namespaces) and adjust the file extension accordingly.
- If a file is not an application asset but is intended to contain a model or data based on a model, TBC will simply create an RDF file for you to use. It does not understand EDG asset collection types.
- Doesn’t have a firm, definitive notion of what an ontology is.
- Everything is a graph. What you put in a given graph is up to you.
- Colloquially, one could refer to any graph as an ontology.
- Works with files and assumes a single user environment.
- You can create a file that is a connector to a database and then information you create in TBC will go into a database, but there is no notion of a common repository.
- Changes made by a TBC user are held in TBC’s memory. The paradigm is similar to working with files in, say, MS Office: changes get saved when the user presses the Save button.
- If files are located on a shared drive, users editing the same files can overwrite each other’s changes on Save.
- Synchronization issues and overwrites could also happen when using a database connector.
- Does not require the use of SHACL to work with instance data.
- TBC lets users make any statements about RDF resources – even if there is no schema that defines what statements are appropriate or expected for a given type of resource.
- Leaves management of the identity of graphs to the user.
- When a user creates a new graph, they must tell TBC what URI to use for it. Users can change this URI after creation.
- Users can also have multiple graphs with the same URI in a TBC workspace. One of them will be selected as the definitive graph for that URI, and users can change this selection.
With these differences understood, let’s consider the steps needed to port your RDF data and models from TBC and/or any other RDF editing tool to TopBraid EDG. Prior to loading RDF data you will need to decide how to organize it in EDG – how many asset collections you will have, what each one will contain, etc.
Since EDG separates models from data, RDF files you will load into EDG should also follow this separation. Especially important is that files with instance data do not contain schema definitions. If they do, then:
- One option is to first load them into EDG ontologies. Then, in EDG, move instance data into another type of collection. Move is available under the Transform tab.
- Alternatively, depending on the size and complexity of your data, it may be simpler to pre-process the files to enforce this separation.
This assumes that your data uses custom ontologies. If it is based on one of the ontologies that ship with EDG, then you only need to be concerned about loading instance data.
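The pre-processing option can be sketched as follows. This is a minimal, dependency-free illustration rather than an EDG or TBC feature: triples are modeled as plain `(subject, predicate, object)` tuples, whereas in practice you would parse your files with an RDF library first. The function name `split_schema_and_instances`, the `SCHEMA_TYPES` set and the `http://example.org/` namespace are assumptions made for this example; extend the type list to match what your files actually contain. Untyped helper nodes, such as blank nodes attached to shapes, would need extra handling.

```python
# Namespace prefixes for the vocabularies used below.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
SH = "http://www.w3.org/ns/shacl#"
RDF_TYPE = RDF + "type"

# Assumption: a subject typed with any of these is a schema definition.
SCHEMA_TYPES = {
    RDFS + "Class", RDF + "Property",
    OWL + "Ontology", OWL + "Class",
    OWL + "ObjectProperty", OWL + "DatatypeProperty",
    SH + "NodeShape", SH + "PropertyShape",
}


def split_schema_and_instances(triples):
    """Split (subject, predicate, object) tuples into schema and instance lists."""
    # Collect every subject that is declared with a schema-level type.
    schema_subjects = {s for s, p, o in triples if p == RDF_TYPE and o in SCHEMA_TYPES}
    # All triples about those subjects go to the schema file, the rest to instance data.
    schema = [t for t in triples if t[0] in schema_subjects]
    instances = [t for t in triples if t[0] not in schema_subjects]
    return schema, instances


EX = "http://example.org/"  # hypothetical namespace for the example
triples = [
    (EX + "Person", RDF_TYPE, OWL + "Class"),
    (EX + "Person", RDFS + "label", "Person"),
    (EX + "alice", RDF_TYPE, EX + "Person"),
    (EX + "alice", EX + "name", "Alice"),
]
schema, instances = split_schema_and_instances(triples)
print(len(schema), "schema triples;", len(instances), "instance triples")
```

The schema list would then be loaded into an EDG ontology and the instance list into another collection type.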
If you have multiple RDF files with the model information and/or multiple RDF files with instance data, you may:
- Use the same partitioning approach in EDG, i.e., create a separate EDG asset collection for each file.
- Or you may want to combine some of the files into a single asset collection in EDG.
If you do the former and your files reference each other using owl:imports statements, these statements will no longer work since the base URI of an asset collection in EDG will be different from the base URI of the file you are loading. After you create an asset collection for each file, use Settings>External Graph URI to help EDG transform these owl:imports statements. This must be done prior to loading data.
Loading is typically performed by going to the Import tab and selecting Import RDF File. If you decide to combine multiple files into a single asset collection in EDG, then before loading, remove any owl:imports statements that cross-reference the files you are combining.
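As a rough sketch of that clean-up step, the snippet below merges several files and drops only the owl:imports statements that point at another file in the same merge. It is an illustration under stated assumptions, not an EDG function: triples are plain tuples, each file is keyed by its base URI, and the name `merge_without_internal_imports` and the `http://example.org/` URIs are made up for the example.

```python
OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_ONTOLOGY = "http://www.w3.org/2002/07/owl#Ontology"


def merge_without_internal_imports(files):
    """Combine the triples of several files into one list.

    `files` maps each file's base URI to its list of
    (subject, predicate, object) tuples. owl:imports statements whose
    target is one of the merged files are dropped; all others are kept.
    """
    merged_uris = set(files)
    combined = []
    for triples in files.values():
        for s, p, o in triples:
            if p == OWL_IMPORTS and o in merged_uris:
                continue  # cross-reference between files being combined
            combined.append((s, p, o))
    return combined


EX = "http://example.org/"  # hypothetical base for the example URIs
files = {
    EX + "core": [
        (EX + "core", RDF_TYPE, OWL_ONTOLOGY),
    ],
    EX + "people": [
        (EX + "people", RDF_TYPE, OWL_ONTOLOGY),
        (EX + "people", OWL_IMPORTS, EX + "core"),    # internal, dropped
        (EX + "people", OWL_IMPORTS, EX + "shared"),  # not part of the merge, kept
    ],
}
combined = merge_without_internal_imports(files)
```

Imports that point outside the merged set are deliberately left alone, since they may still need to resolve to another asset collection or workspace file.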
Starting with EDG 6.4, you can also use two new options for creating ontologies from files:
- Import TriG file – this will create an ontology for each dataset in a TriG file.
- Create EDG ontologies for existing files – this will look for RDF files in your custom projects in the EDG workspace and create an EDG ontology for each.
You may also keep some files as files in the EDG workspace, without physically loading their data into an asset collection. This may be an option if you will never change their content in EDG. In other words, for EDG purposes, their content is read-only. In this case, you can:
- Simply upload them into the EDG workspace in a project of their own.
- Then, create an asset collection in EDG and use Settings>Includes to include these files by reference.
Note that the downside of this approach is that you will not be able to use Search the EDG to find resources in these files, nor will they be found by the global lookup. These features only work for data that is physically part of EDG asset collections.
Finally, you should make sure that files you are using do not contain owl:imports to graphs outside of EDG’s workspace. All imports must refer only to EDG asset collections or to files in the workspace. External imports will not be resolved since dynamically loading data from an external server may time out and present a security risk.
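One way to check this before loading is to list every owl:imports target that is not a graph you expect to exist in EDG. The sketch below is only an illustration: triples are plain tuples, `known_graphs` is a set you would assemble yourself from your asset collection and workspace file URIs, and the function name `unresolved_imports` and all URIs are hypothetical.

```python
OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"


def unresolved_imports(triples, known_graphs):
    """Return the sorted owl:imports targets that are not in `known_graphs`."""
    return sorted(
        {o for s, p, o in triples if p == OWL_IMPORTS and o not in known_graphs}
    )


EX = "http://example.org/"  # hypothetical base for the example URIs
# Assumption: graphs that will exist as EDG asset collections or workspace files.
known_graphs = {EX + "core", EX + "people"}
triples = [
    (EX + "people", OWL_IMPORTS, EX + "core"),
    (EX + "people", OWL_IMPORTS, "http://vocab.example.com/external"),
]
external = unresolved_imports(triples, known_graphs)
for uri in external:
    print("Import will not resolve in EDG:", uri)
```

Each reported URI either needs to be brought into the workspace as a file or asset collection, or its import statement removed.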
Once you create ontologies in EDG and load the RDF describing your ontology models, you need to make sure that the structure and semantics of your models are defined in SHACL. If the ontologies you are working with are defined using RDFS and OWL, EDG can auto-generate SHACL for you. This feature is available under the Transform tab.