(originally posted on September 6, 2016)
What is SHACL?
The RDF Shapes Constraint Language (SHACL) is a new W3C standard for specifying rich data constraints as part of semantic data models. It can be used with RDFS/OWL ontologies or, alternatively, an entire model can be described only in terms of SHACL shapes. Think of SHACL as doing for RDF data what XML Schemas do for XML data.
Sitting on top of the RDF linked data model, SHACL provides a very flexible architecture for representing domain models, knowledge bases and other data assets. While RDF Schema alone can only be used to represent very basic facts about classes, attributes and relationships, and is targeted towards inferencing, SHACL adds a rich declarative language for representing constraints that may be used to validate data, to drive user interfaces, to render complex data structures or to define the contracts of web services.
How can I start using SHACL?
SHACL is currently a W3C Candidate Recommendation. The syntax of the language is stable, ready to use, and not expected to change. Several implementers are currently going through the testing stage, with the final release of SHACL as a W3C Technical Recommendation projected to happen as early as June 2017.
TopQuadrant, as a driving member of the SHACL working group, has included SHACL support for advanced users in its TopBraid Composer IDE since release 5.1. As of release 5.2, TopQuadrant’s web information governance solution TopBraid EDG also includes full support for SHACL.
If you are already a user of EDG, you can follow along with our short tutorial. If you are not a user yet, not a problem! You can download a trial version of TopBraid Composer Maestro Edition which includes a localhost demo of EDG. Or you can request a server evaluation account.
TopBraid EDG offers SHACL editing support as part of the Ontology Editor. Any newly created Ontology is automatically enabled to include SHACL constraints. In this tutorial we will describe how to create SHACL Shapes that are tied to classes. You can also create Node Shapes that are not classes.
If you prefer TopBraid Composer’s capabilities to the Web UI or if you already created shapes and want to use them in EDG, simply use Composer or any other tool with full SHACL support, and import the resulting SHACL file into an EDG Ontology.
Step by Step Tutorial
So let’s get going with a new Ontology:
To start editing, click on the Ontology tab. Then, click on the golden globe icon to Create class…
Once a class is created, creating properties is done by using the corresponding buttons in the Property Groups panel located below the class hierarchy. Alternatively, you can also create them by clicking the + icon on the Person class form, located next to the ‘declared properties’ field.
For now, let’s create an attribute / datatype property for the social security number (SSN):
At this stage, you have the option to select the most commonly used constraints on the property values. In the case of the SSN property, we set the datatype to string, cardinality to Exactly One and press OK.
This creates a SHACL property shape and attaches it to the example class. Property shapes are typically attached to a SHACL node shape which aggregates all the “rules” we will define across the various properties. SHACL allows classes to also be node shapes. SHACL has a concept of targets to specify what resources a SHACL shape applies to. When a class is used as a node shape, it means that the shape applies to all class members.
When you create a class in TopBraid EDG, it is automatically declared to also be a SHACL node shape, so that it defines the permissible values for resources that are members of the class. In our case, that class is Person.
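To make the idea of a class acting as its own node shape more concrete, here is a minimal Python sketch of what validating the SSN property shape amounts to: find every member of the class, then check the "Exactly One" cardinality and the string datatype for each. This is not how TopBraid implements SHACL. The identifiers (ex:Person, ex:ssn), the tuple-based triples and the PropertyShape helper are all assumptions made up for illustration, and the sketch ignores subclasses, which a real SHACL class target would also cover.

```python
from dataclasses import dataclass

# Hypothetical identifiers, used only for this illustration.
RDF_TYPE = "rdf:type"
PERSON = "ex:Person"
SSN = "ex:ssn"


@dataclass(frozen=True)
class PropertyShape:
    """A tiny stand-in for a SHACL property shape (assumed structure)."""
    path: str
    datatype: type
    min_count: int
    max_count: int


def focus_nodes(triples, target_class):
    """Return all subjects typed as target_class, sorted for stable output."""
    return sorted({s for s, p, o in triples if p == RDF_TYPE and o == target_class})


def validate(triples, target_class, shape):
    """Check cardinality and datatype of shape.path for every class member."""
    violations = []
    for node in focus_nodes(triples, target_class):
        values = [o for s, p, o in triples if s == node and p == shape.path]
        if not shape.min_count <= len(values) <= shape.max_count:
            violations.append((node, "count"))
        for value in values:
            if not isinstance(value, shape.datatype):
                violations.append((node, "datatype"))
    return violations


# "Exactly One" string value, as chosen in the create-property dialog.
ssn_shape = PropertyShape(path=SSN, datatype=str, min_count=1, max_count=1)

triples = [
    ("ex:alice", RDF_TYPE, PERSON), ("ex:alice", SSN, "123-45-6789"),
    ("ex:bob", RDF_TYPE, PERSON),                      # no SSN at all
    ("ex:carol", RDF_TYPE, PERSON), ("ex:carol", SSN, 123456789),  # not a string
    ("ex:acme", RDF_TYPE, "ex:Company"),               # not a Person, so ignored
]

print(validate(triples, PERSON, ssn_shape))
```

Because the shape is attached to the class, ex:acme is never checked: only resources that are members of Person become focus nodes.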
Let’s click on the SSN property to select it. From the drop down in the Settings menu on the form, make sure to select Also show properties that have no values. This will let you see all available SHACL constraints, including those that do not yet have values.
We can now fill in additional parameters to further constrain the values of the social security number. The screenshot above shows the upper portion of a property shape, where we can specify how the property should be displayed in the context of the surrounding class; properties may have different display names depending on where they are used. There is a long list of options here. You can scroll down the form and, if desired, collapse some sections to see more.
Even more are likely to be supported in future versions since SHACL is extensible with all kinds of new constraint types. If a constraint option is not available on a form, it can be entered in the Source Code panel nested behind the form.
In the lower part of the property shape form, we have a variety of SHACL features available to limit the value type, value range and various string characteristics. In our example, we can use a regular expression (^\d{3}-?\d{2}-?\d{4}$) to state that social security numbers consist of 3 digits, a dash, 2 digits, a dash and 4 digits, and (to be less geeky) limit both the min length and the max length of the property to 11 characters. Click on the Edit button to make these changes. Don’t forget to Save Changes.
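The combined effect of the pattern and the two length constraints can be tried out with a few lines of Python. This is only a sketch of the checks, not the SHACL engine itself: the function name check_ssn and the returned labels are assumptions, and Python's re module differs from the regular expression dialect used by SHACL in some details (for example, \d also matches non-ASCII digits in Python).

```python
import re

# The regular expression and length limits from the tutorial.
SSN_PATTERN = re.compile(r"^\d{3}-?\d{2}-?\d{4}$")
MIN_LENGTH = 11
MAX_LENGTH = 11


def check_ssn(value: str) -> list[str]:
    """Return the names of the constraints that the value violates."""
    problems = []
    if not SSN_PATTERN.search(value):
        problems.append("pattern")
    if len(value) < MIN_LENGTH:
        problems.append("minLength")
    if len(value) > MAX_LENGTH:
        problems.append("maxLength")
    return problems


for candidate in ["123-45-6789", "123456789", "123-45-67890"]:
    print(candidate, check_ssn(candidate))
```

Note that the pattern alone accepts 123456789, because the dashes are optional in the regular expression. It is the length of exactly 11 characters that makes both dashes mandatory in practice.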
For the sake of making this example a tiny bit more interesting, let’s create a relationship / object property to represent the children.
I’ll leave this as an exercise, but assume we want to state that the child property is non-recursive, because no person may have himself or herself as a child or grandchild. (The non-recursive constraint type is from the DASH extension namespace that ships with TopBraid.)
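The intent of such a constraint, as described above, can be sketched in plain Python: follow the child links for one and two steps from each person and report anyone who ends up back at themselves. This makes no claim about how the DASH constraint is actually evaluated inside TopBraid. The dictionary-based data model, the function names and the max_depth parameter are assumptions chosen for illustration.

```python
def descendants_within(children, start, max_depth):
    """Nodes reachable from start by following child links for 1..max_depth steps."""
    frontier = {start}
    reached = set()
    for _ in range(max_depth):
        frontier = {c for node in frontier for c in children.get(node, ())}
        reached |= frontier
    return reached


def recursive_people(children, max_depth=2):
    """People who appear as their own child (depth 1) or grandchild (depth 2)."""
    people = set(children) | {c for cs in children.values() for c in cs}
    return sorted(
        p for p in people if p in descendants_within(children, p, max_depth)
    )


# Hypothetical example data: person -> list of children.
children = {
    "ex:alice": ["ex:bob"],
    "ex:bob": ["ex:alice"],    # alice and bob are each other's child
    "ex:carol": ["ex:carol"],  # carol is her own child
    "ex:dave": ["ex:erin"],    # fine
}

print(recursive_people(children))
```

With this data, alice and bob are reported because each is their own grandchild, and carol because she is her own child; dave and erin pass.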
Now that we have some properties and “interesting” constraints, let’s create an example Instance of Person and experience data validation: To do this, we will create a Data Graph in TopBraid EDG that will store instances of the classes. We will first need to declare our class as Public. This is done by clicking on the Home icon to navigate to the Ontology resource. This icon is located in the header to the left of the Undo button. Now click on the drop down in the form to select GraphQL Schema.
Declare Person class public. This exposes it to other graphs (asset collections) in EDG.
Now, create a Data Graph asset collection: To do this, we can click on the + button in the header or use the left side Navigator to go to the Data Graphs page. In the create dialog, call the new asset collection People and base it on our ontology by selecting it in the Includes.
Click on Create Data Graph. Now click on the Data Graph tab to start creating people. In the Search panel, select the Person class.
Hint: You could also use the Manage tab to make Person the main entity of this graph. This way your focus class will be preselected every time you enter this asset collection.
Use the New button to create a person. In the example Person below, we have made a mistake (can you spot it?).
Attempting to Save Changes performs the SHACL validation for us. If there are any issues, you will get the warning below. You can select a check box to not show it again. Click OK.
If a submitted form contains errors such as the one above, TopBraid will display details about which property and value are involved. For some constraint violations, TopBraid even attempts to find a fix and offers Suggestions. For example, in the case of the social security number that got a bit too long, TopBraid’s SHACL engine suggests either deleting the offending value or making it shorter so that the max length constraint is fulfilled again. See the DASH Suggestions Vocabulary for details on how this mechanism works internally, and how to extend it. Press Apply to accept the suggestion before confirming the edit with Save Changes.
If you still see a message in the form’s header with some number of errors, click on the Refresh icon next to it. It will go away if errors no longer exist.
To get an overview of all constraint violations across all resources, use the Problems and Suggestions panel from the Panels drop down menu in the header. You can drag and drop it on your page. Such reports offer a convenient place to examine and fix multiple errors at once, e.g. when a new constraint has been added and pre-existing data may not conform to it.
SHACL and your existing ontologies
SHACL can of course be used to enrich your existing ontologies. The following screenshots illustrate some constraints that have been defined for the schema.org data model and the corresponding constraint violations report for some example instances:
In order to apply SHACL on top of an existing data model, simply create an Ontology that includes (owl:imports) the classes and properties that you want to constrain. If this ontology is defined using RDFS/OWL, you can use the operation available on the Data Graph tab to Convert it to SHACL.
This short walk-through illustrates some of the ways that you can use SHACL in TopBraid EDG.
The resulting Ontologies can be included into other EDG asset collections, or you can export them to a Turtle, XML or JSON-LD file. In either case, you can use these shapes to run data validation not only at edit time as demonstrated here, but also in batch mode by using the SHACL validation RESTful API provided in TopBraid.