After two and a half years of collaboration and discussion, the W3C Data Shapes Working Group has produced a collection of specifications that define the features of the Shapes Constraint Language (SHACL):
- SHACL (official W3C Recommendation) is the main document, defining the features of SHACL Core and its extension mechanism called SHACL-SPARQL. SHACL Core defines the basic syntax and structure of shapes and constraints, the built-in kinds of constraints, and how to link shapes to data nodes. SHACL-SPARQL defines how to express constraints that are not covered by the built-in constraint kinds.
- SHACL Advanced Features (WG Note) includes useful features that were originally part of the main document but were split off due to limited time. These include extensions of SHACL-SPARQL such as user-defined functions, as well as SHACL Rules, a powerful feature inspired by SPIN rules that is useful for defining data transformations, inferences and mappings based on data shapes.
- SHACL JavaScript Extensions (WG Note) defines how JavaScript can be used to express constraints, rules, functions and other features. This covers similar ground to SHACL-SPARQL, but uses JavaScript as its execution language.
- A SHACL Compact Syntax exists as a draft that will be finished by the upcoming SHACL Community Group.
Here is a little diagram that gives an overview of how these features plug together:
In a nutshell, a shape can declare targets, which link the shape with certain nodes in the data graph. Often this is simply a statement such as “apply my:PersonShape to all instances of the class schema:Person”, but it is also possible to select the target nodes by means of a SPARQL SELECT query or even a JavaScript function.
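To make the idea of targets a bit more concrete, here is a minimal sketch in plain Python. It is not the SHACL API: the data graph is modeled as a set of tuples, and the helper names `class_targets` and `query_targets` are made up for illustration. Note also that real SHACL class targets include instances of subclasses, which this toy version ignores.

```python
RDF_TYPE = "rdf:type"

# Toy data graph as a set of (subject, predicate, object) triples.
data_graph = {
    ("ex:alice", RDF_TYPE, "schema:Person"),
    ("ex:alice", "schema:givenName", "Alice"),
    ("ex:bob", RDF_TYPE, "schema:Person"),
    ("ex:acme", RDF_TYPE, "schema:Organization"),
}

def class_targets(graph, target_class):
    """Select every node typed as target_class (no subclass reasoning in this sketch)."""
    return sorted(s for (s, p, o) in graph if p == RDF_TYPE and o == target_class)

def query_targets(graph, selector):
    """Stand-in for a SPARQL- or JavaScript-based target: any callable that picks subjects."""
    return sorted({s for (s, p, o) in graph if selector(s, p, o)})

# "Apply my:PersonShape to all instances of the class schema:Person"
print(class_targets(data_graph, "schema:Person"))  # ['ex:alice', 'ex:bob']

# A custom target: all nodes that have a schema:givenName
print(query_targets(data_graph, lambda s, p, o: p == "schema:givenName"))  # ['ex:alice']
```

The nodes returned by either helper play the role of the focus nodes that the shape is then validated against.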
A shape serves as a collection of constraints, for example to state that all values of schema:givenName must be string literals. Tools can use these constraints to validate any given data, to proactively prevent the input of invalid data through guided user interfaces, or to find all instances that match a certain shape, similar to how search engines work. The design of SHACL supports both built-in constraint types (so-called constraint components) and user-defined constraints written in SPARQL or JavaScript. The beauty of this design is that even the built-in constraint types can be expressed in a declarative way, and their implementations can be published and shared as linked data.
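The givenName example can be sketched as a tiny validator. This is only an illustration under simplifying assumptions: the `Literal` class and the `check_datatype` function are hypothetical stand-ins, not part of any SHACL library, and a real engine would produce a full validation report rather than a list of tuples.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Literal:
    """Toy RDF literal: a lexical value plus a datatype name."""
    value: str
    datatype: str

# Toy data graph; plain Python strings stand for IRIs.
data_graph = [
    ("ex:alice", "schema:givenName", Literal("Alice", "xsd:string")),
    ("ex:bob", "schema:givenName", Literal("42", "xsd:integer")),
    ("ex:bob", "schema:familyName", Literal("Jones", "xsd:string")),
]

def check_datatype(graph, focus_nodes, path, datatype):
    """Return (focus node, value) pairs where a value of `path` is not a literal of `datatype`."""
    violations = []
    for node in focus_nodes:
        for s, p, o in graph:
            if s == node and p == path:
                if not (isinstance(o, Literal) and o.datatype == datatype):
                    violations.append((node, o))
    return violations

# "All values of schema:givenName must be string literals"
violations = check_datatype(
    data_graph, ["ex:alice", "ex:bob"], "schema:givenName", "xsd:string"
)
for node, value in violations:
    print(f"{node}: schema:givenName value {value!r} is not an xsd:string")
```

The same function could just as well drive a form that rejects invalid input, or be inverted to search for nodes that conform, which mirrors the three uses of constraints described above.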
A shape can also have rules. These can be used to derive new information (in RDF: triples) from the existing data. Typical use cases of rules include derived values (e.g. “full name is the concatenation of given name and family name”) and transformations between different data models (e.g. map schema:givenName to ex:firstName). In the current version, SHACL rules come in three flavors: SPARQL rules, JavaScript rules and so-called triple rules. The latter use the declarative language of node expressions, which has been designed to support graphical editing.
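Both use cases can be mimicked in a few lines of Python. Again, this is a rough sketch and not how a SHACL rule engine is implemented: each rule is simply a function from a graph to a set of new triples, the property ex:fullName and all function names are assumptions made for this example, and the rules are applied in a single pass.

```python
# Toy data graph; object strings stand for plain string literals here.
data_graph = {
    ("ex:alice", "schema:givenName", "Alice"),
    ("ex:alice", "schema:familyName", "Smith"),
    ("ex:bob", "schema:givenName", "Bob"),
}

def full_name_rule(graph):
    """Derived value: full name is the concatenation of given name and family name."""
    given = {s: o for s, p, o in graph if p == "schema:givenName"}
    family = {s: o for s, p, o in graph if p == "schema:familyName"}
    # Only nodes that have both names get a full name.
    return {
        (s, "ex:fullName", f"{given[s]} {family[s]}")
        for s in given.keys() & family.keys()
    }

def first_name_mapping_rule(graph):
    """Model transformation: map schema:givenName to ex:firstName."""
    return {(s, "ex:firstName", o) for s, p, o in graph if p == "schema:givenName"}

def apply_rules(graph, rules):
    """Run each rule once and return only the triples not already in the graph."""
    inferred = set()
    for rule in rules:
        inferred |= rule(graph)
    return inferred - set(graph)

inferred = apply_rules(data_graph, [full_name_rule, first_name_mapping_rule])
for triple in sorted(inferred):
    print(triple)
```

Keeping the inferred triples separate from the original data, as `apply_rules` does, makes it easy to inspect or discard what the rules produced.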
Finally, SHACL functions are a mechanism to encapsulate SPARQL queries or JavaScript functions so that they can serve as reusable building blocks of constraints and rules. For example, if you need a specific calculation (such as converting units of measurement) in multiple places, you can turn it into a SHACL function with a URI and then use this calculation function instead of having to repeat (and remember) the same computation over and over again.
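As a loose analogy in Python, a function registry keyed by URI captures the basic idea. Everything here is hypothetical: the `shacl_function` decorator, the `FUNCTIONS` registry and the URI ex:kilometersToMeters are invented for this sketch, whereas real SHACL functions are declared in RDF and backed by SPARQL or JavaScript.

```python
# Registry mapping function URIs to Python callables.
FUNCTIONS = {}

def shacl_function(uri):
    """Decorator that registers a callable under the given URI."""
    def register(fn):
        FUNCTIONS[uri] = fn
        return fn
    return register

@shacl_function("ex:kilometersToMeters")
def kilometers_to_meters(kilometers):
    """Unit conversion defined once and reused wherever it is needed."""
    return kilometers * 1000

def call_function(uri, *args):
    """Look up a function by its URI and invoke it."""
    return FUNCTIONS[uri](*args)

# Reuse inside a constraint: a length given in km must not exceed a limit in m.
def within_limit(length_km, limit_m):
    return call_function("ex:kilometersToMeters", length_km) <= limit_m

# Reuse inside a rule: derive a value in meters from a value in kilometers.
def length_in_meters_rule(length_km):
    return ("ex:lengthInMeters", call_function("ex:kilometersToMeters", length_km))

print(within_limit(1.5, 2000))     # True
print(length_in_meters_rule(3))    # ('ex:lengthInMeters', 3000)
```

Because callers refer to the function only by its URI, the conversion logic lives in one place and can be changed without touching the constraints and rules that use it.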
Taken together, these features define an extremely powerful stack of declarative, web-friendly languages that can be used for data modeling, ontology design, constraint validation, inferencing and data transformation. The design of SHACL is heavily inspired by SPIN (which originated from TopQuadrant) and IBM’s Resource Shapes and therefore inherits a wealth of practical experience, solving real-world problems and “getting the job done”.
TopQuadrant has played a key role in the development of SHACL and continues to maintain two reference implementations as open source projects: SHACL API, which supports all features mentioned above and includes a constraint validation engine and a rule engine, and SHACL-JS, a pure JavaScript SHACL validation engine. In addition, the TopBraid product family, including TopBraid EVN, EDG and Composer, is continuously updated and enhanced to make SHACL features accessible through web services and user interfaces.