Questions about SHACL
Q1: Can shapes be generated?
Yes, TopBraid offers a way to suggest shapes based on data samples. Shapes can also be generated from structured data models, e.g. RDF/OWL models, XML Schema or UML.
Q2: Are you aware of any tools available that can convert an ontology into a corresponding SHACL shape graph for validation purposes in a fully or semi-automated manner?
Yes, this is supported in TopBraid. See for example TopBraid Composer 5.4 (Model > Convert OWL/RDFS to SHACL).
Q3: Could you elaborate the difference between ECA rules and SHACL rules?
ECA or Event Condition Action framework is part of TopBraid EDG and TopBraid EVN. It is used to detect changes in data and perform some specified actions in response. For example, one use of it is to update Solr index each time a label is changed.
SHACL rules are used to “infer” new triples from those stated in a graph, for example to compute age based on date of birth. They can be expressed in SPARQL, in JavaScript or, in the case of Triple Rules, in a dedicated high-level expression vocabulary in pure RDF. They can be thought of as similar to ECA rules if you consider inferred triples as the “Action”. However, with ECA rules, the action can be anything, e.g. a call to some external service.
Unlike with the ECA rules, there is no dedicated “trigger” or “Event” mechanism defined by the standard – this needs to be defined by the application. The “Condition” part is the pattern (“WHERE clause”) that needs to be present in the graph.
Q4: Does the SHACL vocabulary support validating joint uniqueness constraints? e.g. the combination of, say, 3 property values is unique.
Not the SHACL Core vocabulary, but you can use SHACL SPARQL for this. You can also extend the vocabulary through this method and create, for example, my:UniquePropertyValue constraint component which will take as parameters three predicates and check if the combined value is unique. This way, it can be used more generically for any 3 properties as opposed to some specific properties that you would put into a SPARQL query, and the approach is declarative and easy to use for other people without SPARQL skills.
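To illustrate what such a joint-uniqueness check has to compute, here is a minimal Python sketch. It is neither SHACL nor TopBraid code: the tuple-based triple representation, the helper name find_joint_duplicates and the ex: property names are assumptions made up for this example. In SHACL-SPARQL the same grouping would be written as a SELECT query inside the constraint component.

```python
from collections import defaultdict

def find_joint_duplicates(triples, predicates):
    """Return value combinations shared by more than one subject.

    Hypothetical helper, not part of any SHACL API.
    Simplification: one value per subject and predicate is assumed.
    """
    # Collect the values of each subject for the predicates of interest.
    values = defaultdict(dict)
    for s, p, o in triples:
        if p in predicates:
            values[s][p] = o

    # Group subjects by their combined key; skip subjects missing a value.
    groups = defaultdict(list)
    for s, props in values.items():
        if all(p in props for p in predicates):
            key = tuple(props[p] for p in predicates)
            groups[key].append(s)

    # A key shared by two or more subjects violates joint uniqueness.
    return {key: sorted(subjects)
            for key, subjects in groups.items() if len(subjects) > 1}

# Made-up sample data: ex:p1 and ex:p2 agree on all three properties.
data = [
    ("ex:p1", "ex:firstName", "Ann"), ("ex:p1", "ex:lastName", "Lee"),
    ("ex:p1", "ex:birthDate", "1990-01-01"),
    ("ex:p2", "ex:firstName", "Ann"), ("ex:p2", "ex:lastName", "Lee"),
    ("ex:p2", "ex:birthDate", "1990-01-01"),
    ("ex:p3", "ex:firstName", "Ann"), ("ex:p3", "ex:lastName", "Lee"),
    ("ex:p3", "ex:birthDate", "1985-05-05"),
]
duplicates = find_joint_duplicates(
    data, ["ex:firstName", "ex:lastName", "ex:birthDate"])
```

Passing the predicates as a parameter mirrors the idea of a reusable constraint component: the check itself stays generic and only the three property names change per use.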
Q5: Can we have the following constraint: If we have a People class, and property “eat”, and object “Fruits”. We want to be able to say “People eat Fruit”, with the constraint being that each person can eat as many kinds of fruits as they want, but for each kind, they can only eat one. For example, a person can eat 1 apple, 1 orange, etc., but at most 1 for each kind. Are we able to define this kind of constraint or rule? Thanks.
Yes, if you enumerate the fruits in advance, then this is supported through qualified value shapes and qualified max count (sh:qualifiedValueShape and sh:qualifiedMaxCount).
For a more general solution, where the fruits are unknown beforehand, you may want to use SHACL-SPARQL (or SHACL-JS).
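The general case can be sketched in Python as follows. This only illustrates the logic that a SHACL-SPARQL or SHACL-JS constraint would need to implement; the data layout (each eaten item is typed with its kind via rdf:type), the ex: names and the function fruit_kind_violations are assumptions for the example.

```python
from collections import Counter

def fruit_kind_violations(triples):
    """Return (person, kind) pairs where a person eats more than one
    item of the same kind. Hypothetical helper, not a SHACL API.
    Simplification: every eaten item has exactly one rdf:type."""
    # Map each item to its kind, e.g. ex:apple1 -> ex:Apple.
    kind_of = {s: o for s, p, o in triples if p == "rdf:type"}

    # Count eaten items per (person, kind); kinds need not be known upfront.
    counts = Counter()
    for s, p, o in triples:
        if p == "ex:eat" and o in kind_of:
            counts[(s, kind_of[o])] += 1

    return sorted(pair for pair, n in counts.items() if n > 1)

# Made-up sample data: Alice eats two apples, which is not allowed.
meals = [
    ("ex:apple1", "rdf:type", "ex:Apple"),
    ("ex:apple2", "rdf:type", "ex:Apple"),
    ("ex:apple3", "rdf:type", "ex:Apple"),
    ("ex:orange1", "rdf:type", "ex:Orange"),
    ("ex:alice", "ex:eat", "ex:apple1"),
    ("ex:alice", "ex:eat", "ex:apple2"),
    ("ex:alice", "ex:eat", "ex:orange1"),
    ("ex:bob", "ex:eat", "ex:apple3"),
]
violations = fruit_kind_violations(meals)
```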
Q6: The blank node approach seems to be more flexible than the separate property approach. The latter forces the property to be used the same way each time. Can you comment?
No, giving a property shape URI does not force the property to be used the same way. You can create multiple different property shapes and incorporate them into different node shapes. Whether they are blank nodes or not is not a factor in this.
Property shapes do not work like global constraints in OWL. The scope of a shape (what data it is applicable to) is determined by target statements. Without a target statement, a shape is not applicable to any data. Property shapes can declare their targets – or not. When a property shape is used by a node shape, it always gets its targets passed from the node shape.
The downside of using blank nodes is that a blank node has no identity outside a graph it is contained in. This means, for example, that another file cannot add properties such as sh:deactivated to it.
Q7: Also, if statements are inferred with SHACL rules, (how) are they persisted?
As with SPIN, these inferences could be dynamic or they could be materialized into a graph of your choice.
Q8: Does the invocation of SHACL constraints in TBC work the same way as with OWL, by using the constraint violation button?
Yes, that is one choice, to activate on-the-fly constraint checking on the forms. TBC also has a SHACL Validation view with more details, a repair feature, etc. There are also various procedural methods to invoke validation, including SPARQLMotion modules, services and web UIs to be used in server deployments.
Q9: Is the shape deactivate feature part of SHACL, or is it a feature of your tool?
This is a feature of the standard. See more details here.
Q10: Are the SPARQL constraints part of SHACL, or part of your tool only?
SPARQL constraints are part of the standard. See more details here.
Q11: Could you describe the status of the SHACL API work going on? How mature is the API at this point?
The TopBraid SHACL API is fairly mature and used in many projects. It is fully open source, with version 1.0.1 recently released to Maven Central. The API is also used in the TopBraid product, and thus exposed to incremental improvements and permanent testing. The TopBraid product line includes additional features and some optimizations not part of the open source API. Note there is a sibling SHACL-JS API for JavaScript.
Q12: Can you elaborate on the difference between Constraints and Rules in SHACL?
Constraints are declarative statements that are typically used to produce validation results (constraint violations), e.g. the fact that a person has more than two parents. Constraints can also be used for other purposes such as driving user interfaces (e.g. to restrict an input area to only two values). Rules are used to derive (or “infer”) new information from data that is given as input to the SHACL rules engine. In RDF terms, rules can add new triples to the data. For example, a rule may compute the grandparents of a given person by following the parent relationship, or perform a mathematical calculation such as the area of a rectangle.
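The two examples from this answer can be contrasted in a small Python sketch. It is an illustration of the distinction only, not of how a SHACL engine is implemented: the constraint function reports problems and leaves the data alone, while the rule function returns new triples. The ex:parent and ex:grandParent property names and both function names are assumptions.

```python
from collections import defaultdict

def parents_by_child(triples):
    """Index ex:parent triples as child -> set of parents."""
    parents = defaultdict(set)
    for s, p, o in triples:
        if p == "ex:parent":
            parents[s].add(o)
    return parents

def parent_count_violations(triples, max_parents=2):
    """Constraint: report children with too many parents (no new data)."""
    parents = parents_by_child(triples)
    return sorted(c for c, ps in parents.items() if len(ps) > max_parents)

def infer_grandparents(triples):
    """Rule: derive ex:grandParent triples by following ex:parent twice."""
    parents = parents_by_child(triples)
    inferred = set()
    for child, direct in parents.items():
        for parent in direct:
            # .get() avoids adding keys to the dict while iterating over it.
            for grandparent in parents.get(parent, ()):
                inferred.add((child, "ex:grandParent", grandparent))
    return inferred

# Made-up sample data.
family = [
    ("ex:tom", "ex:parent", "ex:mary"),
    ("ex:mary", "ex:parent", "ex:sue"),
    ("ex:kim", "ex:parent", "ex:a"),
    ("ex:kim", "ex:parent", "ex:b"),
    ("ex:kim", "ex:parent", "ex:c"),
]
violations = parent_count_violations(family)   # validation results
new_triples = infer_grandparents(family)       # inferred triples
```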
Q13: What commercially available products are available that approach the functionality of SHACL?
As of September 2017, we believe TopBraid Suite to be the only commercial product to offer SHACL support. There are several open source implementations. We do not know the extent to which they may offer commercial support. A few vendors said they would shortly offer implementations.
Keep in mind that SHACL is still very new – it only became a recommendation at the end of July, 2017. It was designed to make it easy for vendors that already support SPARQL to support SHACL. With this we expect that, as long as users ask for it, a good number of vendors will offer it. We would encourage you to request SHACL support from your preferred vendor.
For information on the support known to be available, see:
- Implementations list on the W3C website.
For information on SHACL support in TopBraid, see:
- Using TopBraid as a data validation server.
- Working with SHACL Shapes in TopBraid EDG and TopBraid EVN.
- Working with SHACL Shapes in TopBraid Composer.
Q14: Could this be used for data modeling for different data platforms including relational databases, graph databases, Hadoop…?
This question is quite complex and would be best handled in a dialog/discussion, e.g. on the SHACL mailing lists or community group. The data model behind SHACL is more flexible than either relational database schemas or most graph databases. It should however be possible to define subsets of SHACL for specific target scenarios. Also, relational databases can be mapped into RDF graphs, see R2RML or D2RQ, which means that SHACL constraints can in principle be executed directly on databases.
SHACL can be used as a schema language for JSON, in particular via JSON-LD. In TopBraid EDG we are using SHACL to generate Avro schemas from relational databases.
Q15: Can you tell us more about SHACL properties specifically for supporting user interfaces?
In addition to the constraint properties, such as sh:datatype and sh:maxCount, you can specify default values for a property, order of properties on a form, sub-groups on a form layout and a few more things that are useful for the UI. See more details here.
Q16: What are the limits on extensibility? Are there shapes or patterns that cannot be specified?
No known limits. Out of the box, SHACL supports SPARQL and JavaScript, and the latter, in particular, is a general-purpose programming language. Vocabularies for other extension languages would be easy to add. SHACL has been designed with this openness in mind.
Q17: With regard to strings, is there support for language tags?
You can use rdf:langString as value of sh:datatype to test if value nodes have a language tag. Additionally:
- sh:languageIn lets you specify the allowed list of language tags for a given value node.
- sh:uniqueLang can be set to true to specify that no pair of value nodes may use the same language tag.
And you can use SPARQL constraints to do more with language tags. See examples here.
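The behavior of the two language-related constraints can be approximated with the following Python sketch. Literals are modeled as (text, language tag) pairs, which is an assumption of this example, as are the function names. The matching of tags against sh:languageIn follows the basic language-range idea used by the SPARQL langMatches function (so "en" also accepts "en-US"); the "*" wildcard is not handled here, and tags are compared case-insensitively as a simplification.

```python
from collections import Counter

def lang_matches(tag, lang_range):
    """Basic language-range match: equal, or range is a prefix before '-'."""
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")

def language_in_violations(values, allowed):
    """Values with no language tag or a tag outside the allowed ranges."""
    return [v for v in values
            if not v[1] or not any(lang_matches(v[1], r) for r in allowed)]

def unique_lang_violations(values):
    """Language tags used by two or more values."""
    counts = Counter(lang.lower() for _, lang in values if lang)
    return sorted(lang for lang, n in counts.items() if n > 1)

# Made-up labels; None stands for a plain literal without a language tag.
labels = [("Dog", "en"), ("Hound", "en"), ("Hund", "de"),
          ("Chien", "fr"), ("dog", None)]

not_allowed = language_in_violations(labels, ["en", "de"])
repeated = unique_lang_violations(labels)
```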
Q18: Should a SHACL implementation be expected to be aware of all the rdfs:subClassOf and rdf:type relations defined in the vocabularies used by the data graph? (This would be helpful to avoid having to explicitly append all of those relationships defined in the vocabularies to the data graph being tested or making the shape graph excessively bloated)
Yes, these statements should be included in the data graphs. However, you do not need to append them, you can refer to the graphs that contain these statements using owl:imports.
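Why the subclass statements must be reachable from the data graph can be seen in this Python sketch of class-based targeting. It is a simplified illustration, not engine code; the function names and ex: classes are assumptions. Merging the two lists stands in for what owl:imports achieves.

```python
def superclasses(cls, triples):
    """Return cls plus everything reachable via rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen

def class_targets(target_class, triples):
    """Nodes whose rdf:type is target_class or one of its subclasses."""
    return sorted(s for s, p, o in triples
                  if p == "rdf:type"
                  and target_class in superclasses(o, triples))

# Made-up ontology and instance data kept in separate "graphs".
ontology = [("ex:Dog", "rdfs:subClassOf", "ex:Animal")]
data = [("ex:rex", "rdf:type", "ex:Dog"),
        ("ex:tom", "rdf:type", "ex:Person")]

without_ontology = class_targets("ex:Animal", data)
with_ontology = class_targets("ex:Animal", data + ontology)
```

Without the ontology triples, ex:rex is not recognized as an ex:Animal and a shape targeting ex:Animal would silently skip it.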
Q19: Are you aware of any tools available that can convert an ontology into a corresponding SHACL shape graph for validation purposes in a fully or semi-automated manner?
Yes, TopBraid Composer (TBC) offers a converter. Open a file you want to convert and select the converter under the Model menu in TBC.
The initial version of the converter became available in TopBraid Release 5.3.2, but we recommend waiting for the upcoming 5.4 Release since it offers a more mature and extended converter. You should be able to get the beta of 5.4 by early October, 2017. If you want to see a preview and help us fine-tune the converter, feel free to send us your OWL ontologies.
Q20: Can you talk a little about JavaScript and SHACL? Can you export validations for use by web clients, prior to submitting data to a server? How is that set up?
Yes, it is very flexible. Everything is declarative. Basically the so-called shapes graph contains the pointers to the JavaScript files (URLs) and the names of the JS functions that should be called when the corresponding constraint is used. Data graphs may point at one or more shapes graphs, and fully SHACL-compliant engines will resolve the URLs dynamically. Needless to say, the client may cache and preload the JS.
Q21: Why does the sh:pattern “^http://…” start with “^”?
In regular expressions, the ^ symbol serves as an anchor for the start of the string. So for example, “testhttp://” would not match “^http”, but “http://test” would.
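The effect of the anchor can be tried out in Python. Note that Python's re module is used here only as a stand-in: sh:pattern follows the SPARQL REGEX function, but for a simple pattern like this one the behavior is the same.

```python
import re

anchored = re.compile(r"^http")

# The anchor requires the match to begin at the start of the string.
matches_at_start = bool(anchored.search("http://test"))
matches_in_middle = bool(anchored.search("testhttp://"))

# Without the anchor, "http" may appear anywhere in the string.
unanchored_in_middle = bool(re.search(r"http", "testhttp://"))
```

Here matches_at_start is True, matches_in_middle is False, and unanchored_in_middle is True.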
Q22: Can I specify that the weight of every dog must be smaller than the weight of any person?
Yes, with SPARQL-based constraints you can. See this tutorial on creating your own constraint components. In this case, you will probably use as a target all subjects of ex:weight property that are of rdf:type ex:Dog.
When both properties you want to compare are attached to the same focus node (e.g. a passenger may take a dog with them on board an aircraft, but it must weigh less than the passenger: ex:Passenger1 ex:personWeight X and ex:Passenger1 ex:dogWeight Y), then you could use sh:lessThan to specify constraints for these two properties.
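The two situations described above differ in what has to be compared, as the following Python sketch shows. It only mimics the logic: the first function compares two properties on the same focus node, which is the case sh:lessThan covers, while the second compares values across different nodes, which is why a SPARQL-based constraint is needed. The function names, the ex:weight property and the sample data are assumptions.

```python
def less_than_violations(triples, path, other):
    """Same-node comparison: every value of `path` must be smaller than
    every value of `other` on the same focus node."""
    violations = []
    for node in sorted({s for s, p, _ in triples if p == path}):
        lows = [o for s, p, o in triples if s == node and p == path]
        highs = [o for s, p, o in triples if s == node and p == other]
        violations.extend((node, a, b)
                          for a in lows for b in highs if not a < b)
    return violations

def dogs_not_lighter_than_every_person(triples):
    """Cross-node comparison: dogs must weigh less than the lightest person."""
    dogs = {s for s, p, o in triples if p == "rdf:type" and o == "ex:Dog"}
    persons = {s for s, p, o in triples
               if p == "rdf:type" and o == "ex:Person"}
    person_weights = [o for s, p, o in triples
                      if p == "ex:weight" and s in persons]
    if not person_weights:
        return []  # nothing to compare against
    lightest = min(person_weights)
    return sorted(s for s, p, o in triples
                  if p == "ex:weight" and s in dogs and not o < lightest)

# Made-up sample data.
triples = [
    ("ex:Passenger1", "ex:dogWeight", 12), ("ex:Passenger1", "ex:personWeight", 70),
    ("ex:Passenger2", "ex:dogWeight", 80), ("ex:Passenger2", "ex:personWeight", 75),
    ("ex:rex", "rdf:type", "ex:Dog"), ("ex:rex", "ex:weight", 30),
    ("ex:tiny", "rdf:type", "ex:Dog"), ("ex:tiny", "ex:weight", 5),
    ("ex:ann", "rdf:type", "ex:Person"), ("ex:ann", "ex:weight", 25),
    ("ex:bob", "rdf:type", "ex:Person"), ("ex:bob", "ex:weight", 80),
]
same_node = less_than_violations(triples, "ex:dogWeight", "ex:personWeight")
cross_node = dogs_not_lighter_than_every_person(triples)
```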
Q23: You mentioned that SHACL can be used for data integration. Can you please elaborate more on this as to whether we can use SHACL to integrate data from data silos?
This is a complex question that may benefit from examples and a longer discussion depending on your specific use cases. Scratching the surface, let’s assume you have two different databases, one about persons and one about customers. You could define two ontologies for them, with shared superclasses and/or shared properties, and we assume there is a way to turn your data into instances of these classes. Then you could use SHACL to define constraints that go across those silos, because ontologies can be mixed together, e.g. by matching by full name or social security numbers. For mapping from one RDF ontology to another, SHACL includes rules, either written in SPARQL (CONSTRUCT) or via SHACL Node Expressions. The latter have been particularly designed for mapping scenarios and visual diagrams. SHACL rules can use SHACL constraints/shapes to limit the preconditions and select specific instances to which the rules apply. Visit this page.
Q24: The shapes creation demo shows implicit class target. Does the TopBraid Enterprise Data Governance tool support explicit class target when creating a shape?
The EDG Ontology editor is class-centric and optimized for that design. However, in the upcoming TopBraid Release 5.4, you can also create stand-alone node shapes and edit their sh:targetClass etc. individually.
Q25: What are the choices of SHACL engines?
Please see the answer to Q13 above, which covers commercial support and open source implementations.
Q26: How do we reference one/multiple ontologies in the SHACL?
SHACL supports owl:imports statements. You can use them to reference ontologies. These statements can be included in shapes graphs and in data graphs. If you will be relying on rdfs:subClassOf statements for targeting and these statements are in an ontology, you need to make sure to reference it from your data graph.
Q27: Is the constraints violations report also in RDF?
Yes, it is in RDF. Learn more on its structure here.
Q28: What levels of SHACL support are there in TBC versions, e.g. 5.2 or 5.3?
TBC 5.3.2 offers full support as it was released at the time the standard was finalized. TBC 5.4, as stated above, will offer more capabilities (e.g., auto-conversion of OWL), but these are convenience features.
Versions prior to 5.3.2 tracked the standard development process, so their SHACL support will be somewhat different from the final version.
Q29: Could this be extended to do data modeling outside the semantic web space? Modeling and validation of data exchange formats (e.g., xml, json, json-ld, edn, …)
JSON-LD is RDF, so it works natively. Many other JSON models can be turned into JSON-LD using a suitable @context. XML structures have a straightforward mapping to RDF.
Q30: Are there any anti-patterns to be aware of? Where are discussions of best practices happening?
The SHACL syntax offers more than one way to express the same things. It is up to the target audience to decide which syntactic structures are not desirable given their use cases.
For example, a property shape can be specified in-line in a node shape as a blank node. We typically recommend giving it a URI, so it can be properly referred to across graphs. Another example is that in addition to property shapes to be referenced in the node shapes and used with targets specified in the node shapes, property shapes can also be used as top-level entities with their own targets. However, most other modeling languages use something like classes that have properties. So if the goal is to be close to those other modeling languages and target platforms, then stand-alone property shapes with their own targets may be an anti-pattern.
The SHACL Community Group is the right forum for discussing best practices. At this point the current charter of the Working Group has been fulfilled. It has been extended through next spring, but only so that we can deal with any issues that may come from the user community. Any new work in the near future will be happening in the Community Group, leading to potentially chartering a new Working Group for SHACL 1.1 or SHACL 2.0.
The Community Group is also a better forum because anyone can join it while Working Group participation is limited to only W3C member organizations.
Q31: And are the JS validations generated by the SHACL structures or do you have to carry explicit JS in the structure?
SHACL-JS engines produce the same validation results (in an RDF vocabulary that can be represented as JSON-LD) as the other variants, SHACL Core and SHACL-SPARQL. The SHACL-JS validators don’t have to know about the details and may, for example, just return the string of an error message. The SHACL-JS shapes graphs do not carry JS code but point at JS functions stored in .js files. Not sure if this answers your question.
Q32: The standard says that SHACL has no formal semantics. Is this still the case?
SHACL definitely has formal semantics. These are written in the spec as textual definitions and sometimes SPARQL. The interpretation is unambiguous and standardized across platforms. A limited number of features (such as support for recursion) were left undefined (meaning that implementers can either not support the feature or create their own definition) because the SHACL WG could not agree on semantics for them during the lifetime of the working group.
Q33: Are popular triple stores such as GraphDB and AllegroGraph compatible with SHACL data?
Yes, SHACL is RDF so it is fully compatible with any RDF store. SHACL processors can operate on any RDF database. If the database provides a Jena adapter then the TopBraid SHACL API can be used directly. The performance of those may, of course, not be ideal so you may want to inquire with your preferred vendor concerning their plans to support SHACL natively.
Q34: You mentioned support for other languages. Does that include C#?
The only currently defined extension languages are SPARQL and JavaScript. C# or Java would be easy to define (e.g. mysh:javaFunction/Class) but such solutions may then require precompiled libraries.
Q35: Any tools to leverage existing XML schema (xsd) by smartly importing it?
TopBraid products do provide importers from XML Schema or just XML (instance) files to RDF. The XSD import creates RDFS and OWL statements which can then be converted to SHACL. One step translation from XML Schema to SHACL (constraints) is planned.
Q36: How will this fit into linked data?
The whole architecture of SHACL is very linked data friendly. SHACL shapes graphs are represented as RDF and therefore can be shared on the web as linked data. Data graphs or instances can reference the shapes that they are supposed to obey, e.g. using the sh:shapesGraph property. Finally, even the constraint components (definitions of the constraint types such as sh:minCount) are represented in RDF, so that if someone defines new constraints or other extensions, then these can also be looked up on the web.
Q37: Can you say more about SHACL support for URI generation?
SHACL constraints and rules can use SPARQL, e.g. with built-in functions such as IRI and CONCAT. This means that SHACL rules can construct new URIs on the fly.
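As a rough analogy, the following Python sketch shows the kind of string concatenation that a SPARQL expression built from IRI and CONCAT performs when minting a URI. The namespace, the separator and the helper name mint_iri are assumptions for this example; the percent-encoding step plays the role that the SPARQL function ENCODE_FOR_URI would play in a query.

```python
from urllib.parse import quote

def mint_iri(namespace, *parts):
    """Build an IRI string from a namespace and some data values.

    Hypothetical helper: each part is percent-encoded so that spaces and
    other reserved characters cannot break the resulting IRI.
    """
    local_name = "-".join(quote(str(part), safe="") for part in parts)
    return namespace + local_name

# Made-up example: derive a stable IRI from a name and an id.
person_iri = mint_iri("http://example.org/person/", "Ann Lee", 42)
```

With these inputs, person_iri is "http://example.org/person/Ann%20Lee-42". Deriving the IRI from data values means that running the same rule twice yields the same IRI rather than a fresh one each time.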
Q38: Does that mean rules can involve multiple fields while constraint is single field?
The difference between rules and constraints is that rules create new triples. Constraints can work with multiple fields. For example sh:lessThan can be used to relate ex:startDate and ex:endDate. Also, shape definitions can group together multiple property shapes, and then these shape definitions can be referenced with sh:node, sh:property, sh:qualifiedValueShape etc. Finally, SHACL-SPARQL or SHACL-JS can express almost arbitrary relations including those that look up values elsewhere and compare them with local values, or walk property path expressions.
Q39: You mentioned that the community may take on other features/work–what is the governance mechanism of the standard?
The governing mechanism is the W3C process. More details here.
Q40: I really don’t see how rules can be related with SHACL, can you elaborate on this a little?
It is not surprising that there is interest in rules. However, it would be too ambitious to cover this in the intro webinar. We are planning to present follow-up webinars and will include one on rules. In the meantime, you can learn more about SHACL rules here.