This document is part of the TopQuadrant GraphQL Technology Pages
This document, intended for maintainers of RDF/SHACL data models, explains how RDF graphs can be published through the GraphQL services of TopBraid. In a nutshell, one or more GraphQL schemas are automatically generated from data shape definitions in the Shapes Constraint Language (SHACL). These SHACL shapes may themselves be generated automatically from other input GraphQL schemas, enhancing them in the process with numerous features to query data stored in an RDF dataset. SHACL data shapes can also be generated from other input formats supported by TopQuadrant's products. The features available to end users are described in Querying RDF Graphs with GraphQL.
The readers of this document are expected to be familiar with GraphQL and have basic RDF skills. Knowledge of SHACL is advantageous.
This document uses the prefix dash, which represents the namespace http://datashapes.org/dash# and is accessible via its URL http://datashapes.org/dash.
The prefix graphql represents the namespace http://datashapes.org/graphql#, which is accessible via its URL http://datashapes.org/graphql.
TopBraid products can take a SHACL shapes graph as input and produce an executable GraphQL schema that supports queries against data in RDF graph databases.
The SHACL shape definitions may originate from existing third-party data models, or from shapes that were created specifically to expose certain views on existing data. Multiple shape definitions may cover the same underlying RDF graph data. GraphQL schema syntax itself may be used to define shapes; see GraphQL Schemas to RDF/SHACL.
Using SHACL node shape definitions, the processor generates GraphQL object types with fields derived from property shapes. The fields are augmented with various arguments for filtering, aggregations, ordering and paging, transformations and deriving values.
The resulting GraphQL services provide an easy-to-use yet highly flexible and powerful query language to produce JSON views on any RDF-based data such as enterprise knowledge graphs. They also enable the use of GraphQL together with existing RDF technology such as SPARQL, rule engines and SHACL data validation components.
An RDF graph may contain thousands of classes or data shapes.
A GraphQL service that includes all of them at once would quickly become unusable.
To instruct the processor which shapes and classes shall be exposed via GraphQL, the starting point is an instance of the class graphql:Schema.
This schema instance must use the following properties to include or exclude shapes:
Property | Description |
---|---|
graphql:publicShape | The values are included in the GraphQL schema |
graphql:publicClass | The values and all their subclasses are included |
graphql:publicNamespace | All shapes from the given namespace are included |
graphql:protectedShape | The values are included but not available from the root query |
graphql:protectedClass | The values and all their subclasses are included but not available from the root query |
graphql:privateShape | The values are excluded from the GraphQL schema (even if published by other properties) |
The algorithm that produces the set of published shapes first collects all shapes or classes defined using the graphql:publicXY and graphql:protectedXY properties above, from the schema as well as from all transitive values of its owl:imports and rdf:type properties.
Then it removes those that are marked via graphql:privateShape.
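The collect-then-remove step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `SchemaDecl` and `published_shapes` are hypothetical names (not part of any TopBraid API), shapes are plain qname strings, and the expansion of classes to subclasses, of namespaces to shapes, and the traversal of owl:imports are assumed to have happened already. Treating a shape that is declared both public and protected as visible from the root query is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDecl:
    """Hypothetical stand-in for one graphql:Schema instance (not a TopBraid API)."""
    public: set = field(default_factory=set)      # values of the graphql:publicXY properties
    protected: set = field(default_factory=set)   # values of the graphql:protectedXY properties
    private: set = field(default_factory=set)     # values of graphql:privateShape


def published_shapes(schemas):
    """Return (published, root_visible) for a schema and its transitive relatives."""
    public, protected, private = set(), set(), set()
    # Step 1: collect everything declared public or protected.
    for schema in schemas:
        public |= schema.public
        protected |= schema.protected
        private |= schema.private
    # Step 2: private always wins, even over an explicit publication.
    published = (public | protected) - private
    # Only public shapes get a field on the root query object.
    root_visible = public - private
    return published, root_visible
```

For example, a schema that publishes `ex:Human` and `ex:Droid`, protects `ex:Address` and marks `ex:Droid` as private would end up with `ex:Human` and `ex:Address` as published shapes, of which only `ex:Human` is reachable from the root query.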
All published shapes can be queried via GraphQL. Public shapes are automatically exposed by the root query object. Those that are marked protected cannot be queried from the root query but can be reached and traversed from other object types. Here is an example:
    ex:MySchema
        a graphql:Schema ;
        graphql:publicShape ex:Human .

    ex:Human
        a sh:NodeShape ;
        sh:property [
            sh:path ex:id ;
            sh:datatype xsd:string ;
            sh:minCount 1 ;
            sh:maxCount 1 ;
            sh:order "0"^^xsd:decimal ;
            graphql:isIDField true ;
        ] ;
        sh:property [
            sh:path ex:name ;
            sh:datatype xsd:string ;
            sh:minCount 1 ;
            sh:maxCount 1 ;
            sh:order "1"^^xsd:decimal ;
        ] ;
        sh:property [
            sh:path ex:height ;
            sh:datatype xsd:decimal ;
            sh:maxCount 1 ;
            sh:order "2"^^xsd:decimal ;
        ] ;
        sh:property [
            sh:path ex:friends ;
            sh:node ex:Human ;
            sh:order "3"^^xsd:decimal ;
        ] .
If you have used the automated GraphQL Schemas to RDF/SHACL conversion to produce the SHACL shape above, then the input type definition might have looked as follows:

    type Human {
        id: ID!
        name: String!
        height: Float
        friends: [Human]
    }
The processor will internally generate the following GraphQL schema:
    schema {
        query: RootRDFQuery
    }

    type RootRDFQuery {
        humans (... filters etc, see later...): [Human]
        ... generated fields for aggregations and introspection ...
    }

    type Human {
        uri: ID!
        label: String!
        id (... filters etc...): ID!
        name (... filters etc...): String!
        height (... filters etc...): Float
        friends (... filters etc...): [Human]
        ... generated fields for aggregations, derived values ...
    }
As shown above, the system automatically produces a root query object that has fields for every public shape, with a name that is basically the plural form of the shape name. These root query fields can take a large number of arguments to select which of the matching objects shall be returned, but we get to these details in Querying RDF Graphs with GraphQL.
Completing this introductory example, here is an example GraphQL query against this schema, returning all humans where the name starts with L, and all their friends, translating the height from meters to feet.
    {
        humans (where: {name:{pattern:"^L"}}, orderBy: name) {
            id
            name
            height (transform: "$height / 0.3048")
            friends {
                id
                name
            }
        }
    }
A possible result JSON would be:
    {
        "data": {
            "humans": [
                {
                    "id": "1003",
                    "name": "Leia Organa",
                    "height": 4.921259842519685,
                    "friends": [
                        { "id": "1002", "name": "Han Solo" },
                        { "id": "1000", "name": "Luke Skywalker" }
                    ]
                },
                ...
            ]
        }
    }
For each published node shape in a schema, the processor will create one GraphQL object type as described in the following sections.
The name of this object type will be derived using the following rules (in order):
1. Use the graphql:name of the shape, if one is defined.
2. Otherwise, use the local name of the shape.
3. If there is more than one object type with the same name (e.g. from different namespaces but with the same local name), then prepend the prefix of the namespace and '_'. For example, ex:Human would become ex_Human.
In general, the mapping is rather strict if the underlying shape definitions are invalid.
For example, if no valid name can be produced for a shape, then the schema is rejected and the user is encouraged to add suitable graphql:name triples.
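The naming rules above can be illustrated with a short Python sketch. The function `derive_names` is hypothetical and only mirrors the described order: an explicit graphql:name first, then the local name, then a prefix on collisions. Two points are assumptions rather than documented behavior: that the prefix is prepended to whichever candidate name collided, and that a name is "valid" when it matches the GraphQL name grammar (a letter or underscore followed by letters, digits or underscores).

```python
import re
from collections import Counter

# Name grammar from the GraphQL specification.
GRAPHQL_NAME = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")


def derive_names(shapes):
    """Map shape qnames to object type names.

    `shapes` maps a qname such as "ex:Human" to its graphql:name value,
    or to None if the shape has no graphql:name (hypothetical input format).
    """
    # Rules 1 and 2: explicit graphql:name, else the local name.
    candidates = {}
    for qname, explicit in shapes.items():
        candidates[qname] = explicit if explicit else qname.split(":", 1)[1]

    # Rule 3: on duplicates, prepend the namespace prefix and "_".
    counts = Counter(candidates.values())
    names = {}
    for qname, name in candidates.items():
        if counts[name] > 1:
            name = qname.split(":", 1)[0] + "_" + name
        if not GRAPHQL_NAME.fullmatch(name):
            # Mirrors the strict behavior: reject and ask for graphql:name.
            raise ValueError(f"No valid name for {qname}; add a graphql:name triple")
        names[qname] = name
    return names
```

With this sketch, `ex:Human` and a second shape `other:Human` would be named `ex_Human` and `other_Human`, while a shape such as `ex:my-shape` without a graphql:name would cause the whole schema to be rejected.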
uri Field
Each generated object type has a built-in field called uri that can be used to retrieve the URI of the RDF resource.
For blank nodes this is an internal identifier starting with _:.
In general, these blank node identifiers can be used interchangeably with URIs.
label Field
Each generated object type has a built-in field called label that can be used to retrieve a human-readable label for an object.
This label is typically derived from the rdfs:label (or a similar property) and should use the preferred language of the client, if multi-lingual labels exist.
The label field always returns something, falling back to the local name of the underlying RDF resource, or an internal identifier starting with _: for blank nodes.
The object types produced from a node shape will have one field for each distinct sh:path that is defined at any property shape of the node shape.
If the node shape is also an rdfs:Class then this includes any property shape of the (transitive) superclasses.
Furthermore, any property shapes attached to values of sh:node of the node shape will (recursively) be included.
(As a general pattern, rdfs:subClassOf and sh:node are treated uniformly, i.e. sh:node is an extension and inheritance mechanism similar to subclassing.)
The names of these fields are derived using the same rules as for object types, i.e. checking graphql:name first, then the local name of the sh:path (if that is a URI), and prepending a prefix if duplicate names would otherwise exist.
Note that if a property shape is about a complex SHACL path, then a graphql:name is required.
The type of these generated fields is derived from the sh:datatype, sh:node or sh:class. For example, sh:datatype xsd:boolean gets mapped to Boolean and sh:datatype xsd:decimal to Float.
To produce ID, annotate the property shape with graphql:isIDField true combined with sh:datatype xsd:string.
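As a rough illustration, the datatype mapping can be thought of as a lookup table with one special case for ID. The sketch below is hypothetical and deliberately incomplete: it lists only the mappings that this document states or shows in its example (xsd:string to String is taken from the Human example), and it returns None for anything else rather than guessing how the processor treats other datatypes.

```python
# Only the mappings mentioned in this document; the real processor
# is assumed to support more datatypes than listed here.
SCALARS = {
    "xsd:boolean": "Boolean",
    "xsd:decimal": "Float",
    "xsd:string": "String",
}


def scalar_type(datatype, is_id_field=False):
    """Return the GraphQL scalar name for a sh:datatype, or None if unlisted."""
    # graphql:isIDField true combined with xsd:string produces ID.
    if is_id_field and datatype == "xsd:string":
        return "ID"
    return SCALARS.get(datatype)
```

Applied to the Human shape above, `ex:id` would yield `ID`, `ex:name` would yield `String` and `ex:height` would yield `Float`.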
If the property shape defines an sh:or list with at least one member, and all members of that list are node shapes with a URI, then a union type will be generated automatically.
If the property shape has an sh:or list that is either xsd:string or rdf:langString, or its inverse variation rdf:langString or xsd:string, then the object type LangString (with fields lang and string) will be used.
For other sh:or lists where the first entry is a sh:datatype shape, that specified datatype will be used.
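The three sh:or cases can be summarized as a small decision function. This is a sketch only: `or_list_type` and its input format (a list of dicts carrying either a `node` qname or a `datatype` qname) are hypothetical, and the order in which the checks run is an assumption based on the order of the description. The name that the processor gives to a generated union type is not covered here.

```python
LANG_STRING_PAIR = {"xsd:string", "rdf:langString"}


def or_list_type(members):
    """Classify a sh:or list.

    Each member is a dict with either a "node" key (URI node shape)
    or a "datatype" key (sh:datatype shape). Hypothetical input format.
    """
    if not members:
        return None
    # Case 1: every member is a node shape with a URI -> union type.
    if all(m.get("node") for m in members):
        return ("union", [m["node"] for m in members])
    datatypes = [m.get("datatype") for m in members]
    # Case 2: xsd:string and rdf:langString in either order -> LangString.
    if len(members) == 2 and set(datatypes) == LANG_STRING_PAIR:
        return ("object", "LangString")
    # Case 3: otherwise the datatype of the first entry is used.
    if datatypes[0]:
        return ("scalar", datatypes[0])
    return None
```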
For object-valued properties for which there is no matching GraphQL object type, the system falls back to a built-in special type that is defined as follows:
    # A resource for which only the URI and label can be queried.
    type _Resource {
        uri: ID!
        label: String!
    }
This type is, for example, used for links that typically go outside of the published schema, e.g. rdf:type values.
Fields are list-typed unless there is a property shape with sh:maxCount 1.
Fields are marked as non-nullable (with !) if there is a sh:minCount 1.
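The two cardinality rules combine into the final field type as in the following sketch. The helper `field_type` is hypothetical. The single-valued cases reproduce the Human example from earlier in this document; where the ! is placed for a list-typed field, and the treatment of sh:minCount values greater than 1, are assumptions that this document does not specify.

```python
def field_type(base, min_count=0, max_count=None):
    """Wrap a base type name according to sh:minCount and sh:maxCount."""
    # List-typed unless sh:maxCount is 1.
    result = base if max_count == 1 else f"[{base}]"
    # Non-nullable if a minimum count is required.
    # Assumption: any sh:minCount >= 1 counts, and for lists the "!"
    # is appended to the list itself rather than to its items.
    if min_count >= 1:
        result += "!"
    return result
```

For instance, `field_type("ID", min_count=1, max_count=1)` gives `ID!`, `field_type("Float", max_count=1)` gives `Float`, and `field_type("Human")` gives `[Human]`, matching the generated Human type shown above.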
Note that, in general, any property shape that is marked as sh:deactivated true is ignored by the processor.
A dataset may contain many named graphs and heterogeneous data.
It is possible to define multiple GraphQL schemas for the same data, and in the same shapes graph.
Each instance of graphql:Schema can either be identified by its URI or by its graphql:name.
This is best explained through an example.
    ex:MySchema
        a graphql:Schema ;
        graphql:name "starwars" ;
        graphql:publicShape ex:Human .
The above schema is available through the URL pattern [server]/graphql/[dataset]/starwars.
If a schema does not carry a graphql:name then it can be accessed via the qname of its URI, replacing : with _. For the example above, [server]/graphql/[dataset]/ex_MySchema would also work.
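The way a schema maps to its endpoint path can be sketched as follows. The function `schema_endpoints` is hypothetical, and the `[server]` and `[dataset]` placeholders are left exactly as in the text above. It assumes that a named schema is reachable under both its graphql:name and its qname form, as the example suggests.

```python
def schema_endpoints(server, dataset, schema_qname, graphql_name=None):
    """Return the endpoint URLs under which a graphql:Schema can be reached."""
    # The qname form replaces ":" with "_", e.g. ex:MySchema -> ex_MySchema.
    qname_segment = schema_qname.replace(":", "_")
    segments = [graphql_name, qname_segment] if graphql_name else [qname_segment]
    return [f"{server}/graphql/{dataset}/{segment}" for segment in segments]


# The starwars example from above, with the placeholders kept as-is.
urls = schema_endpoints("[server]", "[dataset]", "ex:MySchema", "starwars")
```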