This document is part of the TopQuadrant GraphQL Technology Pages
This document describes the features of the TopBraid GraphQL services. TopBraid products automatically generate GraphQL schemas from existing SHACL shape definitions, as explained in Publishing RDF/SHACL Graphs as GraphQL. The generated GraphQL schemas include built-in facilities to filter data using direct value matching, complex query patterns and even SPARQL expressions. Furthermore, the schemas define aggregation, ordering and paging of results, as well as dynamically deriving values for the JSON response. Using the schemas, implementations can convert RDF graph data into JSON object structures (and back).
TopBraid takes SHACL shape definitions or GraphQL schemas as input and generates an enhanced GraphQL schema that provides numerous features to query data stored in an RDF dataset. In this document we take the viewpoint of a typical GraphQL user, such as a UI developer or data analyst, to explain which features are available. Our starting point is a GraphQL schema:
    type Human {
        id: ID!
        name: String!
        height: Float
        friends: [Human]
    }
The processor will internally convert it to SHACL and then generate the following GraphQL schema:
    schema {
        query: RootRDFQuery
    }

    type RootRDFQuery {
        humans (... filters etc, see later...): [Human]
        ... generated fields for aggregations and introspection ...
    }

    type Human {
        uri: ID!
        label: String!
        id (... filters etc...): ID!
        name (... filters etc...): String!
        height (... filters etc...): Float
        friends (... filters etc...): [Human]
        ... generated fields for aggregations, derived values ...
    }
As shown above, the system automatically produces a root query object with one field for every public shape, named essentially after the plural form of the shape name. These root query fields accept a large number of arguments that select which of the matching objects are returned; those arguments are covered later in this document.
Each object type has a field `uri` storing the URI of the underlying RDF resource and a field `label` to query a human-readable display label.
Completing this introductory example, here is an example GraphQL query against this schema, returning all humans where the name starts with L, and all their friends, translating the height from meters to feet.
    {
        humans (where: {name:{pattern:"^L"}}, orderBy: name) {
            id
            name
            height (transform: "$height / 0.3048")
            friends {
                id
                name
            }
        }
    }
A possible result JSON would be:
    {
        "data": {
            "humans": [
                {
                    "id": "1003",
                    "name": "Leia Organa",
                    "height": 4.921259842519685,
                    "friends": [
                        { "id": "1002", "name": "Han Solo" },
                        { "id": "1000", "name": "Luke Skywalker" }
                    ]
                },
                ...
            ]
        }
    }
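For readers who call the service from a script, the exchange above can be sketched in Python. This is only an illustration under stated assumptions: it builds a request body in the common GraphQL-over-HTTP JSON convention (an object with a `query` member) and reads a response shaped like the sample result. It does not contact a server, and the endpoint URL to POST to depends on the TopBraid installation and is deliberately left out. The helper names `build_request_body` and `friend_names_by_human` are hypothetical.

```python
import json

# The example query from above, kept as a plain string.
QUERY = """
{
  humans (where: {name: {pattern: "^L"}}, orderBy: name) {
    id
    name
    height (transform: "$height / 0.3048")
    friends { id name }
  }
}
"""


def build_request_body(query: str) -> str:
    """Serialize a query into the usual GraphQL-over-HTTP JSON body."""
    return json.dumps({"query": query})


# A response shaped like the sample result above, abridged to one human.
SAMPLE_RESPONSE = """
{ "data": { "humans": [
  { "id": "1003", "name": "Leia Organa", "height": 4.921259842519685,
    "friends": [ { "id": "1002", "name": "Han Solo" },
                 { "id": "1000", "name": "Luke Skywalker" } ] }
] } }
"""


def friend_names_by_human(response_text: str) -> dict:
    """Map each human's name to the list of their friends' names."""
    humans = json.loads(response_text)["data"]["humans"]
    return {h["name"]: [f["name"] for f in h["friends"]] for h in humans}


body = build_request_body(QUERY)
friends = friend_names_by_human(SAMPLE_RESPONSE)
```

Because the JSON result mirrors the shape of the query, client code can navigate it with the same field names it asked for.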
Most fields that are generated by the engine can take arguments to filter out which values to return. The various kinds of filters are described in the next sub-sections.
It is a common design pattern in GraphQL to allow filtering by direct value matches, e.g. give me all humans that appeared in "JEDI":
{ humans(appearsIn: "JEDI") { name } }
The processor produces one such argument for each field that is derived from the property shapes. This is done for the top-level query fields and the object-valued fields of each generated object type. The values of these arguments must be scalar JSON values. In order to match an object-valued property, use the URIs of the values as ID strings.
Use `uri: "..."` to return only the object with the given URI.
The `where` argument is an expressive way to filter values based on constraints similar to SHACL. It is available for all object-valued declared fields including the root query fields. The values of `where` are input objects with internal names such as `Human_where` and fields for each declared field of the type that is constrained.
This is best explained by means of an example. The following query returns all humans where at least one starship exists that has a length greater than or equal to 30 units:
    {
        humans(where: { starships: { exists: { length: { minInclusive: 30 } } } }) {
            name
            height
            homePlanet
            starships {
                name
                length
            }
        }
    }
The following types of constraints are (currently) supported.
| Parameter | Type | Condition |
|---|---|---|
| hasValue | Same as field | Object must have exactly the given value (plus maybe others) |
| minCount | Int | Object must have at least minCount values |
| maxCount | Int | Object must have at most maxCount values |
| minExclusive | Same as field | Object must have a value so that value > minExclusive |
| minInclusive | Same as field | Object must have a value so that value >= minInclusive |
| maxInclusive | Same as field | Object must have a value so that value <= maxInclusive |
| maxExclusive | Same as field | Object must have a value so that value < maxExclusive |
| pattern | String | Object must have a value that matches the given regular expression |
| flags | String | Optional flags for the pattern regex engine, such as "i" to ignore case |
| exists | Nested object | One of the values must conform to all nested constraints |
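To make the conditions in the table concrete, here is a small in-memory model of them in Python. It is a sketch for illustration, not how TopBraid evaluates `where`. Assumptions: objects are plain dictionaries mapping field names to lists of values; each range parameter is checked independently; `pattern` is treated as a partial match via `re.search`; and the starship lengths are made-up sample data. The function names `check_field` and `matches` are hypothetical.

```python
import re

# Range parameters: one value satisfying the comparison is enough.
RANGE_TESTS = {
    "minExclusive": lambda value, bound: value > bound,
    "minInclusive": lambda value, bound: value >= bound,
    "maxInclusive": lambda value, bound: value <= bound,
    "maxExclusive": lambda value, bound: value < bound,
}


def check_field(values, constraint):
    """Check one field's list of values against one constraint object."""
    if "hasValue" in constraint and constraint["hasValue"] not in values:
        return False
    if "minCount" in constraint and len(values) < constraint["minCount"]:
        return False
    if "maxCount" in constraint and len(values) > constraint["maxCount"]:
        return False
    for name, test in RANGE_TESTS.items():
        if name in constraint:
            bound = constraint[name]
            if not any(test(value, bound) for value in values):
                return False
    if "pattern" in constraint:
        # Assumption: "i" in flags means case-insensitive matching.
        flags = re.IGNORECASE if "i" in constraint.get("flags", "") else 0
        regex = re.compile(constraint["pattern"], flags)
        if not any(regex.search(str(value)) for value in values):
            return False
    if "exists" in constraint:
        # Values are nested objects; one of them must satisfy everything.
        nested = constraint["exists"]
        if not any(matches(value, nested) for value in values):
            return False
    return True


def matches(obj, where):
    """An object matches if every constrained field passes its checks."""
    return all(check_field(obj.get(field, []), c) for field, c in where.items())


# Made-up sample data, only to exercise the model.
luke = {"name": ["Luke Skywalker"],
        "starships": [{"name": ["X-Wing"], "length": [12.5]}]}
han = {"name": ["Han Solo"],
       "starships": [{"name": ["Millennium Falcon"], "length": [34.37]}]}

where = {"starships": {"exists": {"length": {"minInclusive": 30}}}}
selected = [h["name"][0] for h in (luke, han) if matches(h, where)]
```

With this sample data only Han Solo is selected, because only his starship has a length of at least 30.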
Object-valued fields can use the argument `queryText` to allow free-text search across the datatype values of any declared field of the object type. The values of `queryText` are interpreted as regular expressions based on SPARQL's `regex` operator and ignoring case.
The following example produces all humans where any scalar field (such as `name`, `height` or `homePlanet`) has a value starting with L:
{ humans (queryText: "^l") { name } }
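The effect of `queryText` can be approximated in a few lines of Python. This is a rough model, not the engine's implementation: Python's `re` module stands in for SPARQL's `regex`, non-string values are simply converted with `str()`, and the heights and home planets are illustrative sample data. The function name `query_text_matches` is hypothetical.

```python
import re


def query_text_matches(obj: dict, query_text: str) -> bool:
    """True if any scalar field value of obj matches the regex, ignoring case."""
    regex = re.compile(query_text, re.IGNORECASE)
    return any(regex.search(str(value)) for value in obj.values())


# Illustrative sample data.
humans = [
    {"name": "Luke Skywalker", "height": 1.72, "homePlanet": "Tatooine"},
    {"name": "Han Solo", "height": 1.8},
    {"name": "Leia Organa", "height": 1.5, "homePlanet": "Alderaan"},
]

names = [h["name"] for h in humans if query_text_matches(h, "^l")]
```

Note that the lower-case pattern `^l` still matches "Luke" and "Leia" because case is ignored.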
As the ultimate fallback with a maximum of expressivity, any field can take the argument `filter`, the values of which must be valid SPARQL FILTER expressions. When these SPARQL expressions are evaluated, certain variables have pre-defined values. The variables `$label` and `$uri` have the corresponding field values, and the special variable `$this` refers to the current query object (RDF resource). Furthermore, for each declared field of the object type that is the value type of the queried field, a corresponding variable will hold the value of the object that is matched.
For example, the following query retrieves all humans that are taller than 1.6 units:
{ humans (filter: "$height > 1.6") { name } }
Note that these pre-bound values are only supported for single-valued properties (with `sh:maxCount 1`) because the system would otherwise need to pick a "random" value from the underlying database. In order to query values of multi-valued properties, use SPARQL `EXISTS` or `NOT EXISTS` expressions.
As of TopBraid 6.1 this feature is deactivated by default (to reduce the complexity of the generated GraphQL schema). If needed, enable it using the Server Configuration Parameters page, or use Property Value Rules to expose selected kinds of aggregations.
Most database query languages support some form of aggregations, typically including `COUNT`, `SUM`, `MIN`, `MAX`, `AVG` and `MEDIAN`. Another form of aggregation is to build strings by concatenating multiple values. In this GraphQL implementation, only `COUNT` and `CONCAT` are supported for now. (Future versions may support SUM, MIN, MAX, AVG and MEDIAN if there is demand - let us know).
Aggregation fields are available for multi-valued fields only. Each multi-valued field is accompanied by a field named `xyz_COUNT`, producing an `Int` result for the number of values. COUNT fields can take the same filter arguments as the other fields, making it possible to count only certain values.
The following query returns the number of humans that have at least 4 friends:
{ humans_COUNT (where: {friends: {minCount: 4}}) }
Each multi-valued field is accompanied by a field named `xyz_CONCAT`, producing a `String` result by concatenating all matching values of the field. An empty string is delivered if there are no matching values. `_CONCAT` fields take an optional argument `separator` if the string should use something other than the default `", "` between sub-strings.
The following query returns, for each human, the name and a single string consisting of all friends starting with L, separated by `" and "`:
    {
        humans {
            label
            friends_CONCAT(where:{name:{pattern:"^L"}}, separator: " and ", orderBy: label)
        }
    }
An example response is:
    {
        "data": {
            "humans": [
                {
                    "label": "Character-1000",
                    "friends_CONCAT": "Character-1003"
                },
                {
                    "label": "Character-1002",
                    "friends_CONCAT": "Character-1000 and Character-1003"
                },
                {
                    "label": "Character-1003",
                    "friends_CONCAT": "Character-1000"
                }
            ]
        }
    }
The `_CONCAT` fields also support the `orderBy` arguments. If the concatenated values are objects, then each object will be represented by its `label` by default. To construct strings from other values, use `_CONCAT` with the argument `labelExpr`, which takes a SPARQL expression string as its value. In this expression you can access the same pre-bound variables as for `filter` expressions.
The following query would return a concatenation of the lower-case names of all friends.
{ humans { label friends_CONCAT (labelExpr: "LCASE($name)", orderBy: name) } }
The `labelExpr` argument can also be used for scalar fields, in which case the variable with the name of the underlying field holds the literal value.
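The behaviour of `_CONCAT` described in this section (default separator, empty result, ordering and `labelExpr`) can be modelled with a short Python function. This is a hedged sketch rather than the actual implementation: Python callables stand in for the SPARQL expressions, and the function and parameter names (`concat`, `label_expr`, `order_key`) are hypothetical. The labels and names are taken from the example responses in this document.

```python
def concat(values, separator=", ", label_expr=str, order_key=None):
    """Join the string form of each value; an empty list yields ''."""
    if order_key is not None:
        values = sorted(values, key=order_key)
    return separator.join(label_expr(value) for value in values)


# Two friends, using the labels and names from the examples above.
friends = [
    {"label": "Character-1003", "name": "Leia Organa"},
    {"label": "Character-1000", "name": "Luke Skywalker"},
]

# Default representation by label, custom separator, ordered by label.
by_label = concat(
    friends,
    separator=" and ",
    label_expr=lambda f: f["label"],
    order_key=lambda f: f["label"],
)

# Counterpart of labelExpr: "LCASE($name)" with orderBy: name.
lower_names = concat(
    friends,
    label_expr=lambda f: f["name"].lower(),
    order_key=lambda f: f["name"],
)

# No matching values: the result is an empty string.
empty = concat([])
```

Here `by_label` reproduces the `friends_CONCAT` value shown for Character-1002 in the example response.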
By default, RDF triples are unsorted. This section explains how a certain order can be accomplished, and how to page through large numbers of items.
`orderBy`, `orderByDesc` and `orderByExpr`
Any multi-valued field can take the argument `orderBy` to specify the order of values in the JSON array. The values of `orderBy` are the names of the fields (from an enumeration) of the object type, including `label` and `uri`. The values will be sorted in ascending order unless the argument `orderByDesc` is set to `true`, as shown in the following example.
{ humans (orderBy: height, orderByDesc: true) { name height } }
The optional argument `orderByExpr` can take a SPARQL expression as its value. If present, then each value will be run through the expression before being compared. In those expressions, the variable `$value` can be used to access the current value. For example, use `LCASE($value)` to order all values in lower-case form.
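The interplay of the three ordering arguments can be illustrated with Python's `sorted`. This is only an analogy under assumptions: a Python key function plays the role of the `orderByExpr` expression, Python's default string comparison stands in for whatever ordering the engine applies, and the name `order_values` is hypothetical.

```python
def order_values(values, order_by_expr=None, descending=False):
    """Sort values, optionally mapping each one through an expression first."""
    key = order_by_expr if order_by_expr is not None else (lambda value: value)
    return sorted(values, key=key, reverse=descending)


names = ["leia Organa", "Han Solo", "Luke Skywalker"]

# Without an expression, upper-case letters sort before lower-case ones here.
plain = order_values(names)

# Counterpart of orderByExpr: "LCASE($value)".
folded = order_values(names, order_by_expr=str.lower)

# Counterpart of orderByDesc: true.
heights_desc = order_values([1.72, 1.5, 1.8], descending=True)
```

The expression only affects the comparison; the values returned are still the original ones.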
`first`, `skip` and `orderAll`
If results are returned in order, then the arguments `first` and `skip` can be used to page through results. Both take integers. `first` states the maximum number of results that shall be produced and `skip` is the offset (starting at 0).
The following example returns the 3rd page of 10 humans, ordered by name.
{ humans(orderBy: name, first: 10, skip: 20) { name height } }
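The arithmetic behind the example (page 3 of size 10 gives `skip: 20`) can be written down as a small Python helper. The function names are hypothetical, and the list slice is only a stand-in for what the engine does with an already ordered result.

```python
def page_arguments(page_number: int, page_size: int) -> dict:
    """Translate a 1-based page number into first/skip values."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be positive")
    return {"first": page_size, "skip": (page_number - 1) * page_size}


def apply_paging(ordered_values, first, skip=0):
    """Return at most `first` values, starting at offset `skip`."""
    return ordered_values[skip:skip + first]


args = page_arguments(3, 10)
page = apply_paging(list(range(100)), **args)
```

Since `skip` starts at 0, the first page uses `skip: 0`, the second `skip: 10`, and so on.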
If `orderBy` is combined with `first`, the engine will by default first collect an arbitrary set of `first` values and sort only those. However, if you also specify `orderAll: true` then the engine will first walk through all available values, sort them, and then return the first values of that wholly sorted list. This gives more reliable results. By leaving out `orderAll` you can avoid worst-case performance, as sorting many thousands of values might become quite slow, and often it is enough to merely show any reasonable subset.
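The difference between the two strategies is easiest to see on a tiny list. In this Python sketch the arbitrary set that the engine collects is modelled as the leading values in storage order, which is an assumption made for illustration; the function name `first_sorted` and the sample names are hypothetical.

```python
def first_sorted(values, first, order_all=False):
    """Model of orderBy combined with first, with and without orderAll."""
    if order_all:
        # Sort everything, then cut: the true first values of the sorted list.
        return sorted(values)[:first]
    # Cut first, then sort: cheaper, but depends on which values were picked.
    return sorted(values[:first])


# Sample values in some arbitrary storage order.
stored = ["Luke", "Leia", "Han", "Chewbacca", "Anakin"]

quick = first_sorted(stored, 2)
reliable = first_sorted(stored, 2, order_all=True)
```

Both calls return a sorted pair, but only the `order_all` variant is guaranteed to start with the smallest values overall.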
This feature requires TopBraid 6.2 onwards.
By default, rdf:Lists are treated like any other RDF value, and the recursive structure of list nodes has no special support by GraphQL. If you want rdf:Lists to be handled like JSON arrays (multi-valued GraphQL fields) then you need to make sure that TopBraid can understand the intention and knows what type the list members have. The supported design pattern is described in this blog post. In particular, the `sh:node` of your property shape needs to be `dash:ListShape`:
    ex:PlayList
        a sh:NodeShape, rdfs:Class ;
        sh:property [
            sh:path ex:songs ;
            sh:maxCount 1 ;
            sh:node dash:ListShape ;
            sh:property [
                sh:path ( [ sh:zeroOrMorePath rdf:rest ] rdf:first ) ;
                sh:class ex:Song ;
            ]
        ] .
Note that the `sh:maxCount 1` is recommended to clarify that there can only be one list per subject. Cases with multiple `rdf:Lists` are not supported.
Each field with a scalar value type can take an argument `transform` that takes a SPARQL expression string as its value. If a `transform` is present, the values delivered for the field are passed into the expression (as a variable with the same name as the field), and the result of the evaluation will be returned instead of the original value. In these expressions, the values of any other single-valued field can also be accessed as named variables, as can `$label` and `$uri`, as well as `$this` to access the surrounding resource.
In the following example, the height of each human is converted from meters to feet, and the name is returned in all upper-case letters.
{ humans { name (transform: "UCASE($name)") height (transform: "$height * 3.28084") } }
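This document uses two expressions for the same unit conversion: `$height / 0.3048` in the introductory example and `$height * 3.28084` here. The following Python check is a worked piece of arithmetic, not part of any TopBraid API, showing that they agree to about six decimal places. The input of 1.5 is the height in meters implied by the sample result for Leia Organa earlier in this document.

```python
import math

METERS_PER_FOOT = 0.3048   # definition of the international foot
FEET_PER_METER = 3.28084   # rounded reciprocal used in the query above


def feet_by_division(height_m: float) -> float:
    """Counterpart of transform: "$height / 0.3048"."""
    return height_m / METERS_PER_FOOT


def feet_by_multiplication(height_m: float) -> float:
    """Counterpart of transform: "$height * 3.28084"."""
    return height_m * FEET_PER_METER


exact = feet_by_division(1.5)
rounded = feet_by_multiplication(1.5)
difference = abs(exact - rounded)
```

The division form is the more precise of the two, since 3.28084 is only a rounded value of 1 / 0.3048.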
As of TopBraid 6.1 this feature is deactivated by default (to reduce the complexity of the generated GraphQL schema and make it more difficult to write inefficient queries). If needed, enable it using the Server Configuration Parameters page, or use Property Value Rules to define the kinds of inferences that are supported for a schema.
While transformations can modify how existing values are returned, it is sometimes useful to compute arbitrary field values dynamically. Each object type has the following generated fields in addition to the declared fields: `deriveBoolean`, `deriveFloat`, `deriveInt` and `deriveString`. The value types of these are single values of the corresponding scalar types. They take a single required argument `expr` of type `String` that must be a valid SPARQL expression. In these expressions, the values of other properties of the underlying RDF resource can be queried as pre-bound variables. Furthermore, the variables `$this`, `$label` and `$uri` are pre-bound. The result of the expression will be returned in the specified scalar type.
It is common to use GraphQL aliases so that the names in the generated JSON will not start with "derive". The following example delivers the length of each human's name as a separate integer field:
{ humans { name nameLength: deriveInt(expr: "STRLEN($name)") } }
Producing for example:
    {
        "data": {
            "humans": [
                { "name": "Luke Skywalker", "nameLength": 14 },
                { "name": "Han Solo", "nameLength": 8 },
                { "name": "Leia Organa", "nameLength": 11 }
            ]
        }
    }
In addition to these single-valued `deriveXY` fields, there is also the special field `deriveStrings` that can be used to generate a list of String values. This takes a single required argument `query` that must be a SPARQL SELECT query producing a single result variable. The query will be executed with the variable `$this` pre-bound to the current context resource.
    {
        humans {
            name
            friendNames: deriveStrings (query: "SELECT ?name { $this starwars:friends/starwars:name ?name } ORDER BY ?name")
        }
    }
Producing for example:
    {
        "data": {
            "humans": [
                {
                    "name": "Luke Skywalker",
                    "friendNames": [ "C-3PO", "Han Solo", "Leia Organa", "R2-D2" ]
                },
                ...
            ]
        }
    }
Note that the June 2018 version of the GraphQL spec introduces multi-line strings using the `"""` syntax, making complex SPARQL queries more readable.
Any standards-compliant GraphQL implementation supports some built-in introspection query types that provide information about the available object and field types. However, these generic capabilities would not return the details of the richer RDF/SHACL model underneath, and would also include information about the various automatically generated objects and fields that are often not of interest to introspecting clients. The alternative introspection query types from this section are available to provide a view on the underlying SHACL structure.
The following example illustrates a typical use case, returning information that may be used to populate an input form for objects matching a given type:
    {
        _typeShapeByName(name: "Human") {
            groups {
                label
                fields {
                    label
                }
            }
        }
    }
Here is the shape introspection schema:
    type RootRDFQuery {
        ...
        _typeShapeByName(name: String!): _TypeShape
        _typeShapeByURI(uri: String!): _TypeShape
        _typeShapesForResource(uri: String!): [_TypeShape]
    }

    # Metadata about a type as derived from its shape(s).
    type _TypeShape {
        uri: String!
        name: String!
        label: String!
        rootQueryField: String
        fields: [_FieldShape]
        fieldByName (name: String!): _FieldShape
        groups: [_FieldGroup]
    }

    # Metadata about a field group.
    type _FieldGroup {
        uri: String!
        label: String!
        order: Float!
        fields: [_FieldShape]
    }

    # Metadata about a field as derived from its shape(s).
    type _FieldShape {
        name: String!
        label: String!
        datatype: _Resource
        externalType: _Resource
        typeShape: _TypeShape
        scalar: Boolean!
        unionTypeNames: [String]
        minCount: Int!
        maxCount: Int
        min: String
        max: String
        isMinExclusive: Boolean!
        isMaxExclusive: Boolean!
        minLength: Int!
        maxLength: Int
        pattern: String
        group: _FieldGroup
        order: Float!
    }
This information can be combined with follow-up queries using the standard introspection schema, for example to find the enumerated values of an enum.
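As a sketch of the input-form use case mentioned above, the following Python function turns a `_typeShapeByName` response into an ordered list of form sections. The response shape follows from the example query, but the group labels ("Identity", "Details") and the field assignments are invented sample data, and the function name `form_layout` is hypothetical. Because `_typeShapeByName`, `groups` and `fields` are nullable in the schema, the code tolerates missing values.

```python
import json

# Invented sample response for the example introspection query.
SAMPLE = """
{ "data": { "_typeShapeByName": { "groups": [
  { "label": "Identity", "fields": [ { "label": "id" }, { "label": "name" } ] },
  { "label": "Details",  "fields": [ { "label": "height" } ] }
] } } }
"""


def form_layout(response_text: str) -> list:
    """Return (group label, [field labels]) pairs in response order."""
    shape = json.loads(response_text)["data"]["_typeShapeByName"]
    if shape is None:
        # No type shape with the requested name.
        return []
    layout = []
    for group in (shape.get("groups") or []):
        labels = [field["label"] for field in (group.get("fields") or [])]
        layout.append((group["label"], labels))
    return layout


layout = form_layout(SAMPLE)
missing = form_layout('{ "data": { "_typeShapeByName": null } }')
```

A real form would typically also request `order`, `datatype`, `minCount` and `maxCount` for each field so that widgets and validation can be chosen per field.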
Note that the special cases of `xsd:string or rdf:langString` and its inverse variation `rdf:langString or xsd:string` (using `sh:or`) are mapped to `rdfs:Literal` in the `_FieldShape.datatype` field.