GraphQL Schemas to RDF/SHACL

This document is part of the TopQuadrant GraphQL Technology Pages

GraphQL includes a schema language for defining object types and fields. In order to use GraphQL with RDF-based technologies like TopBraid Suite (version 6 onwards), we needed to define a mapping from the GraphQL schema language to RDF, in particular to SHACL shapes.

This mapping makes it possible to leverage GraphQL schemas as RDF data models and use them in conjunction with existing RDF models. Further, the mapping enables us to use RDF technology (graph databases, rule engines, SPARQL and data validation) in conjunction with GraphQL-based applications. The mapping uses syntactic extensions (through GraphQL's directives extension point) to express richer SHACL constraints. This way, GraphQL can also be used as a user-friendly compact syntax for SHACL.

This document uses the prefix dash for the namespace http://datashapes.org/dash#, which is accessible via its URL http://datashapes.org/dash. The prefix graphql stands for the namespace http://datashapes.org/graphql#, which is accessible via its URL http://datashapes.org/graphql.

Overview

GraphQL is an increasingly popular language for describing queries and updates using a JSON-based architecture. As the name suggests, GraphQL has been designed for graph-shaped data models consisting of objects that have fields, and fields may hold scalar values (aka literals) or link to other objects. GraphQL's schema language defines a syntax to declare the types of such objects and fields.

Graphs also play a fundamental role in the Semantic Web world built upon the RDF family of languages. RDF introduces the concept of nodes that are either literals or resources with a URI or identified by internal IDs only (blank nodes). These nodes are used in triples that link a subject via a predicate (property) to an object node. While RDF Schema has a notion of classes that bears similarities with GraphQL object types, there is a closer resemblance to the concept of shapes from the SHACL specification. Like GraphQL object types, SHACL shapes also define fields via so-called property shapes, and can include constructs to define the permissible value types of these fields, and many other constraint types. Furthermore, GraphQL is often implemented as a view over data that is stored elsewhere, in different forms. SHACL shapes also represent views on RDF nodes, allowing different dicing and slicing of data to support different use cases.

To bring these two worlds closer together we defined a mapping from GraphQL schemas to RDF/SHACL data models. As a result, structure from existing GraphQL systems can be re-used and JSON-based data can be seamlessly converted into RDF graphs, for example to accomplish data integration tasks.

All GraphQL documents described here are valid GraphQL syntax. A key design principle for us was to avoid syntactical elements that would break a GraphQL parser. In some places the directives extension point of GraphQL is used. Directives were specifically designed for tools to hook into GraphQL with information that is ignored by other tools that are not aware of their meaning. As soon as one or more tools agree on a set of such directives, a dialect of GraphQL emerges, and using these directives is not limited to one particular implementation approach. See GraphQL Data Shapes Directives for a general overview of most of the directives used here, written for users without prior knowledge of RDF technology. The rest of this page assumes familiarity with RDF and SHACL.

The following example GraphQL file defines a couple of object types, with fields and an enumeration.

# A user account
type User {
	name: String!
	age: Int               @shape(minInclusive: 18)
	gender: Gender
	purchases: [Purchase]
}

type Purchase {
	# The internal ID
	productId: String!     @shape(minLength: 8, pattern: "[0-9]+")
}

enum Gender {
	FEMALE
	MALE
}

The GraphQL schema above can be translated into the following RDF/SHACL, in Turtle notation. Note that in this document we use blank nodes to represent property shapes, for brevity. In many practical applications it is more sensible to use URIs for them, such as ex:User-name.

ex:User
	a sh:NodeShape ;
	rdfs:comment "A user account" ;
	sh:property [
		sh:path ex:name ;
		sh:datatype xsd:string ;
		sh:maxCount 1 ;
		sh:minCount 1 ;
		sh:order 0 ;
	] ;
	sh:property [
		sh:path ex:age ;
		sh:datatype xsd:integer ;
		sh:maxCount 1 ;
		sh:minInclusive 18 ;
		sh:order 1 ;
	] ;
	sh:property [
		sh:path ex:gender ;
		sh:maxCount 1 ;
		sh:node ex:Gender ;
		sh:order 2 ;
	] ;
	sh:property [
		sh:path ex:purchases ;
		sh:node ex:Purchase ;
		sh:order 3 ;
	] .

ex:Purchase
	a sh:NodeShape ;
	sh:property [
		sh:path ex:productId ;
		sh:datatype xsd:string ;
		sh:description "The internal ID" ;
		sh:maxCount 1 ;
		sh:minCount 1 ;
		sh:minLength 8 ;
		sh:pattern "[0-9]+" ;
		sh:order 0 ;
	] .
	
ex:Gender
	a sh:NodeShape ;
	sh:in ( "FEMALE" "MALE" ) .

The remaining sections of this document drill down into the technical details of this mapping. The mapping is defined in the direction from GraphQL to SHACL, allowing all GraphQL schemas to be treated as RDF/SHACL models. The mapping can also be applied in reverse order, to produce GraphQL schemas from existing RDF/SHACL models. However, that reverse mapping is partial, i.e. not all SHACL constructs have a GraphQL equivalent.

GraphQL Names to URIs

The concept of URIs plays a key role in RDF modeling. URIs are used to uniquely identify resources, properties and even graphs. GraphQL does not natively have a concept of global identifiers, nor does it have a concept of namespaces (all GraphQL names are simple Java-like identifiers). Therefore, we designed conventions and instructions on how to turn any GraphQL schema into RDF graphs and URIs.

Identifying Graphs and their Imports

A graph name is a URI that identifies an RDF graph in a data set and in Linked Data use cases. In RDF, graphs may import each other and then reference each other's terms. The GraphQL directive @graph can be used with a GraphQL schema definition, to declare the URI of the graph and any imported graphs. This is illustrated in the following example:

schema
	@graph (
		uri: "http://example.org/myGraph",
		imports: [
			"http://example.org/someGraph1",
			"http://xmlns.com/foaf/0.1/"
		]
	)
	@prefixes (
		rdfs: "http://www.w3.org/2000/01/rdf-schema#",
		ex: "http://example.org/myGraph/ (default)"
	)
{
	query: Query
}

The corresponding Turtle for the GraphQL snippet above is the following:

@prefix dash: <http://datashapes.org/dash#> .
@prefix ex: <http://example.org/myGraph/> .
@prefix graphql: <http://datashapes.org/graphql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/myGraph>
	a owl:Ontology ;
	a graphql:Schema ;
	owl:imports <http://example.org/someGraph1> ;
	owl:imports <http://xmlns.com/foaf/0.1/> ;
	sh:declare ex:PrefixDeclaration ;
	graphql:defaultPrefix ex:PrefixDeclaration ;
	graphql:queryShape ex:Query ;
	graphql:publicShape ex:Query .
	
ex:PrefixDeclaration
	a sh:PrefixDeclaration ;
	sh:prefix "ex" ;
	sh:namespace "http://example.org/myGraph/"^^xsd:anyURI .

Note that the converter has inserted default prefixes for a collection of the well-known namespaces owl, rdf, rdfs, sh and xsd. These are always assumed to be present, e.g. for parsing qnames during conversion.

At a schema definition, the directive @graph can have an argument uri of type String to specify the URI of the (named) graph itself. If no such argument has been found in the GraphQL document, then the system will use the declared default namespace (see the next section), if that exists. If no default namespace has been declared, then the surrounding code is expected to provide a default graph URI. This may, for example, be a URL related to the GraphQL service or a URL derived from the name of the GraphQL schema file.
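
The fallback order for the graph URI can be sketched in Python as follows. This is only an illustration of the rule stated above: the function and parameter names are hypothetical and do not belong to any TopBraid API, and fallback_uri stands in for whatever default the surrounding code provides.

```python
from typing import Optional


def resolve_graph_uri(
    graph_uri_arg: Optional[str],
    default_namespace: Optional[str],
    fallback_uri: str,
) -> str:
    """Pick the graph URI using the fallback order described in the text."""
    # 1. An explicit @graph(uri: ...) argument wins.
    if graph_uri_arg:
        return graph_uri_arg
    # 2. Otherwise use the declared default namespace, if there is one.
    if default_namespace:
        return default_namespace
    # 3. Otherwise rely on a URI supplied by the surrounding code
    #    (hypothetical parameter, e.g. derived from the schema file name).
    return fallback_uri
```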

The resource representing the named RDF graph gets rdf:type graphql:Schema. The use of the rdf:type owl:Ontology is optional, yet recommended.

The argument imports of the @graph directive takes an array with items of type String, each of which is turned into an owl:imports statement for the graph.

Namespace Prefixes

The GraphQL directive @prefixes can be used to declare RDF namespace prefixes. These prefixes are used for the remainder of the conversion to turn GraphQL names into RDF IRIs. The rule is that if a GraphQL name starts with a declared prefix and then the underscore, then the IRI will be the namespace of the given prefix plus the remainder of the GraphQL name. So with the prefix declaration above, the GraphQL name rdfs_Class is expanded into the RDF resource rdfs:Class, aka http://www.w3.org/2000/01/rdf-schema#Class.
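
As a rough illustration of the prefix rule, the following Python sketch expands a GraphQL name into an IRI. The function name is hypothetical, and trying longer prefixes first is an assumption made here to keep the result deterministic when one prefix is the start of another; the text above does not specify that case.

```python
from typing import Dict, Optional


def expand_prefixed_name(name: str, prefixes: Dict[str, str]) -> Optional[str]:
    """Return the IRI for a name such as rdfs_Class, or None if no prefix matches."""
    # Assumption: try longer prefixes first so that overlapping prefixes
    # are resolved the same way on every run.
    for prefix in sorted(prefixes, key=len, reverse=True):
        marker = prefix + "_"
        if name.startswith(marker):
            # IRI = namespace of the prefix + remainder of the GraphQL name.
            return prefixes[prefix] + name[len(marker):]
    return None


prefixes = {"rdfs": "http://www.w3.org/2000/01/rdf-schema#"}
print(expand_prefixed_name("rdfs_Class", prefixes))
# prints http://www.w3.org/2000/01/rdf-schema#Class
```

Names for which the function returns None would be handled by the default namespace instead.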

GraphQL names that do not match a given prefix (i.e. any plain name without underscores) are mapped to URIs based on a default namespace. That default namespace can be defined explicitly by adding " (default)" to the end of a declared namespace, e.g.

schema
	@prefixes (
		ex: "http://example.com/ (default)"
	) ...

In this case, a GraphQL name person would become the URI http://example.com/person. If no such default namespace has been defined, the system will derive a default namespace from the graph URI (using @graph(uri: ...) as described in the previous section): If the graph URI ends with one of the gen-delim characters such as /, # or : then it will become the default namespace. Otherwise, the default namespace will be the graph URI plus the # character.
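
The selection of the default namespace can be sketched as below. The names are hypothetical, and the set of gen-delim characters is taken from RFC 3986 as an assumption, since the text above only names /, # and : as examples.

```python
from typing import Dict, Optional

DEFAULT_MARKER = " (default)"
# Assumption: the gen-delims set from RFC 3986.
GEN_DELIMS = ":/?#[]@"


def default_namespace(prefixes: Dict[str, str], graph_uri: Optional[str]) -> Optional[str]:
    """Return the default namespace, or None if none can be determined."""
    # An explicitly marked namespace takes priority.
    for namespace in prefixes.values():
        if namespace.endswith(DEFAULT_MARKER):
            return namespace[: -len(DEFAULT_MARKER)]
    if not graph_uri:
        return None
    # Otherwise derive the namespace from the graph URI.
    if graph_uri[-1] in GEN_DELIMS:
        return graph_uri
    return graph_uri + "#"


ns = default_namespace({"ex": "http://example.com/ (default)"}, None)
print(ns + "person")
# prints http://example.com/person
```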

The Root Query and Public Shapes

As explained in the following sections, GraphQL types are converted to SHACL shapes. The property graphql:queryShape can be used to remember the SHACL shape that was created from the GraphQL type referenced by query in the schema declaration. The subject of the graphql:queryShape is the graph resource itself. No graphql:queryShape triple is created if the GraphQL root query type is called _.

All shapes created for GraphQL object types and interfaces are recorded in the RDF schema resource as values of graphql:publicShape. This makes it possible to later reconstruct which of the shapes (in a data set of multiple RDF graphs) belong together.

Types to Node Shapes

GraphQL object types, interface types and union types are mapped to SHACL node shapes as outlined in the following subsections.

Object Types

Each GraphQL type is turned into a SHACL node shape. As shown in the following example, sh:node will be used when this type is referenced by a field:

type Human {
	friends: [Human]
}

ex:Human
	a sh:NodeShape ;
	sh:property [
		sh:path ex:friends ;
		sh:node ex:Human ;
	] .

Use the @class directive if the type shall also be turned into an rdfs:Class:

type Human @class {
	...
}

ex:Human
	a sh:NodeShape ;
	a rdfs:Class ;
	...

To let our engine know that it should convert all GraphQL types in your file into shapes that are also classes, annotate the schema with the @classes directive. Individual types can then override this default using @noClass:

schema @classes
	...

type HumanClass {           # This becomes a sh:NodeShape + rdfs:Class
	...
}

type HumanShape @noClass {  # This becomes a sh:NodeShape only
	...
}

Note that if the value type of a field is converted into a node shape that is also a class (either through the @class directive or the @classes directive on the whole schema), then the property sh:class will be used instead of sh:node to link shapes via fields.
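
The choice between sh:class and sh:node can be summarized in a small Python sketch. The function names and the representation of directives as sets of names are hypothetical. Letting @noClass win when a type carries both @class and @noClass is an assumption, as that combination is not covered above.

```python
from typing import Set


def becomes_class(type_directives: Set[str], schema_directives: Set[str]) -> bool:
    """Decide whether a GraphQL type becomes a node shape that is also a class."""
    if "noClass" in type_directives:
        # Per-type override of a schema-wide @classes default.
        return False
    if "class" in type_directives:
        return True
    return "classes" in schema_directives


def link_property(value_type_directives: Set[str], schema_directives: Set[str]) -> str:
    """Return the property that links a field to its value type."""
    if becomes_class(value_type_directives, schema_directives):
        return "sh:class"
    return "sh:node"
```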

Use the subClassOf parameter of the @class directive to specify one or more super classes, producing rdfs:subClassOf statements:

type Customer @class(subClassOf: Human) {
	...
}

ex:Customer
	a sh:NodeShape ;
	a rdfs:Class ;
	rdfs:subClassOf ex:Human ;
	...

Use the @display directive with the label argument to specify the value of rdfs:label for the resulting shape. Labels can include a language tag as shown in the example:

type Customer @display(label: "Customer", label_de: "Kunde") {
	...
}

ex:Customer
	a sh:NodeShape ;
	rdfs:label "Customer" ;
	rdfs:label "Kunde"@de ;
	...

Each GraphQL type can be annotated with the @shape directive to give additional input to the conversion. @shape can take a parameter targetClass that is translated into one or more sh:targetClass statements on the node shape. The values of targetClass and subClassOf must be either:

GraphQL names (such as targetClass: Person)
Strings (such as targetClass: "ex:Person")
Arrays of the above.

In the case of strings, the values can be RDF qnames (using the defined namespace prefixes) or, if this fails, full URIs. Strings may also be GraphQL names. The following example refers to the class ex:Human, assuming that ex: is the prefix of the default namespace:

type Customer @shape(targetClass: Human) {
	...
}

ex:Customer
	a sh:NodeShape ;
	sh:targetClass ex:Human ;
	...
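
One possible way to resolve targetClass and subClassOf values is sketched below in Python. All names are hypothetical. The sketch does not distinguish GraphQL names from strings, because both arrive as Python strings here, and it treats any value with an unknown prefix before the colon as a full URI, which is a simplifying assumption.

```python
from typing import Dict, List, Union


def resolve_reference(value: str, prefixes: Dict[str, str], default_ns: str) -> str:
    """Resolve a single GraphQL name, qname or full URI to a URI string."""
    if ":" in value:
        prefix, local = value.split(":", 1)
        if prefix in prefixes:
            return prefixes[prefix] + local  # qname such as ex:Person
        return value  # assumption: anything else with a colon is a full URI
    if "_" in value:
        prefix, local = value.split("_", 1)
        if prefix in prefixes:
            return prefixes[prefix] + local  # GraphQL name such as rdfs_Class
    return default_ns + value  # plain GraphQL name such as Human


def resolve_references(
    values: Union[str, List[str]], prefixes: Dict[str, str], default_ns: str
) -> List[str]:
    """Accept a single value or an array of values."""
    if isinstance(values, str):
        values = [values]
    return [resolve_reference(v, prefixes, default_ns) for v in values]
```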

The GraphQL Data Shapes Directives document introduced the notion of URI templates that can be attached to GraphQL type definitions as follows:

type Human @uri(template: "http://example.org/human/{$id}") {
	id: ID!
	name: String!
	friends: [Human]
}

These values get translated into values of the dedicated property graphql:uriTemplate:

ex:Human
	a sh:NodeShape ;
	graphql:uriTemplate "http://example.org/human/{$id}" ;
	...
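
To make the intent of such templates concrete, the following Python sketch fills a template from the field values of an object. It assumes that each {$name} placeholder is replaced by the value of the field with that name. The exact substitution and encoding rules are defined by the GraphQL Data Shapes Directives document, not by this sketch, and the function name is hypothetical.

```python
import re
from typing import Mapping


def expand_uri_template(template: str, fields: Mapping[str, object]) -> str:
    """Replace each {$name} placeholder with the value of the field called name."""
    # Assumption: values are inserted as-is, without percent-encoding.
    return re.sub(
        r"\{\$([_A-Za-z][_0-9A-Za-z]*)\}",
        lambda match: str(fields[match.group(1)]),
        template,
    )


print(expand_uri_template("http://example.org/human/{$id}", {"id": "1000"}))
# prints http://example.org/human/1000
```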

Interface Types

GraphQL interfaces play a role very similar to that of object types, e.g. they can be used as the type of a field. The mapping of interfaces to SHACL is similar to that of object types. To distinguish interfaces from object types, shapes created from interfaces have the marker property graphql:isInterface set to true.

The only added feature is that object types can implement interfaces, creating a one-level-deep form of type extension or inheritance. In the RDF world such type extension is sometimes represented using rdfs:subClassOf and in the case of SHACL via an sh:node link from the "subclass" to the "superclass". Intuitively, ex:SubShape sh:node ex:SuperShape means that any instance that conforms to ex:SubShape must also conform to ex:SuperShape.

interface Character {
	id: ID!
}

type Human implements Character {
	id: ID!
	friends: [Character]
}

ex:Character
	a sh:NodeShape ;
	graphql:isInterface true ;
	sh:property [
		sh:path ex:id ;
		sh:datatype xsd:string ;
		sh:maxCount 1 ;
		sh:minCount 1 ;
	] .

ex:Human
	a sh:NodeShape ;
	sh:node ex:Character ;
	sh:property [
		sh:path ex:friends ;
		sh:node ex:Character ;
	] ;
	sh:property [
		sh:path ex:id ;
		sh:datatype xsd:string ;
		sh:maxCount 1 ;
		sh:minCount 1 ;
	] .

Union Types

GraphQL unions are equivalent to shapes that are the sh:or of multiple other shapes.

type Human {
	name: String
}

type Starship {
	length: Int
}

union SearchResult = Human | Starship

ex:Human
	a sh:NodeShape ;
	sh:property [
		sh:path ex:name ;
		sh:datatype xsd:string ;
		sh:maxCount 1 ;
	] .

ex:Starship
	a sh:NodeShape ;
	sh:property [
		sh:path ex:length ;
		sh:datatype xsd:integer ;
		sh:maxCount 1 ;
	] .

ex:SearchResult
	a sh:NodeShape ;
	sh:or (
		ex:Human
		ex:Starship
	) .

Fields to Property Shapes

Each GraphQL field declaration (from object types and interface types) gets mapped into a SHACL property shape, connected to the corresponding node shape via sh:property. The details of how to construct these property shapes are described in the following sub-sections.

Field Names to Paths

By default, the name of a field gets translated into a URI for a property following the namespace-based syntax rules from above. These then become values of sh:path in the property shape, as already shown in many examples.

SHACL also supports complex path expressions, using a SPARQL-based syntax, that can be used to walk properties in the inverse direction or take multiple steps at once. To produce such paths, use the @shape directive with path as shown:

type Class {
	superClasses: [Class]  @shape(path: "^rdfs:subClassOf")
}

ex:Class
	a sh:NodeShape ;
	sh:property [
		sh:path [ sh:inversePath rdfs:subClassOf ] ;
		graphql:name "superClasses" ;
		sh:node ex:Class ;
	] .

The values of path must be SPARQL path expressions that can be parsed using the available namespace prefixes.

The property graphql:name is recommended to remember the original GraphQL name in case the shapes are later translated back to a GraphQL schema. graphql:name is mandatory for property shapes that use path expressions.
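
The following Python sketch shows how a very small subset of path expressions could be turned into the Turtle value of sh:path. It only handles prefixed names, the inverse operator ^ and sequences separated by /. A real converter would need a complete SPARQL property path parser, and the function names are hypothetical.

```python
def step_to_turtle(step: str) -> str:
    """Render one path step; ^ marks an inverse path."""
    step = step.strip()
    if step.startswith("^"):
        return "[ sh:inversePath " + step[1:].strip() + " ]"
    return step


def simple_path_to_turtle(path: str) -> str:
    """Render a path made of prefixed names, ^ and / as a SHACL path in Turtle."""
    # Assumption: no full IRIs in angle brackets, no parentheses, no | or modifiers.
    steps = [step_to_turtle(step) for step in path.split("/")]
    if len(steps) == 1:
        return steps[0]
    # SHACL represents a sequence path as an RDF list of steps.
    return "( " + " ".join(steps) + " )"


print(simple_path_to_turtle("^rdfs:subClassOf"))
# prints [ sh:inversePath rdfs:subClassOf ]
```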

Shall we support a special syntax that does not need prefixes, e.g. @shape(path: INV_subClassOf) for sh:inversePath, or @shape(inversePath: subClassOf)? The inverse use case is very common, so syntactic sugar may help.

Scalar Types

If the type of a GraphQL field is a scalar type, then there is a sh:datatype constraint in the property shape. The datatype is selected according to the following table:

GraphQL Scalar Type	RDF Data Type
Boolean	xsd:boolean
Float	xsd:decimal
ID	xsd:string
Int	xsd:integer
String	xsd:string

Property shapes derived from ID fields get the value true for the property graphql:isIDField, in addition to sh:datatype xsd:string.
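
The table of built-in scalar types translates directly into a lookup, as in this Python sketch. The dictionary-based representation of a property shape and the function name are hypothetical and chosen only for illustration.

```python
# Mapping of built-in GraphQL scalar types to RDF datatypes, as in the table above.
SCALAR_DATATYPES = {
    "Boolean": "xsd:boolean",
    "Float": "xsd:decimal",
    "ID": "xsd:string",
    "Int": "xsd:integer",
    "String": "xsd:string",
}


def scalar_constraints(graphql_type: str) -> dict:
    """Return the property-shape statements implied by a built-in scalar type."""
    constraints = {"sh:datatype": SCALAR_DATATYPES[graphql_type]}
    if graphql_type == "ID":
        # ID fields are additionally flagged so they can be told apart from String.
        constraints["graphql:isIDField"] = True
    return constraints
```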

For explicitly declared scalar types that go beyond the GraphQL standard, the system will produce an instance of graphql:ScalarType, which is then referenced using sh:node. A user-defined scalar type can be annotated with the datatype argument in the @shape directive as follows:

scalar Content @shape(datatype: "rdf:HTML")

type WebPage {
	html: Content
}

ex:WebPage
	a sh:NodeShape ;
	sh:property [
		sh:path ex:html ;
		sh:node ex:Content ;
		sh:maxCount 1 ;
	] .

ex:Content
	a graphql:ScalarType ;
	graphql:datatype rdf:HTML .

Enum Types

GraphQL enum types (used to describe the permissible values of certain fields) become node shapes with an sh:in constraint.

type Unicorn {
	colors: Color
}

enum Color {
	# Yellow is our least favorite color
	YELLOW
	RED
	PINK
}

ex:Unicorn
	a sh:NodeShape ;
	sh:property [
		sh:path ex:colors ;
		sh:node ex:Color ;
		sh:maxCount 1 ;
	] .

ex:Color
	a sh:NodeShape ;
	sh:in (
		"YELLOW"
		"RED"
		"PINK"
	) .

If the enum has a @class directive, then the resulting node shape also becomes a class, and each value an instance of that class, with the name of the value as its rdfs:label. This option also allows the values to contain comments:

ex:Color
	a sh:NodeShape ;
	a rdfs:Class ;
	sh:in (
		ex:Color-YELLOW
		ex:Color-RED
		ex:Color-PINK
	) .

ex:Color-YELLOW
	a ex:Color ;
	rdfs:label "YELLOW" ;
	rdfs:comment "Yellow is our least favorite color" .

... for RED and PINK

Handling of Language-Tagged Strings

In RDF, the special datatype rdf:langString is used to represent language-tagged strings, such as "Haus"@de and "House"@en. Use the GraphQL object type LangString to represent such values. It maps to JSON objects with two fields: string (the lexical value such as "Haus") and lang (the language tag such as "de"):

type Concept {
	prefLabel: [LangString]
}

ex:Concept
	a sh:NodeShape ;
	sh:property [
		sh:path ex:prefLabel ;
		sh:datatype rdf:langString ;
	] .

GraphQL schema files that reference this LangString type should declare it as follows:

type LangString {
	string: String!
	lang: String
}

The lang field is optional because of a frequently encountered RDF case in which strings may or may not have a language tag, which can be expressed in SHACL as follows:

ex:Concept
	a sh:NodeShape ;
	sh:property [
		sh:path ex:prefLabel ;
		sh:or ( 
			[ sh:datatype xsd:string ]
			[ sh:datatype rdf:langString ]
		)
	] .

This SHACL design pattern does not have a direct equivalent in GraphQL schema syntax, yet is supported by tools that operate on SHACL directly.
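
As an illustration of how LangString JSON objects relate to RDF literals, this Python sketch renders such an object as a Turtle literal. The function name is hypothetical, and the escaping covers only backslashes, double quotes and line breaks.

```python
from typing import Mapping


def lang_string_to_turtle(value: Mapping[str, str]) -> str:
    """Render a LangString JSON object as a Turtle literal."""
    lexical = (
        value["string"]
        .replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    lang = value.get("lang")
    if lang:
        return f'"{lexical}"@{lang}'  # becomes an rdf:langString literal
    return f'"{lexical}"'  # without a tag it is a plain xsd:string literal


print(lang_string_to_turtle({"string": "Haus", "lang": "de"}))
# prints "Haus"@de
```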

Single-valued Fields to sh:maxCount 1

GraphQL fields that take single values, i.e. non-arrays/lists, are mapped to a sh:maxCount 1 constraint as shown:

type Person {
	name: String
	friends: [Person]
}

ex:Person
	a sh:NodeShape ;
	sh:property [
		sh:path ex:friends ;
		sh:node ex:Person ;
	] ;
	sh:property [
		sh:path ex:name ;
		sh:datatype xsd:string ;
		sh:maxCount 1 ;
	] .

Required Fields to sh:minCount 1

Single-valued GraphQL fields marked with a ! are mapped to a sh:minCount 1 constraint as shown below.

type Person {
	name: String!
}

ex:Person
	a sh:NodeShape ;
	sh:property [
		sh:path ex:name ;
		sh:datatype xsd:string ;
		sh:maxCount 1 ;
		sh:minCount 1 ;
	] .

Note that this rule does not apply to list-valued GraphQL fields, because ! means "not null", which an empty array [] also satisfies.
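
The two cardinality rules (sh:maxCount 1 for single-valued fields, sh:minCount 1 for required single-valued fields) can be combined in a short Python sketch. It works on the textual form of a field type such as String! or [Person]!, and the function name and dictionary output are hypothetical.

```python
def cardinality_constraints(field_type: str) -> dict:
    """Derive sh:maxCount and sh:minCount from a GraphQL field type."""
    field_type = field_type.strip()
    non_null = field_type.endswith("!")
    core = field_type[:-1] if non_null else field_type
    is_list = core.startswith("[")

    constraints = {}
    if not is_list:
        constraints["sh:maxCount"] = 1
        if non_null:
            constraints["sh:minCount"] = 1
    # List-valued fields get no count constraints, even with a trailing "!".
    return constraints
```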

Order of Fields

The property sh:order MAY be used to record the relative order of fields from the original GraphQL type. By default, the first property shape will get sh:order "0"^^xsd:decimal, etc. However, if a field declares a different order using @display(order: 7) then this number is used instead.
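
A possible reading of this rule is sketched below in Python. It assumes that the default value is the zero-based position of the field within its type, also for fields that follow one with an explicit order. The text above does not spell this out, and the function name is hypothetical.

```python
from decimal import Decimal
from typing import List, Optional, Tuple


def assign_orders(fields: List[Tuple[str, Optional[int]]]) -> List[Tuple[str, Decimal]]:
    """Pair each field name with its sh:order value (an xsd:decimal)."""
    return [
        # Use the @display(order: ...) value when given, else the position.
        (name, Decimal(index) if explicit is None else Decimal(explicit))
        for index, (name, explicit) in enumerate(fields)
    ]
```

For example, assign_orders([("name", None), ("age", 7)]) pairs name with 0 and age with 7.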

Input Types

GraphQL input types are used to formalize the arguments of fields in query instances. They are not mapped to shapes but to a specialized structure from the graphql: namespace. These structures can be useful for round-tripping of GraphQL documents, or to perform RDF queries over them, for example to explore linkage between various web services.

type Query {
	hero (name: String!): Character
}

ex:Query
	a sh:NodeShape ;
	sh:property [
		sh:path ex:hero ;
		sh:maxCount 1 ;
		graphql:inputValue [
			a graphql:InputValue ;
			sh:order "0"^^xsd:decimal ;
			graphql:name "name" ;
			graphql:type [
				a graphql:NonNullType ;
				graphql:type [
					a graphql:NamedType ;
					graphql:name "String" ;
				]
			]
		] ;
	] .

The example above is hopefully sufficient to get started. The graphql: namespace defines the three value type classes graphql:NamedType (with property graphql:name), graphql:NonNullType (with property graphql:type) and graphql:ListType (with property graphql:memberType).
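
The nesting of the three value type classes can be illustrated with a small recursive Python function that turns a GraphQL type reference into a nested dictionary mirroring the RDF structure. The dictionary layout, with "a" standing for rdf:type, and the function name are hypothetical.

```python
def parse_type(type_ref: str) -> dict:
    """Convert a GraphQL type reference such as [String!]! into a nested dict."""
    type_ref = type_ref.strip()
    if type_ref.endswith("!"):
        # A trailing "!" wraps the inner type in a graphql:NonNullType.
        return {"a": "graphql:NonNullType", "graphql:type": parse_type(type_ref[:-1])}
    if type_ref.startswith("[") and type_ref.endswith("]"):
        # Square brackets wrap the member type in a graphql:ListType.
        return {"a": "graphql:ListType", "graphql:memberType": parse_type(type_ref[1:-1])}
    return {"a": "graphql:NamedType", "graphql:name": type_ref}
```

Calling parse_type("String!") yields a graphql:NonNullType whose graphql:type is a graphql:NamedType named String, matching the structure in the example above.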

Directives for SHACL Constraints

The @shape directive can be used to attach other SHACL constraints to property shapes. These declarations may also be good practices for GraphQL schema development in general, and can also be used by non-RDF tools. The example below states that age >= 18.

type Adult {
	age: Int  @shape(minInclusive: 18)
}

ex:Adult
	a sh:NodeShape ;
	sh:property [
		sh:path ex:age ;
		sh:datatype xsd:integer ;
		sh:maxCount 1 ;
		sh:minInclusive 18 ;
	] .

We have defined an easy-to-use set of GraphQL directives that can significantly improve the value of GraphQL schemas for JSON-based data processing. They are described in detail on the GraphQL Data Shapes Directives page, which includes a table showing all supported constraint types. They intuitively map to corresponding SHACL constraints from the sh: namespace.
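
For literal-valued constraints such as minInclusive, minLength and pattern, the correspondence between @shape arguments and sh: properties is a simple renaming, as the following Python sketch suggests. It is limited to boolean, numeric and string values. Arguments such as path, datatype or targetClass need different handling and are not covered, and the function name is hypothetical.

```python
import json


def shape_args_to_turtle(args: dict) -> list:
    """Turn literal-valued @shape arguments into Turtle predicate-object lines."""
    lines = []
    for name, value in args.items():
        if isinstance(value, bool):  # check bool before int: True is also an int
            literal = "true" if value else "false"
        elif isinstance(value, (int, float)):
            literal = str(value)
        else:
            # json.dumps adds the quotes and escapes quotes and backslashes.
            literal = json.dumps(str(value))
        lines.append(f"sh:{name} {literal} ;")
    return lines


print(shape_args_to_turtle({"minLength": 8, "pattern": "[0-9]+"}))
# prints ['sh:minLength 8 ;', 'sh:pattern "[0-9]+" ;']
```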

We have intentionally left out some of the RDF-specific constraint types. For example, sh:languageIn and sh:uniqueLang may not be frequently needed from a GraphQL perspective where no language tags exist. sh:property has been left out because it is already covered via fields. sh:closed has been left out because it does not really make sense for property shapes. The shape-based constraint types sh:not, sh:and, sh:or, sh:xone and sh:qualifiedValueShape have been left out to reduce complexity. sh:nodeKind has been left out because GraphQL strictly separates literals and non-literals, and the distinction between blank nodes and URIs is not relevant because they are enforced by @uri directives. Any of them may however be supported in the future should users require them.

Directives for Display Metadata

In addition to constraints, property shapes may include various annotation properties. The GraphQL Data Shapes Directives page illustrates various use cases such as form building and comes with an example. Here we use the same examples with the equivalent RDF/SHACL triples.

schema
	@groups(
		NamesGroup: {
			label: "Names"
		},
		AddressGroup: {
			label: "Address"
			label_de: "Addresse"
		}
	)
...

type Customer {
	firstName: String			@display(group: NamesGroup,    label: "given name")
	lastName: String			@display(group: NamesGroup,    label: "family name")
	street: String				@display(group: AddressGroup)
	postalCode: String		@display(group: AddressGroup, label: "zip code", label_de: "Postleitzahl")
	country: String			@display(group: AddressGroup, defaultValue: "USA")
}

ex:NamesGroup
	a sh:PropertyGroup ;
	rdfs:label "Names" ;
	sh:order "0"^^xsd:decimal .

ex:AddressGroup
	a sh:PropertyGroup ;
	rdfs:label "Address" ;
	rdfs:label "Addresse"@de ;
	sh:order "1"^^xsd:decimal .

ex:Customer
	a sh:NodeShape ;
	sh:property [
		sh:path ex:firstName ;
		sh:datatype xsd:string ;
		sh:maxCount 1 ;
		sh:order "0"^^xsd:decimal ;
		sh:group ex:NamesGroup ;
		sh:name "given name" .
...