Introduction to ADS

Active Data Shapes (ADS) is an innovative scripting technology that enables people with JavaScript skills to interact with the knowledge graphs stored within the EDG asset collections. ADS closely integrates declarative graph schema definitions (using SHACL shapes) with imperative scripting language code by means of an API generator that mirrors the shapes of the graph data as JavaScript classes. Using familiar design patterns, Active Data Shapes may significantly lower the cost of entry into graph technology for “mainstream” developers, while greatly enhancing the productivity of graph technology power users. Let’s look at some basic concepts of how ADS works.

Note

ADS scripts generally execute server-side on the TopBraid platform. This means that ADS scripts can very quickly query and modify the data stored in the graphs.

ADS scripts can be edited with and executed from TopBraid EDG, in particular through the Script Editor panel. TopBraid provides various mechanisms to store ADS scripts alongside the ontologies so that they can be executed repeatedly, for example as web services.

Warning

ADS is extremely powerful. As with any full programming language, ADS scripts may contain errors that cause infinite loops and other nasty surprises. Scripts should only be edited by experts and be developed in a safe sandbox environment such as TopBraid EDG Studio. Furthermore, the administrator can disable editing ADS scripts for certain governance roles on the EDG server configuration parameters page. The JavaScript/ADS experts in your organization can prepare well-tested scripts for all other users.

ADS scripts may also be executed on the Node.js platform. In that role, ADS serves as a client API for external Node.js applications to interact with TopBraid EDG. Finally, ADS code may be embedded into Web applications to simplify the development of components that execute in the browser. However, even in that role, ADS scripts are always executed server-side. We will get to this later.

All ADS scripts are written in JavaScript (in particular: ECMAScript) and can access all built-in features of the ECMAScript API such as Objects, Arrays and String functions. Since ADS scripts execute server-side, they do not have access to browser-only APIs such as DOM manipulation and events.

In addition to the standard ECMAScript features, ADS scripts can access two sets of APIs:

  • The Core API contains the generic features to interact with any RDF graph and basic I/O functionality

  • The Auto-Generated APIs contain classes and functions that are based on the ontologies that are included by the currently active asset collection

Hint

Use the Script API Viewer panel to see the source code of all available ADS features. The Script Editors also provide auto-complete. For example, start typing graph. to see the most important functions of the Core API.

Core API

The ADS Core API is available for all asset collections. You can see the generated documentation for the Core API from the highlighted button of the Script Editor panel:

The Script Editor panel with the API button

The Script Editor supports auto-complete and has a button to open the Core API documentation

The main entry point into the API is the object graph which offers various functions to query the currently active graph and to construct other objects such as RDF nodes. It is important to understand how RDF nodes are represented in the ADS API:

Core classes of the ADS API

The core classes used by ADS to represent RDF nodes

Every node is an instance of GraphNode. RDF literals (strings, booleans, dates, etc.) are represented as instances of LiteralNode. You can query the lexical form of a literal using its field lex. All literals have a datatype property holding the URI of the RDF datatype, and a lang property with the language tag (usually an empty string unless the literal has datatype rdf:langString). All of those values are read-only properties.

You can use graph.node(...) and other convenience methods such as graph.langString(lex, lang) to produce new nodes if you need them. For example, you can use graph.literal(4.2) to produce an instance of LiteralNode with 4.2 as lexical form and datatype xsd:decimal. Or use graph.node({lex: "2020-06-01", datatype: xsd.date}) for an xsd:date literal. (In case you wondered, xsd.date is part of the automatically generated APIs for the XSD namespace.)

Non-literals are represented as instances of NamedNode and they all have a uri property holding the identifier of the node. (For the RDF geeks, we represent blank nodes also as named nodes, only with a URI consisting of _: and then the blank node identifier. Use .isBlankNode() and .isURI() if you need to distinguish between them.) URIs are also read-only, but you can construct new NamedNodes with graph.namedNode("...") or graph.namedNode({qname: "..."}) and with the factory methods from the generated namespaces explained below.

Tip

Keep in mind that you cannot simply use the JavaScript == operator on instances of these node types. This is because the same URI node may be represented by multiple JavaScript objects over the lifetime of your script. Instead, use nodeA.equals(nodeB) to compare them, or to find an item in an array.
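The following minimal sketch illustrates why identity comparison fails for nodes. NodeStub is a hypothetical stand-in written for this illustration so that the snippet runs outside of EDG; it is not the real NamedNode class, and only its uri property and equals method mirror what the text describes.

```javascript
// Hypothetical stand-in for an ADS named node; NOT the real NamedNode class.
class NodeStub {
    constructor(uri) {
        this.uri = uri;
    }
    // Two nodes are considered equal when they denote the same URI.
    equals(other) {
        return other instanceof NodeStub && other.uri === this.uri;
    }
}

// Two distinct JavaScript objects that represent the same URI node.
const a = new NodeStub('http://example.org/Canada');
const b = new NodeStub('http://example.org/Canada');

console.log(a == b);       // false: == compares object identity
console.log(a.equals(b));  // true: equals compares the nodes themselves

// Array.prototype.includes also relies on identity, so search with equals.
const nodes = [new NodeStub('http://example.org/Germany'), a];
const found = nodes.find(n => n.equals(b));
console.log(nodes.includes(b));  // false
console.log(found === a);        // true
```

The same pattern (find or some combined with equals) is a reasonable replacement wherever includes or indexOf would otherwise be used on arrays of nodes.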

NamedNode has various functions to query RDF property values. For example, you can use focusNode.values(skos.broader) to return an array of values for skos:broader of the focus node. However, you will typically prefer the much more readable properties from the generated APIs.

Note

Each ADS API can be used in read-only mode and read/write mode. Read-only mode does not contain functions and setters to modify the underlying graph and is therefore a “safer” choice if you just want to query. Use the “lock” button of the Script Editor to switch between these two modes.

Auto-Generated API

In EDG, domain experts define the required data model using ontologies and SHACL shapes. Internally, ADS generates a JavaScript API based on the system and user-defined SHACL shapes. For example, the SKOS ontology has a class skos_Concept with properties such as prefLabel and broader generated in the API. These properties can be used as getters and setters.

Node classes with skos:Concept

Additional subclasses of NamedNode are automatically generated from the Ontology

This auto-generated API can be used from all asset collections that include the corresponding ontologies. For example, if you have a Geography Taxonomy that owl:imports the SKOS SHACL shapes, then the skos_Concept features become available to ADS scripts.

A typical process includes:

  1. Domain experts/Information architects defining the required data models using ontologies and SHACL shapes;

  2. Internally, EDG generates the required JavaScript API (e.g. getters and setters for the defined properties);

  3. Developers can declaratively define custom methods that extend the automatically generated API and interact with the graphs stored in EDG.

The API generator will produce one JavaScript class for each named SHACL node shape (or class that is also a sh:NodeShape). The name of the class will be the prefix of the namespace plus underscore plus the local name of the class. For example, the RDFS class skos:Concept becomes skos_Concept in JavaScript. Each property shape for such node shapes is mapped to a JavaScript property based on the local name of the property that is the sh:path of the property shape. For example, skos:broader is mapped to a JavaScript property broader at the class skos_Concept. If no suitable local name exists (e.g. if it contains a dash or other unsuitable characters), or if you want to use a plural name such as broaderConcepts, you can specify a different name using the property graphql:name at the property shape.
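The naming rules above can be sketched as a few small functions. These helpers are hypothetical and written only for illustration; they are not part of ADS, and the behavior for unsuitable local names without a graphql:name (returning null here) is an assumption, since the text does not say what the generator does in that case.

```javascript
// Hypothetical helpers illustrating the naming rules; NOT part of the ADS API.

// Class name: namespace prefix + underscore + local name.
function generatedClassName(prefix, localName) {
    return `${prefix}_${localName}`;
}

// A local name can serve directly as a JavaScript property name only if it is
// a plain identifier (this sketch checks ASCII identifiers only).
function isUsableLocalName(localName) {
    return /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(localName);
}

// Property name: an explicit graphql:name takes precedence, otherwise the
// local name of the sh:path is used. Returns null when neither is usable
// (assumption: the real generator's behavior in that case is not shown here).
function generatedPropertyName(pathLocalName, graphqlName) {
    if (graphqlName) {
        return graphqlName;
    }
    return isUsableLocalName(pathLocalName) ? pathLocalName : null;
}

console.log(generatedClassName('skos', 'Concept'));               // skos_Concept
console.log(generatedPropertyName('broader'));                    // broader
console.log(generatedPropertyName('broader', 'broaderConcepts')); // broaderConcepts
console.log(generatedPropertyName('has-part'));                   // null
```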

Note

The properties of the generated JavaScript APIs are backed by getters and setters which means that their values are fetched from the RDF database when requested, and assignments of the JavaScript property will actually create new triples in the data graph.
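To make the idea of graph-backed properties concrete, here is a minimal sketch that uses a plain array as a stand-in for the data graph. ConceptSketch is hypothetical and is not the generated skos_Concept class. The sketch also assumes that an assignment replaces the existing values of the property, which the text above does not state explicitly.

```javascript
// In-memory stand-in for the data graph: [subject, predicate, object] entries.
const triples = [
    ['ex:Germany', 'skos:broader', 'ex:Europe'],
];

// Hypothetical sketch of a generated class; the real skos_Concept is produced
// by the ADS API generator and talks to the RDF database instead.
class ConceptSketch {
    constructor(uri) {
        this.uri = uri;
    }
    // Getter: values are looked up in the graph each time the property is read.
    get broader() {
        return triples
            .filter(([s, p]) => s === this.uri && p === 'skos:broader')
            .map(([, , o]) => o);
    }
    // Setter: assignment writes triples (assumption: old values are replaced).
    set broader(values) {
        for (let i = triples.length - 1; i >= 0; i--) {
            if (triples[i][0] === this.uri && triples[i][1] === 'skos:broader') {
                triples.splice(i, 1);
            }
        }
        for (const value of values) {
            triples.push([this.uri, 'skos:broader', value]);
        }
    }
}

const germany = new ConceptSketch('ex:Germany');
console.log(germany.broader);   // [ 'ex:Europe' ]
germany.broader = ['ex:EU'];    // looks like a field write, but edits the graph
console.log(triples);           // [ [ 'ex:Germany', 'skos:broader', 'ex:EU' ] ]
```

Because every read goes back to the graph, no value is cached on the JavaScript object itself.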

The API generator will also produce one JavaScript object for each selected namespace prefix, containing the classes, node shapes, properties and datatypes from that namespace. These prefix objects have automatically generated factory methods such as skos.asConcept(...) that can be used to conveniently create new instances. For example, use skos.asConcept(g.NS + 'Canada') to produce an instance of skos_Concept for the node with the given URI.

Note

A key feature of Active Data Shapes is polymorphism, which means that you can cast any named node into an instance of any other named node class. For example, you can use skos.asConcept(anyNode).broader to convert the given named node into an instance of skos_Concept and then fetch its broader concepts. The generated prefix objects also contain identifiers for any declared datatype, class, node shape or property from that namespace. For example, you can use xsd.string to access a NamedNode for the xsd:string datatype.
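The casting idea can be sketched with plain classes: the same URI is wrapped in different "views", each exposing the properties of one shape. All names ending in Sketch or View, the two factory objects, and the example data are hypothetical stand-ins that make the snippet runnable outside of EDG; in ADS the classes and factory methods come from the API generator.

```javascript
// Stand-in data: property values keyed by node URI (hypothetical example data).
const data = {
    'ex:Canada': { broader: ['ex:NorthAmerica'], areaKM: 9984670 },
};

// Hypothetical stand-ins for generated classes.
class NamedNodeSketch {
    constructor(uri) {
        this.uri = uri;
    }
}
class ConceptView extends NamedNodeSketch {
    get broader() {
        return (data[this.uri]?.broader ?? []).map(uri => new ConceptView(uri));
    }
}
class CountryView extends NamedNodeSketch {
    get areaKM() {
        return data[this.uri]?.areaKM;
    }
}

// Stand-ins for factory methods on prefix objects, such as skos.asConcept(...).
const skosSketch = { asConcept: node => new ConceptView(node.uri) };
const gSketch = { asCountry: node => new CountryView(node.uri) };

// The same node can be viewed through either class, depending on the need.
const anyNode = new NamedNodeSketch('ex:Canada');
console.log(skosSketch.asConcept(anyNode).broader[0].uri);  // ex:NorthAmerica
console.log(gSketch.asCountry(anyNode).areaKM > 200000);    // true
```

Casting in this sense only changes which properties are visible on the JavaScript object; the underlying node and its data stay the same.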

When you have created your own ontology and want the system to generate an API for your classes or your namespace, go to the “Home” asset of your Ontology and switch the form to the Script API page.

Script API generation options

Select the classes and namespaces for the API generation

On that Script API page, use:

  • generate class to select individual classes or shapes that shall be generated

  • generate prefix constants to include JavaScript constants for all classes, shapes, datatypes and properties from the namespaces, e.g. skos.broader

  • generate prefix classes to generate JavaScript classes for all classes and shapes from the given namespaces

Warning

Avoid generating huge APIs that consume too many server resources. Do not use generate prefix classes for namespaces containing thousands of classes or shapes.

Prefix objects such as skos will also contain JavaScript functions that mirror any declared SHACL function, making it easy to call most SPARQL functions with a familiar JavaScript syntax. As an example from the built-in functions, tosh:hasShape can be called using tosh.hasShape(...) from your ADS scripts. Any function that shall be included in the API needs to have a value for the property dash:apiStatus, e.g. dash:Stable. See Script-based Functions.

One big advantage of going through such domain-specific JavaScript APIs instead of a generic API is that you can benefit from the code completion and other IntelliSense features of your JavaScript editor, including the online editor bundled with TopBraid EDG. Using the API generated from the SHACL shapes, the editor will know in advance that the values of broader are again instances of skos_Concept and therefore can help you select the next operation on those values. Furthermore, if a shape declares that a certain property has sh:datatype xsd:string, then the API will directly return a native JavaScript string; likewise, it will return native booleans for xsd:boolean and native numbers for any numeric datatype such as xsd:integer and xsd:decimal. This means you can use familiar programming idioms to ask queries such as country.areaKM > 200000.
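The mapping from datatypes to native values can be sketched as follows. The toNative function is a hypothetical helper written for this illustration, not an ADS function, and the set of numeric datatypes it recognizes is only a subset chosen for brevity.

```javascript
const XSD = 'http://www.w3.org/2001/XMLSchema#';

// Datatypes treated as numeric in this sketch (a subset of the XSD numeric types).
const NUMERIC = new Set(['integer', 'decimal', 'double', 'float'].map(n => XSD + n));

// Hypothetical helper: convert a {lex, datatype} literal into a native value
// for the datatypes named in the text; anything else is returned unchanged.
function toNative(literal) {
    if (literal.datatype === XSD + 'string') {
        return literal.lex;
    }
    if (literal.datatype === XSD + 'boolean') {
        // xsd:boolean allows the lexical forms true, false, 1 and 0.
        return literal.lex === 'true' || literal.lex === '1';
    }
    if (NUMERIC.has(literal.datatype)) {
        return Number(literal.lex);
    }
    return literal;
}

const area = toNative({ lex: '357022', datatype: XSD + 'integer' });
console.log(typeof area);    // number
console.log(area > 200000);  // true
```

With native numbers, a comparison such as country.areaKM > 200000 needs no explicit parsing of the lexical form.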

Furthermore, the generated API can make deeper use of the SHACL shape definitions, for example to recognize that some property values should be sorted by their dash:index, meaning that arrays will maintain their order. Future versions may also perform basic constraint checks before a property value can be assigned.

Note

The JavaScript properties based on property shapes may use complex path expressions (including inverse paths) and even inferred values that are computed dynamically using sh:values rules. However, only non-inferred paths that consist of a simple predicate can be assigned using =; all others are read-only.

Important

Any RDF statements that are relevant to ADS code generation must be represented in either Ontology asset collections or Files in the workspace. It is not supported to store shape definitions or any other triples that impact the code generation in other asset collection types such as Taxonomies.

Shape Scripts

One way of extending the generated ADS APIs is to define shape scripts. This is basically a way to programmatically define extra functions associated with node shapes. These definitions are injected into the API for consumption by other ADS scripts. For example, we might want to define how to output the JSON structure of instances (focusNode) of a particular node shape/class for a particular API output.

Hint

Since shape scripts are injected into class definitions they access the current object via the variable this. So if you are developing and testing your features using the Script Editor panel, make sure to not accidentally use focusNode instead of this.

Shape scripts are defined directly on ontologies. EDG provides a Shape Script editor panel. When you select a node shape, the panel is auto-populated with the class header, inside which any definitions for the selected node shape are implemented.

TopBraid EDG Shape Script Panel

TopBraid EDG Shape Script Panel in the Ontologies asset collection

On saving, EDG will automatically create an instance of type dash:ShapeScript which is attached to the selected node shape. This will ensure that the next time the node shape is selected, the defined function is shown in the shape script panel.

Shape Script Examples

In the first two examples we show how we can define custom convenience functions that can be used in other shape scripts (as we will demonstrate later) and ADS functions.

class extends skos_Concept {
    /**
     * A convenience function that converts a prefLabel to string
     * Param: prefLabel resource (if known) - e.g. pref label for a particular language
     * or null
     */
    prefLabelToString(prefLabel){
        return prefLabel ? prefLabel.lex : this.prefLabel.map(label => label.lex)
    }
}

The function prefLabelToString(prefLabel) is defined on the skos:Concept node shape. By default, when accessing the skos:prefLabel of the current focus node, ADS will return an array of prefLabel nodes, containing the lexical value and the language. This function has two options: (a) if a particular prefLabel node - say a pref label for a particular language - is passed as a parameter, the function will return a string with the lexical value of the prefLabel; (b) if no parameter is passed, then the shape script will return a flat array, where the items of the array will be just a string representation of the lexical values.

Diving deeper in the code snippet, we use this.prefLabel.map(...). This means that properties available to the current focus node (which in this case would be an instance of skos:Concept) are also available in this custom defined function. Furthermore, any instances whose node shape is a subclass of skos:Concept will have the function prefLabelToString(prefLabel) available. This happens either through inheritance or, if the class in the ontology has multiple parents, by re-declaration.
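To see both behaviors of prefLabelToString outside of EDG, the method can be tried against a small stand-in. The skos_Concept class below is a hypothetical stub, not the generated class, and labels are modeled as plain {lex, lang} objects; the subclass is given a name because a standalone script cannot use the anonymous class header of a shape script.

```javascript
// Hypothetical stub for the generated skos_Concept class, so that the method
// can run outside of EDG. Labels are plain {lex, lang} objects here.
class skos_Concept {
    constructor(uri, prefLabel) {
        this.uri = uri;
        this.prefLabel = prefLabel;
    }
}

// Same method body as in the shape script, placed on a named subclass.
class ConceptWithHelpers extends skos_Concept {
    prefLabelToString(prefLabel) {
        return prefLabel ? prefLabel.lex : this.prefLabel.map(label => label.lex);
    }
}

const germany = new ConceptWithHelpers('ex:Germany', [
    { lex: 'Germany', lang: 'en' },
    { lex: 'Deutschland', lang: 'de' },
]);

// (a) A specific label is passed: the result is a single string.
console.log(germany.prefLabelToString(germany.prefLabel[1])); // Deutschland

// (b) No argument: the result is an array of all lexical forms.
console.log(germany.prefLabelToString()); // [ 'Germany', 'Deutschland' ]
```

Note that the return type differs between the two cases (string versus array), so callers need to know which form they requested.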

Here is another example shape script:

class extends g_GeoConcept {
    /**
     * A function that returns the pref label of a geo concept
     * for a specific language.
     */
    prefLabelByLang(lang){
        return this.prefLabel.find(label => label.lang == lang)
    }
}

Once these shape scripts are defined, we can use them in the Script Editor panel in other asset collections where the ontology is included. In this case, we’ll be using the Geography Taxonomy asset collection. The script editor picks up the newly defined functions:

ADS Shape Script Example

Using defined shape scripts in ADS - Intellisense

Note

Every time shape scripts or any other ADS functions are modified in the ontology, we need to refresh the ADS API. This can be done by clicking the refresh button on any ADS panel.

The next two examples demonstrate the use of the previously defined shape scripts (prefLabelByLang(...) and prefLabelToString(...)) in our custom toJSON(...) function for the City and the Country node shapes respectively.

class extends g_City {
    /**
     * toJson Function for Cities
     */
    toJSON(lang){
        return {
            uri: this.uri,
            label: lang ? this.prefLabelByLang(lang).lex : this.prefLabelToString(),
        }
    }
}

class extends g_Country {
    /**
     * toJson Function for Countries
     */
    toJSON(lang){
        return {
            uri: this.uri,
            label: lang ? this.prefLabelByLang(lang).lex : this.prefLabelToString(),
            region: this.related.map(rel => { return { uri: rel.uri } }),
            cities: this.narrower.map(city => g.asCity(city).toJSON(lang))
        }
    }
}

The figure below depicts the output of the toJSON() method for the country instance Germany:

ADS Shape Script Example and Output

Using defined shape scripts in ADS - Output
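One detail worth noting in the toJSON examples: the expression this.prefLabelByLang(lang).lex would throw a TypeError if no label exists for the requested language, because find returns undefined in that case. The following sketch shows one possible guard that falls back to all labels. GeoConceptSketch is a hypothetical stub that combines the three methods so the snippet runs outside of EDG; the fallback strategy is a design assumption, not behavior documented for ADS.

```javascript
// Hypothetical stub for a generated class, so the sketch runs outside of EDG.
class GeoConceptSketch {
    constructor(uri, prefLabel) {
        this.uri = uri;
        this.prefLabel = prefLabel; // array of {lex, lang} objects
    }
    prefLabelByLang(lang) {
        return this.prefLabel.find(label => label.lang == lang);
    }
    prefLabelToString(prefLabel) {
        return prefLabel ? prefLabel.lex : this.prefLabel.map(label => label.lex);
    }
    // Variant of toJSON that falls back to all labels when none matches lang.
    toJSON(lang) {
        const match = lang ? this.prefLabelByLang(lang) : undefined;
        return {
            uri: this.uri,
            label: match ? match.lex : this.prefLabelToString(),
        };
    }
}

const berlin = new GeoConceptSketch('ex:Berlin', [{ lex: 'Berlin', lang: 'en' }]);
console.log(berlin.toJSON('en')); // { uri: 'ex:Berlin', label: 'Berlin' }
console.log(berlin.toJSON('fr')); // { uri: 'ex:Berlin', label: [ 'Berlin' ] }
```

Whether a fallback, an empty value, or an error is the right response to a missing language depends on the consumers of the JSON output.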

Included Scripts

Instances of the class dash:IncludedScript are another technique to extend the generated ADS APIs. The code associated with instances of this class is injected into the generated APIs as global code snippets. Included scripts are typically used to declare libraries of utility functions or constants that, unlike shape scripts, are not necessarily associated with specific classes or shapes.

Note that the JavaScript code stored in dash:js cannot use the export keyword because the code must also work in external scripts (such as on Node.js). Instead, if you plan to use ADS from Node.js (see Using ADS from Node.js), you need to enumerate the exported symbols via dash:exports.
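As a rough illustration, the body of an included script could consist of plain top-level declarations such as the ones below. The names MAX_LABEL_LENGTH and truncateLabel are hypothetical and were chosen only for this example.

```javascript
// Hypothetical utility library as it might be stored in dash:js.
// Plain top-level declarations only: no export keyword.
const MAX_LABEL_LENGTH = 20;

// Shortens a label to at most MAX_LABEL_LENGTH characters,
// marking the cut with three dots.
function truncateLabel(label) {
    if (label.length <= MAX_LABEL_LENGTH) {
        return label;
    }
    return label.slice(0, MAX_LABEL_LENGTH - 3) + '...';
}

console.log(truncateLabel('Canada'));
console.log(truncateLabel('Federal Republic of Germany'));
```

For use from Node.js, the two symbols in this sketch would then be enumerated via dash:exports rather than exported in the code itself.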

Architecture of ADS

For those who want to get a better understanding of how ADS works, the diagram below illustrates the runtime architecture.


Runtime Architecture of Active Data Shapes

The base of the architecture, at the bottom, are the RDF graphs. In read/write mode, those graphs will be wrapped with a so-called DiffGraph that will collect the changes until they are committed at the end of a script. Access to the RDF graphs happens internally from a dedicated ADS Session object through the Apache Jena API. The actual ADS scripts are executed by the GraalVM engine.
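The idea of collecting changes until commit can be sketched in a few lines. DiffGraphSketch is a hypothetical illustration of the general technique and makes no claim about how the actual DiffGraph is implemented; triples are modeled as plain strings for simplicity.

```javascript
// Hypothetical sketch of the diff idea; NOT the actual DiffGraph implementation.
// Triples are represented as strings such as 'ex:a ex:p ex:b' for simplicity.
class DiffGraphSketch {
    constructor(baseTriples) {
        this.base = baseTriples;   // Set of triples in the underlying graph
        this.added = new Set();    // pending additions
        this.removed = new Set();  // pending removals
    }
    add(triple) {
        this.removed.delete(triple);
        if (!this.base.has(triple)) {
            this.added.add(triple);
        }
    }
    remove(triple) {
        this.added.delete(triple);
        if (this.base.has(triple)) {
            this.removed.add(triple);
        }
    }
    // Reads see the base graph with the pending changes applied.
    has(triple) {
        return this.added.has(triple) ||
            (this.base.has(triple) && !this.removed.has(triple));
    }
    // Commit writes the collected changes to the base graph in one step.
    commit() {
        for (const t of this.removed) this.base.delete(t);
        for (const t of this.added) this.base.add(t);
        this.added.clear();
        this.removed.clear();
    }
}

const base = new Set(['ex:Germany skos:broader ex:Europe']);
const diff = new DiffGraphSketch(base);
diff.add('ex:Berlin skos:broader ex:Germany');
console.log(diff.has('ex:Berlin skos:broader ex:Germany')); // true
console.log(base.size); // 1: the base graph is untouched before commit
diff.commit();
console.log(base.size); // 2
```

A benefit of this design is that a script that fails halfway leaves the base graph unchanged, since nothing is written before commit.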

The Core API and the Generated APIs constitute the available runtime APIs. From within TopBraid EDG, ADS scripts are executed from numerous places including inference rules, Explore/Modify actions and web services.

From Node.js, the ADS API can either be used on the Node.js platform, or through a function injection mechanism directly on the EDG server. From Web and React applications running in a web browser, ADS snippets can be sent to the EDG server for server-side execution. These topics will be covered in Using ADS from Node.js and Using ADS from Web Applications and React.

The table below illustrates the main differences between an EDG implementation, a Web/React components implementation, and a Node.js implementation:

| EDG Implementation | Web/React Components Implementation | Node.js Implementation |
| --- | --- | --- |
| Can attach script to Explore and Modify drop down buttons in forms. | Allows the creation of custom panels that are installed in EDG. | A complete independent application with its own server, which connects to EDG via the ADS broker. |
| EDG automatically creates dialog boxes and UI components for parameters. | Requires the implementation of any extra UI components within the panel and EDG. | UI is implemented and handled in the Node.js application and is not integrated within EDG. |
| Permissions based on the user logged in EDG. | Permissions based on the user logged in EDG. | Additional permissions required for the Node.js application. |
| Limited use of third-party libraries. | Can use third-party client libraries. | Can use third-party Node.js libraries. |
| Scripts (ADS and JavaScript functions) are executed on the same EDG back end/server (GraalVM). | ADS functions are executed on the same EDG back end/server (GraalVM), whilst the rest of the JavaScript functions are executed on the client (through the React ecosystem). | ADS functions are installed and executed on the EDG back end/server (GraalVM), whilst the rest of the JavaScript functions are executed on the server where the Node.js application resides. |
| Implemented and stored within the EDG workspace. | Implemented externally and installed in EDG as a project. | External application. |
| No communication overhead. | Has slight communication overhead between EDG and the React-based UI. | Communication overhead between Node.js and EDG. However, this is compensated by complex algorithms being executed outside of the EDG domain. |