.. include:: /includes.rst.txt

.. _scripting_introduction:

Introduction to ADS
===================

*Active Data Shapes* (ADS) is an innovative scripting technology that enables people with JavaScript skills to interact with the knowledge graphs stored within the EDG asset collections. ADS closely integrates declarative graph schema definitions (using SHACL shapes) with imperative scripting language code by means of an API generator that mirrors the shapes of the graph data as JavaScript classes. Using familiar design patterns, Active Data Shapes may significantly lower the cost of entry into graph technology for "mainstream" developers, while greatly enhancing the productivity of graph technology power users.

Let's look at some basic concepts of how ADS works.

.. note::
   ADS scripts generally execute server-side on the TopBraid platform. This means that ADS scripts can very quickly query and modify the data stored in the graphs.

ADS scripts can be edited with and executed from TopBraid EDG, in particular through the Script Editor panel. TopBraid provides various mechanisms to store ADS scripts alongside the ontologies so that they can be executed repeatedly, for example as web services.

.. warning::
   ADS is extremely powerful. Like with any full programming language, ADS scripts may contain errors that cause infinite loops and other nasty surprises. Scripts should only be edited by experts and be developed in a safe sandbox environment such as TopBraid EDG Studio. Furthermore, the administrator can disable editing ADS scripts for certain governance roles on the EDG server configuration parameters page. The JavaScript/ADS experts in your organization can prepare well-tested scripts for all other users.

ADS scripts may also be executed on the Node.js platform. In that role, ADS serves as a client API for external Node.js applications to interact with TopBraid EDG.
Finally, ADS code may be embedded into Web applications to simplify the development of components that execute in the browser. However, even in that role, ADS scripts are always executed server-side. We will get to this later.

All ADS scripts are written in JavaScript (in particular: ECMAScript) and can access all built-in features of the ECMAScript API such as Objects, Arrays and String functions. Since ADS scripts are executing server-side, they *do not* have access to client-side JavaScript libraries such as DOM manipulation and events.

In addition to the standard ECMAScript features, ADS scripts can access two sets of APIs:

* The Core API contains the generic features to interact with any RDF graph and basic I/O functionality
* The Auto-Generated APIs contain classes and functions that are based on the ontologies that are included by the currently active asset collection

.. hint::
   Use the Script API Viewer panel to see the source code of all available ADS features. The Script Editors also provide auto-complete. For example, start typing `graph.` to see the most important functions of the Core API.

.. _scripting_core_api:

Core API
--------

The ADS Core API is available for all asset collections. You can see the generated documentation for the Core API from the highlighted button of the Script Editor panel:

.. figure:: _images/ScriptEditorAPIButton.png
   :alt: The Script Editor panel with the API button
   :align: center

   **The Script Editor supports auto-complete and has a button to open the Core API documentation**

The main entry point into the API is the object `graph` which offers various functions to query the currently active graph and to construct other objects such as RDF nodes.

It is important to understand how RDF nodes are represented in the ADS API:

.. figure:: _images/CoreNodeClasses.png
   :alt: Core classes of the ADS API
   :align: center

   **The core classes used by ADS to represent RDF nodes**

Every node is an instance of `GraphNode`.
RDF literals (strings, booleans, dates etc.) are represented as instances of `LiteralNode`. You can query the lexical form of a literal using its field `lex`. All literals have a `datatype` property holding the URI of the RDF datatype, and a `lang` property with the language tag (usually an empty string unless the literal has datatype `rdf:langString`). All of those values are read-only properties.

You can use `graph.node(...)` and other convenience methods such as `graph.langString(lex, lang)` to produce new nodes if you need them. For example, you can use `graph.literal(4.2)` to produce an instance of `LiteralNode` with `4.2` as lexical form and datatype `xsd:decimal`. Or use `graph.node({lex: "2020-06-01", datatype: xsd.date})` for an `xsd:date` literal. (In case you wondered, `xsd.date` is part of the automatically generated APIs for the XSD namespace.)

Non-literals are represented as instances of `NamedNode` and they all have a `uri` property holding the identifier of the node. (For the RDF geeks, we represent blank nodes also as named nodes, only with a URI consisting of `_:` and then the blank node identifier. Use `.isBlankNode()` and `.isURI()` if you need to distinguish between them.) URIs are also read-only, but you can construct new `NamedNode` instances with `graph.namedNode("...")` or `graph.namedNode({qname: "..."})` and with the factory methods from the generated namespaces explained below.

.. tip::
   Keep in mind that you cannot simply use JavaScript `==` operators on instances of these node types. This is because the same URI node may be represented by multiple JavaScript objects over the lifetime of your script. Instead, use `nodeA.equals(nodeB)` to compare them, or to find an item in an array.

`NamedNode` has various functions to query RDF property values. For example, you can use `focusNode.values(skos.broader)` to return an array of values for `skos:broader` of the focus node.
However, typically you may prefer to go through the much more readable properties from the generated APIs.

.. note::
   Each ADS API can be used in read-only mode and read/write mode. Read-only mode does not contain functions and setters to modify the underlying graph and is therefore a "safer" choice if you just want to query. Use the "lock" button of the Script Editor to switch between these two modes.

.. _scripting_auto_generated_API:

Auto-Generated API
------------------

In EDG, domain experts define the required data model using ontologies and SHACL shapes. Internally, ADS generates a JavaScript API based on the system and user-defined SHACL shapes. For example, the SKOS ontology has a class `skos_Concept` with properties such as `prefLabel` and `broader` generated in the API. These properties can be used as getters and setters.

.. figure:: _images/NodeClassesWithSkosConcept.png
   :alt: Node classes with skos:Concept
   :align: center

   **Additional subclasses of NamedNode are automatically generated from the Ontology**

This auto-generated API can be used from all asset collections that include the corresponding ontologies. For example, if you have a Geography Taxonomy that `owl:imports` the SKOS SHACL shapes, then the `skos_Concept` features become available to ADS scripts.

A typical *process* includes:

1. Domain experts/Information architects define the required data models using ontologies and SHACL shapes;
2. Internally, EDG generates the required JavaScript API (e.g. getters and setters for the defined properties);
3. Developers can declaratively define custom methods that extend the automatically generated API and interact with the graphs stored in EDG.

The API generator will produce one JavaScript class for each named SHACL node shape (or class that is also a `sh:NodeShape`). The name of the class will be the prefix of the namespace plus underscore plus the local name of the class.
For example, the RDFS class `skos:Concept` becomes `skos_Concept` in JavaScript.

Each property shape for such node shapes is mapped to a JavaScript property based on the local name of the property that is the `sh:path` of the property shape. For example, `skos:broader` is mapped to a JavaScript property `broader` at the class `skos_Concept`. If no suitable local name exists (e.g. if it contains a dash or other unsuitable characters), or if you want to use a plural name such as `broaderConcepts`, you can specify a different name using the property `graphql:name` at the property shape.

.. note::
   The properties of the generated JavaScript APIs are backed by *getters* and *setters* which means that their values are fetched from the RDF database when requested, and assignments of the JavaScript property will actually create new triples in the data graph.

The API generator will also produce one JavaScript object for selected namespace prefixes that contains either classes, node shapes, properties or datatypes. These prefix objects have automatically generated factory methods such as `skos.asConcept(...)` that can be used to conveniently create new instances. For example use `skos.asConcept(g.NS + 'Canada')` to produce an instance of `skos_Concept` for the node with the given URI.

.. note::
   A key feature of Active Data Shapes is polymorphism, which means that you can cast any named node into an instance of any other named node class. For example, you can use `skos.asConcept(anyNode).broader` to convert the given named node into an instance of `skos_Concept` and then fetch its broader concepts.

The generated prefix objects also contain identifiers for any declared datatype, class, node shape or property from that namespace. For example, you can use `xsd.string` to access a `NamedNode` for the `xsd:string` datatype.

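
The mechanics described in this section (a class named after prefix and local name, properties backed by getters and setters, a factory that casts any node, and comparison with `equals`) can be imitated in plain JavaScript. The following is a minimal sketch for illustration only and is not how ADS is implemented: `TinyGraph`, `TinyNode`, `asConcept`, the namespace `NS` and this stand-in `skos_Concept` are all hypothetical names, and the setter is assumed to replace existing values.

.. code-block:: js

   // Illustrative sketch only: TinyGraph, TinyNode and this skos_Concept are
   // hypothetical stand-ins and not part of the ADS API.
   class TinyGraph {
     constructor() {
       this.triples = []; // each entry is [subject, predicate, object]
     }
     add(s, p, o) {
       this.triples.push([s, p, o]);
     }
     removeAll(s, p) {
       this.triples = this.triples.filter(t => !(t[0] === s && t[1] === p));
     }
     objects(s, p) {
       return this.triples.filter(t => t[0] === s && t[1] === p).map(t => t[2]);
     }
   }

   class TinyNode {
     constructor(graph, uri) {
       this.graph = graph;
       this.uri = uri;
     }
     // Compare by URI, because several objects may stand for the same node.
     equals(other) {
       return other instanceof TinyNode && other.uri === this.uri;
     }
   }

   // Class name follows the rule "prefix + underscore + local name".
   class skos_Concept extends TinyNode {
     // Getter: values are looked up in the store on every access.
     get broader() {
       return this.graph
         .objects(this.uri, 'skos:broader')
         .map(uri => new skos_Concept(this.graph, uri));
     }
     // Setter (assumed semantics): assignment replaces the stored triples.
     set broader(concepts) {
       this.graph.removeAll(this.uri, 'skos:broader');
       concepts.forEach(c => this.graph.add(this.uri, 'skos:broader', c.uri));
     }
   }

   const tinyGraph = new TinyGraph();
   const NS = 'http://example.org/geo#'; // made-up namespace

   // Plays the role of a factory such as skos.asConcept(...): takes a URI or a node.
   function asConcept(nodeOrUri) {
     const uri = typeof nodeOrUri === 'string' ? nodeOrUri : nodeOrUri.uri;
     return new skos_Concept(tinyGraph, uri);
   }

   const canada = asConcept(NS + 'Canada');
   canada.broader = [asConcept(NS + 'NorthAmerica')]; // writes one triple

   const plainNode = new TinyNode(tinyGraph, NS + 'Canada');
   const parents = asConcept(plainNode).broader; // cast, then read via the getter

   const sameNode = parents[0].equals(asConcept(NS + 'NorthAmerica')); // true
   const sameObject = parents[0] === asConcept(NS + 'NorthAmerica');   // false

In this sketch every read of `broader` creates fresh wrapper objects, which is why the identity comparison fails while `equals` succeeds. In a real ADS script the classes and factories come from the generated API, so none of the scaffolding above is needed.
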

When you have created your own ontology and want the system to generate an API for your classes or your namespace, go to the "Home" asset of your Ontology and switch the form to the *Script API* page.

.. figure:: _images/GenerateScriptAPIOptions.png
   :alt: Script API generation options
   :align: center

   **Select the classes and namespaces for the API generation**

On that Script API page, use:

* *generate class* to select individual classes or shapes that shall be generated
* *generate prefix constants* to include JavaScript constants for all classes, shapes, datatypes and properties from the namespaces, e.g. `skos.broader`
* *generate prefix classes* to generate JavaScript classes for all classes and shapes from the given namespaces

.. warning::
   Avoid generating huge APIs that consume too many server resources. Do not use *generate prefix classes* for namespaces containing thousands of classes or shapes.

Prefix objects such as `skos` will also contain JavaScript functions that mirror any declared SHACL function, making it easy to call most SPARQL functions with a familiar JavaScript syntax. As an example from the built-in functions, `tosh:hasShape` can be called using `tosh.hasShape(...)` from your ADS scripts. Any function that shall be included in the API needs to have a value for the property `dash:apiStatus`, e.g. `dash:Stable`. See :ref:`ext_functions`.

One big advantage of going through such domain-specific JavaScript APIs instead of a generic API is that you can benefit from the code completion and other IntelliSense features of your JavaScript editor, including the online editor bundled with TopBraid EDG. Using the API generated from the SHACL shapes, the editor will know in advance that the values of `broader` are again instances of `skos_Concept` and therefore can help you select the next operation on those values.
Furthermore, if a shape declares that a certain property has `sh:datatype xsd:string` then the API will directly return a native JavaScript string, it will return native booleans for `xsd:boolean` and produce native numbers for any numeric datatype such as `xsd:integer` and `xsd:decimal`. This means you can use familiar programming idioms to ask queries such as `country.areaKM > 200000`.

Furthermore, the generated API can make deeper use of the SHACL shape definitions, for example to recognize that some property values should be sorted by their `dash:index`, meaning that arrays will maintain their order. Future versions may also perform basic constraint checks before a property value can be assigned.

.. note::
   The JavaScript properties based on property shapes may use complex path expressions (including inverse paths) and even inferred values that are computed dynamically using `sh:values` rules. However, only non-inferred paths that consist of a simple predicate can be assigned using `=`; the others are read-only.

.. important::
   Any RDF statements that are relevant to ADS code generation must be represented in either Ontology asset collections or Files in the workspace. It is not supported to store shape definitions or any other triples that impact the code generation in other asset collection types such as Taxonomies.

.. _scripting_shape_scripts:

Shape Scripts
-------------

One way of extending the generated ADS APIs is to define *shape scripts*. This is basically a way to programmatically define extra functions associated with node shapes. These definitions are injected into the API for consumption by other ADS scripts. For example, we might want to define how to output the JSON structure of instances (`focusNode`) of a particular node shape/class for a particular API output.

.. hint::
   Since shape scripts are injected into class definitions they access the current object via the variable `this`. So if you are developing and testing your features using the Script Editor panel, make sure to not accidentally use `focusNode` instead of `this`.

Shape scripts are defined directly on ontologies. EDG provides a Shape Script editor panel. When a node shape is selected, the panel is auto-populated with the class header, within which any definitions for the selected node shape are implemented.

.. figure:: _images/edg_shape_script_panel.png
   :alt: TopBraid EDG Shape Script Panel
   :align: center
   :class: edg-figure-xl

   **TopBraid EDG Shape Script Panel in the Ontologies asset collection**

On saving, EDG will automatically create an instance of type `dash:ShapeScript` which is attached to the selected node shape. This will ensure that the next time the node shape is selected, the defined function is shown in the shape script panel.

.. index::
   pair: Shape Scripts ; Examples

Shape Script Examples
^^^^^^^^^^^^^^^^^^^^^

In the first two examples we show how we can define custom convenience functions that can be used in other shape scripts (as we will demonstrate later) and ADS functions.

.. literalinclude:: _code/_shape_scripts/geography_ontology_shape_scrips.js
   :linenos:
   :language: js
   :lines: 1-10
   :emphasize-lines: 7-9

The function `prefLabelToString(prefLabel)` is defined on the `skos:Concept` node shape. By default, when accessing the `skos:prefLabel` of the current focus node, ADS will return an array of prefLabel nodes, containing the lexical value and the language. This function has two options: (a) if a particular prefLabel node (say, a pref label for a particular language) is passed as a parameter, the function will return a string with the lexical value of the prefLabel; (b) if no parameter is passed, then the shape script will return a flat array whose items are just string representations of the lexical values.

Diving deeper in the code snippet, we use `this.prefLabel.map(...)`.
This means that properties available to the current focus node (which in this case would be an instance of `skos:Concept`) are also available in this custom defined function. Furthermore, any instances whose node shape is a subclass of `skos:Concept` will have the function `prefLabelToString(prefLabel)` available. This happens either through inheritance or, if the class in the ontology has multiple parents, by re-declaration.

Here is another example shape script:

.. literalinclude:: _code/_shape_scripts/geography_ontology_shape_scrips.js
   :linenos:
   :language: js
   :lines: 12-20
   :emphasize-lines: 6-8

Once these shape scripts are defined, we can use them in the Script Editor panel in other asset collections where the ontology is included. In this case, we'll be using the Geography Taxonomy asset collection. The script editor picks up the newly defined functions:

.. figure:: _images/ads_shape_script_example_1.png
   :alt: ADS Shape Script Example
   :align: center
   :class: edg-figure

   **Using defined shape scripts in ADS - Intellisense**

.. note::
   Every time shape scripts or any other ADS functions are modified in the ontology, we need to refresh the ADS API. This can be done by clicking the refresh button on any ADS panel.

The next two examples demonstrate the use of the previously defined shape scripts (`prefLabelByLang(...)` and `prefLabelToString(...)`) in our custom `toJSON(...)` function for the `City` and the `Country` node shapes respectively.

.. literalinclude:: _code/_shape_scripts/geography_ontology_shape_scrips.js
   :linenos:
   :language: js
   :lines: 22-32
   :emphasize-lines: 5-10

.. literalinclude:: _code/_shape_scripts/geography_ontology_shape_scrips.js
   :linenos:
   :language: js
   :lines: 34-46
   :emphasize-lines: 5-12

The figure below depicts the output of the `toJSON()` method for the country instance Germany:

.. figure:: _images/ads_shape_script_example_2.png
   :alt: ADS Shape Script Example and Output
   :align: center
   :class: edg-figure-xl

   **Using defined shape scripts in ADS - Output**

.. _scripting_included_scripts:

Included Scripts
----------------

Instances of the class `dash:IncludedScript` are another way to extend the generated ADS APIs. The code associated with instances of this class is injected into the generated APIs as global code snippets. Included scripts are typically used to declare libraries of utility functions or constants that, unlike shape scripts, are not necessarily associated with specific classes or shapes.

Note that the JavaScript code stored in `dash:js` cannot use the `export` keyword because the code must also work in external scripts (such as on Node.js). Instead, if you plan to use :ref:`scripting_nodejs`, you need to enumerate the exported symbols via `dash:exports`.

Architecture of ADS
-------------------

For those who want to get a better understanding of how ADS works, the diagram below illustrates the runtime architecture.

.. figure:: _images/ADS-Architecture-Tree.png
   :align: center

   **Runtime Architecture of Active Data Shapes**

The base of the architecture, at the bottom, consists of the RDF graphs. In read/write mode, those graphs will be wrapped with a so-called DiffGraph that will collect the changes until they are committed at the end of a script. Access to the RDF graphs happens internally from a dedicated ADS Session object through the Apache Jena API. The actual ADS scripts are executed by the GraalVM engine. The Core API and the Generated APIs constitute the available runtime APIs.

From within TopBraid EDG, ADS scripts are executed from numerous places including inference rules, Explore/Modify actions and web services. From Node.js, the ADS API can be either used on the Node.js platform, or through a function injection mechanism directly on the EDG server.
From Web and React applications running in a web browser, ADS snippets can be sent to the EDG server for server-side execution. These topics will be covered in :ref:`scripting_nodejs` and :ref:`scripting_web`.

The table below illustrates the main differences between an EDG implementation, a Web/React components implementation, and a Node.js implementation:

.. table::
   :widths: 33 33 33
   :class: tight-table

   +----------------------+----------------------+----------------------+
   | EDG Implementation   | Web/React Components | Node.js              |
   |                      | Implementation       | Implementation       |
   +======================+======================+======================+
   | Can attach script to | Allows the creation  | A completely         |
   | Explore and Modify   | of custom panels     | independent          |
   | drop down buttons in | that are installed   | application with its |
   | forms.               | in EDG.              | own server, which    |
   |                      |                      | connects to EDG via  |
   |                      |                      | the ADS broker.      |
   +----------------------+----------------------+----------------------+
   | EDG automatically    | Requires the         | UI is implemented    |
   | creates dialog boxes | implementation of    | and handled in the   |
   | and UI components    | any extra UI         | Node.js application  |
   | for parameters.      | components within    | and is not           |
   |                      | the panel and EDG.   | integrated within    |
   |                      |                      | EDG.                 |
   +----------------------+----------------------+----------------------+
   | Permissions based on | Permissions based on | Additional           |
   | the user logged in   | the user logged in   | permissions required |
   | to EDG.              | to EDG.              | for the Node.js      |
   |                      |                      | application.         |
   +----------------------+----------------------+----------------------+
   | Limited use of       | Can use third-party  | Can use third-party  |
   | third-party          | client libraries.    | Node.js libraries.   |
   | libraries.           |                      |                      |
   +----------------------+----------------------+----------------------+
   | Scripts (ADS and     | ADS functions are    | ADS functions are    |
   | JavaScript           | executed on the same | installed and        |
   | functions) are       | EDG backend/server   | executed on the EDG  |
   | executed on the same | (GraalVM), whilst    | backend/server       |
   | EDG backend/server   | the rest of the      | (GraalVM), whilst    |
   | (GraalVM).           | JavaScript functions | the rest of the      |
   |                      | are executed on the  | JavaScript functions |
   |                      | client (through the  | are executed on the  |
   |                      | React ecosystem).    | server where the     |
   |                      |                      | Node.js application  |
   |                      |                      | resides.             |
   +----------------------+----------------------+----------------------+
   | Implemented and      | Implemented          | External             |
   | stored within the    | externally and       | application.         |
   | EDG workspace.       | installed in EDG as  |                      |
   |                      | a project.           |                      |
   +----------------------+----------------------+----------------------+
   | No communication     | Has slight           | Communication        |
   | overhead.            | communication        | overhead between     |
   |                      | overhead between EDG | Node.js and EDG.     |
   |                      | and the React-based  | However, this is     |
   |                      | UI.                  | compensated by       |
   |                      |                      | complex algorithms   |
   |                      |                      | being executed       |
   |                      |                      | outside of the EDG   |
   |                      |                      | domain.              |
   +----------------------+----------------------+----------------------+