Exporting Data

TopBraid EDG supports the export of metadata and data to JSON-LD, RDF/XML, N-Triples, Turtle, Turtle+, TriG, JSON, CSV, TSV and XML through pre-built functions, a SPARQL endpoint and a GraphQL endpoint.
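As a rough illustration of exporting through the SPARQL endpoint, the sketch below builds a standard SPARQL 1.1 Protocol POST request and saves the response to a file. The server URL and endpoint path are placeholders rather than values confirmed by this guide, the helper names are hypothetical, and authentication is omitted; adapt all of these to your installation.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder address. Check your EDG installation for the
# actual SPARQL endpoint URL and the authentication it requires.
SPARQL_ENDPOINT = "https://edg.example.org/edg/tbl/sparql"

# Standard media types for SPARQL SELECT results.
RESULT_FORMATS = {
    "csv": "text/csv",
    "tsv": "text/tab-separated-values",
    "json": "application/sparql-results+json",
    "xml": "application/sparql-results+xml",
}


def build_export_request(query: str, fmt: str = "csv",
                         endpoint: str = SPARQL_ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol POST request asking for results in `fmt`."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": RESULT_FORMATS[fmt],
        },
    )


def export_to_file(query: str, path: str, fmt: str = "csv") -> None:
    """Send the query and write the raw response body to `path`."""
    with urllib.request.urlopen(build_export_request(query, fmt)) as response:
        with open(path, "wb") as out:
            out.write(response.read())
```

Calling `export_to_file("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 100", "sample.csv")` would then save the first 100 result rows as CSV, provided the endpoint is reachable and accepts the request.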

Search and the SPARQL Query and SPARQL Results panels support the export of subsets of the data in an asset collection based on custom criteria and sorting. These panels provide fine-grained control over the selection of data. Menus in these panels offer several choices to export the results into spreadsheet-compatible formats (e.g., for Excel) as well as other formats.

Bulk export to AWS S3 can be done through the Basket (see Bookmarking Asset Collections and Assets).

Export Collection as an RDF File

Export <collection type> as an RDF File is available under the Export tab.

These operations support exporting a collection’s data in a standard RDF serialization format.

TopBraid EDG Export Taxonomy Options


Sorted Turtle includes an extension that supports the TopBraid reification capabilities. Reified statements in EDG are converted to standard RDF Statements in the exported file.

Sorted Turtle+ is an extension to the standard Sorted Turtle that supports TopBraid’s reification capabilities known as “RDF-star” or “RDF*”. In this case, the serialization is Turtle+ and requires a Turtle+ parser in order to import the file.

See also

RDF-star is not yet an official W3C standard, but work on the specification is well under way. TopQuadrant is one of the contributors to the specification document. In the meantime, many RDF technology vendors, including TopQuadrant, have already implemented support for this concept.
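To make the difference between the two reification styles concrete, the sketch below renders the same annotated statement twice: once with the standard `rdf:Statement` vocabulary and once with the quoted-triple syntax from the RDF-star drafts. The function names, the `ex:` terms and the blank node label are hypothetical, prefix declarations are left out, and the exact text EDG writes for Sorted Turtle or Sorted Turtle+ may differ.

```python
def as_standard_reification(s: str, p: str, o: str,
                            annotations: dict[str, str],
                            node: str = "_:stmt1") -> str:
    """Render an annotated triple using the rdf:Statement vocabulary."""
    pairs = [
        ("a", "rdf:Statement"),
        ("rdf:subject", s),
        ("rdf:predicate", p),
        ("rdf:object", o),
    ]
    pairs += list(annotations.items())
    body = " ;\n    ".join(f"{prop} {value}" for prop, value in pairs)
    # The first line asserts the triple itself; the rest describes it.
    return f"{s} {p} {o} .\n{node} {body} ."


def as_rdf_star(s: str, p: str, o: str, annotations: dict[str, str]) -> str:
    """Render the same annotated triple as an RDF-star quoted triple."""
    if not annotations:
        raise ValueError("at least one annotation is required")
    body = " ;\n    ".join(f"{prop} {value}" for prop, value in annotations.items())
    return f"{s} {p} {o} .\n<< {s} {p} {o} >> {body} ."


# Hypothetical example data.
note = {"ex:source": "ex:Census2020"}
print(as_standard_reification("ex:Paris", "ex:population", "2100000", note))
print(as_rdf_star("ex:Paris", "ex:population", "2100000", note))
```

The standard form needs four extra triples per annotated statement, while the RDF-star form attaches the annotation directly to the quoted triple, which is why a file in the second form needs a parser that understands the extension.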

Browser interactions during export vary. The data may be displayed directly (as with a view-source command), or the browser may offer a right-click menu option on the link to save the link target to a file without first displaying the result, prompting for the file location and name.

The export of multiple asset collections is supported by placing them into the Basket and then selecting the Export to S3 option.

TopBraid EDG Export Asset Collections Using Basket


This requires that at least one S3 bucket is configured by the EDG Administrator. Once set up, select a bucket using Select S3 Bucket for Exports under the Manage tab. The export will run in the background and, once exported, files will be saved to the S3 bucket. This option supports specifying that the exported files be compressed.

Export Collection with Includes as a File

Export <collection type> with Includes as a File is available under the Export tab. Two formats are available for export with includes: TriG and Zip File.

TopBraid EDG Export to Trig or Zip


The With inferences options add a dedicated graph named urn:x-topbraid:inferences, which contains any triples inferred via SHACL or SPIN rules.

Note

Inferences are computed on-the-fly and therefore the export may be slow.

Publish for Explorer Users

On an EDG server that is paired with a TopBraid Explorer Server (for read-only access), managers can publish an asset collection to the Explorer for viewing.

Note

The working copies of all published asset collections might or might not be viewable, depending on the Explorer’s administrative configuration.

Any manager of an asset collection can control its Explorer publication status by selecting Export > Publish Glossary for Explorer Users. The view shows a Status drop-down for the asset collection, which indicates whether the asset collection was ever Published or not (Unpublished).

TopBraid EDG Publish for Explorer Users Status Options


TopBraid EDG Publish Included Graphs Options


It also lists any included asset collections that might also require publication.

Ensure that all included graphs are either already present on the Explorer server or published along with the asset collection. Changing the status triggers one of the following actions.

| Current Status | Chosen Option | Result |
|---|---|---|
| Unpublished | Published | Sends a copy of the asset collection and selected includes to the Explorer server. Changes the source collection’s status to Published. |
| Published | Update Published Copy | Re-sends a current copy and selected includes to the Explorer server, overwriting the previous version(s). Keeps the source collection’s status as Published. |
| Published | Unpublished | Deletes the asset collection on the Explorer server. Changes the source collection’s status to Unpublished. |

GraphQL Queries

This option allows users of the collection to retrieve and modify asset collection data using new or saved GraphQL queries.

See GraphQL for details.

Asset Collection Specific Exports

These exports are only available for specific collection types.

Export Hierarchy Spreadsheet

Export Hierarchy Spreadsheet is available under the Export tab only for Taxonomies collections. It outputs an entire taxonomy tree in a spreadsheet-compatible format.

Export Concept Overview Spreadsheet

Export Concept Overview Spreadsheet is available under the Export tab only for Taxonomies collections. It outputs all taxonomy concepts in a spreadsheet-compatible format. For each concept, this output includes every available property as a spreadsheet column.

Export SharePoint Term Store

Export SharePoint Term Store is available under the Export tab only for Taxonomies collections. It updates the SharePoint term store to reflect the current contents of a collection. All terms in the collection are updated.

Export Ontology as RDF Schema File

Export Ontology as RDF Schema File is available under the Export tab only for Ontologies collections. It produces a simple, approximated RDF Schema version of the current ontology in SHACL. The output is in Turtle format.

Export Ontology as OWL File

Export Ontology as OWL File is available under the Export tab only for Ontologies collections. It produces a simple, approximated OWL version of the current ontology in SHACL. The output is in Turtle format.

Export Crosswalk as Spreadsheet

Export Crosswalk as Spreadsheet is available under the Export tab only for Crosswalks collections. It creates a comma-separated spreadsheet containing one row for each mapping in a Crosswalk.

Export Avro JSON

Export Avro JSON is available under the Export tab only for Data Assets collections. It creates one or more Avro files in JSON format for database tables and allows users to select the tables to export.
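For readers unfamiliar with Avro, a schema written in JSON describes a record with a name and a list of typed fields. The sketch below builds such a schema for one table; the function name, table name, columns and namespace are hypothetical, and the files EDG actually generates may contain different or additional attributes.

```python
import json


def avro_record_schema(table: str,
                       columns: list[tuple[str, str, bool]],
                       namespace: str = "example.edg") -> dict:
    """Build an Avro record schema from (name, avro_type, nullable) columns."""
    fields = []
    for name, avro_type, nullable in columns:
        if nullable:
            # Nullable columns are a union with "null" and default to null.
            fields.append({"name": name, "type": ["null", avro_type], "default": None})
        else:
            fields.append({"name": name, "type": avro_type})
    return {"type": "record", "name": table, "namespace": namespace, "fields": fields}


# Hypothetical table definition.
schema = avro_record_schema(
    "customer",
    [("id", "long", False), ("email", "string", True)],
)
print(json.dumps(schema, indent=2))
```

Each selected database table would correspond to one record schema of this general shape.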

Normalized Concepts

Normalized Concepts (Troubleshooting) is available under the Export tab only for Content Tagsets collections. It generates a normalized version of the Tagging vocabulary used in a Content Tag Set, as it would be seen by AutoClassifier (useful for troubleshooting). The output is in Turtle format.

Export to S3

Setup

TopBraid EDG Setup Options


  1. From Server Administration in EDG, configure your S3 bucket in External System Integration Management.

TopBraid EDG External System Integration Page


Choose the authentication type appropriate for your organization. For the AWS default credential provider chain, see https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html#credentials-default

Test the connection after saving.

TopBraid EDG Test Connection Button


  2. In EDG Configuration, select the bucket to use as the default location for S3 storage for your EDG application and save.

TopBraid EDG S3 Configuration Parameters Section

Using the S3 Export

S3 Export uses the Basket feature of EDG.

Add your collection(s) to the Basket:

TopBraid EDG Add Collection to Basket Dialog


Navigate to the Basket:

TopBraid EDG Navigation Bar with Basket Highlighted


Check the collections you want to export and select Export to S3:

TopBraid EDG Export to S3 Options Highlighted


The version is added automatically from the version metadata in the collection. Change any settings here as needed and then submit:

TopBraid EDG Export to S3 Form


The export runs in the background. An Administrator can check its status on the Server Administration – Scheduled Jobs page. Once the job completes, it is no longer listed on this page. Success or failure is written to the Tomcat logs, and notifications are sent if enabled.

TopBraid EDG Scheduled Jobs Page


If email is configured on the EDG application, the user who submits the export will get an email notification when it is complete.

TopBraid EDG S3 Export Complete Email


Folders are created automatically in S3 for each type of collection being exported.

TopBraid EDG S3 Export Folders Page
