Many organizations come to us looking for a search capability that better enables their business and eliminates the data stovepipe syndrome (think isolated, disparate, and yet related data). One example is a large, public financial institution that we are currently working with to implement semantic search. Their goal is similar to that of many companies, organizations, agencies, and other closed-door environments: a semantic search that serves business users and cuts across data stovepipes. Semantic search in this case becomes multidimensional (I could call it a many-legged beast) and presents many challenges:
- Integration, aggregation and similar considerations
- Enrichment techniques and custom domain logic (with the emphasis on custom)
- Structured and unstructured content (often with more than enough of the latter)
- Access and authorization…toss in security markings and ‘need to know’ if that isn’t exciting enough for you
- Data lifecycle and change…automation should come to mind
- Tens of users vs thousands of users (or even more)…each category of users with its own role-based access privileges and other requirements
- User interface and user experience…which often becomes most important
Each bullet listed above has, or deserves to have, books written on the topic. The point that I’m trying to make is that semantic search is a solution that has to be put in place for an organization and not simply a one-size-fits-all algorithm.
With TopBraid Enterprise Data Governance (EDG), I have the tools and capabilities that I need to implement semantic search in such an environment. At its core, EDG is a standards-based Semantic Web/Linked Data platform. This allows us to break the solution down into simpler and smaller components (often down to the level of the semantic triple).
- RDF is king! The Resource Description Framework (RDF) allows us to construct a common, machine-understandable language for a domain, down to the smallest or most important level of detail
- Meaning and knowledge are encoded in ontologies and SKOS taxonomies
- Web services and pipelines written in SWP (SPARQL Web Pages) retrieve, receive, and transform data (into RDF of course)
- SWP allows for model-driven user interfaces that adapt to the dynamically managed data graphs
- Unstructured content is processed and tagged using TopBraid EVN Tagger with AutoClassifier
This isn’t a comprehensive list of TopBraid EDG features but highlights some of the core building blocks of semantic search.
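To make the triple idea a little more concrete, here is a minimal sketch in plain Python. It is not TopBraid EDG code and does not use any EDG API: the `ex:` identifiers, the sample reports, and the helper functions are all hypothetical, and the triples are simple tuples rather than a real RDF store. It only illustrates how a SKOS-style `broader` hierarchy lets a search for one concept also find documents tagged with its narrower concepts.

```python
# Illustrative only: RDF-style triples as (subject, predicate, object) tuples.
# Every identifier below (ex:..., the reports, the helpers) is hypothetical.

SKOS_BROADER = "skos:broader"
SKOS_PREF_LABEL = "skos:prefLabel"
DCT_SUBJECT = "dcterms:subject"

TRIPLES = {
    # A tiny taxonomy: two concepts sit under "monetary policy".
    ("ex:MonetaryPolicy", SKOS_PREF_LABEL, "monetary policy"),
    ("ex:InterestRates", SKOS_PREF_LABEL, "interest rates"),
    ("ex:InterestRates", SKOS_BROADER, "ex:MonetaryPolicy"),
    ("ex:QuantitativeEasing", SKOS_PREF_LABEL, "quantitative easing"),
    ("ex:QuantitativeEasing", SKOS_BROADER, "ex:MonetaryPolicy"),
    # Tagged documents (as a tagger might produce).
    ("ex:report1", DCT_SUBJECT, "ex:InterestRates"),
    ("ex:report2", DCT_SUBJECT, "ex:QuantitativeEasing"),
    ("ex:report3", DCT_SUBJECT, "ex:MonetaryPolicy"),
}


def concept_for_label(label):
    """Return the concept whose preferred label matches, or None."""
    wanted = label.lower()
    for s, p, o in TRIPLES:
        if p == SKOS_PREF_LABEL and o == wanted:
            return s
    return None


def with_narrower(concept):
    """Return the concept plus everything transitively narrower than it."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for s, p, o in TRIPLES:
            if p == SKOS_BROADER and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def search(label):
    """Find documents tagged with the concept or any narrower concept."""
    concept = concept_for_label(label)
    if concept is None:
        return []
    concepts = with_narrower(concept)
    return sorted(s for s, p, o in TRIPLES if p == DCT_SUBJECT and o in concepts)


print(search("monetary policy"))  # ['ex:report1', 'ex:report2', 'ex:report3']
print(search("interest rates"))   # ['ex:report1']
```

In a real deployment the same question would be asked of a governed graph with a SPARQL query rather than a Python loop; the point here is only that the taxonomy, not the keyword, decides what counts as a match.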
I have also created a quick video on Using the TopBraid EDG Platform to Support Semantic Search. Watch it for an introduction to TopBraid EDG and a demonstration using unstructured, open source economic and financial reports from central banks and other financial institutions around the world.
I’m stoked to be part of the team bringing to life a highly configurable semantic search inside of TopBraid EDG and to support clients wanting to govern data and do search…better! There are lots of good things on the horizon. I’ll see you there.