MarkLogic: multi-model NoSQL database

August 09, 2017

Stephen Buxton, Senior Director of Product Management at MarkLogic, is excited to hear about the mainstreaming of semantics, made possible by an Enterprise-ready, Multi-model database (MarkLogic Server) and a rich toolset (including PoolParty).

Can you tell us something about your work/research focus?

MarkLogic is a multi-model NoSQL database. Real-world data very rarely fits naturally into a rectangle (rows and columns in a table). Rather than trying to model real-world entities as a complex collection of tables with anonymous relationships between them, our customers model entities as documents (stored physically as XML or JSON).

Using the document model, entities can be modeled much more easily and naturally. You can represent hierarchy, repeating fields, and sparse data natively. And you can be as strict or relaxed as you want about the schema – it’s trivial to “just add a column”.
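As a rough illustration of the points above, here is a minimal sketch in plain Python of two customer entities held as JSON-style documents. The field names (`name`, `phones`, `address`, `loyalty_tier`) and the `cities` helper are hypothetical, chosen only for this example; this is not MarkLogic's API.

```python
import json

# Hypothetical customer entities; every field name here is an illustrative assumption.
customers = [
    {
        "id": "c1",
        "name": {"first": "John", "last": "Smith"},          # hierarchy: nested object
        "phones": ["555-0100", "555-0101"],                  # repeating field: a list
        "address": {"city": "Amsterdam", "country": "NL"},
    },
    {
        "id": "c2",
        "name": {"first": "Jon", "last": "Doe"},
        "phones": [],
        # Sparse data: this customer has no address at all.
        # "Just add a column": a new field appears on this document only.
        "loyalty_tier": "gold",
    },
]


def cities(docs):
    """Return the city of every document that has an address, skipping the rest."""
    return [d["address"]["city"] for d in docs if "address" in d]


# The documents serialize to JSON and back without any schema definition.
serialized = json.dumps(customers, indent=2)
round_tripped = json.loads(serialized)
```

The second document simply omits `address` and carries an extra field; nothing else has to change to accommodate it.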

To the document model, add RDF triples. In the relational model, tables are related to each other via primary key/foreign key pairs, which are entirely anonymous. In the multi-model world, entities (documents) have explicit, named relationships (triples). Triples also give us ontologies that define a hierarchy of concepts/values, and a hierarchy of locations. So if you want to find a customer named John, your ontology will tell you that “Jon” is a common alternate spelling for “John”, so you should also search for “Jon”; it will also tell you the entities that represent customers, and the fields (elements/properties) in those entities that represent the customer’s first name.
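The John/Jon example can be sketched in a few lines of plain Python, with triples held as (subject, predicate, object) tuples next to the documents they describe. All identifiers and predicate names below (`hasAlternateSpelling`, `isA`, `firstNamePath`, `doc:c1`, and so on) are made up for illustration; a real deployment would use RDF IRIs and a query language such as SPARQL rather than these helper functions.

```python
# Triples as (subject, predicate, object) tuples; all names are hypothetical.
TRIPLES = {
    ("name:John", "hasAlternateSpelling", "name:Jon"),
    ("doc:c1", "isA", "Customer"),
    ("doc:c2", "isA", "Customer"),
    ("doc:o1", "isA", "Order"),
    ("doc:o1", "placedBy", "doc:c1"),            # an explicit, named relationship
    ("Customer", "firstNamePath", "name.first"),  # which field holds the first name
}

DOCUMENTS = {
    "doc:c1": {"name": {"first": "John", "last": "Smith"}},
    "doc:c2": {"name": {"first": "Jon", "last": "Doe"}},
    "doc:o1": {"total": 42},
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is a known triple."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """All subjects s such that (s, predicate, obj) is a known triple."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def get_path(doc, dotted_path):
    """Follow a dotted path such as 'name.first' into a nested document."""
    for key in dotted_path.split("."):
        doc = doc[key]
    return doc


def expand_name(name):
    """The name itself plus any alternate spellings recorded for it."""
    alternates = objects(f"name:{name}", "hasAlternateSpelling")
    return {name} | {a.split(":", 1)[1] for a in alternates}


def find_customers(first_name):
    """Customer documents whose first name matches the name or an alternate."""
    names = expand_name(first_name)
    path = next(iter(objects("Customer", "firstNamePath")))
    return sorted(
        doc_id
        for doc_id in subjects("isA", "Customer")
        if get_path(DOCUMENTS[doc_id], path) in names
    )
```

Here `find_customers("John")` consults the triples three times: to expand the name, to find which documents are customers, and to learn which field holds the first name. Note that the spelling relationship in this sketch is stored in one direction only, so a search for "Jon" is not expanded to "John".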

To the document model and the RDF model, add SQL. Some things really do fit neatly into rectangles, such as metadata. You can create an SQL lens over your entities and relationships to do SQL queries, calculate aggregates, or hook up a BI tool such as Tableau.
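To make the idea of an SQL lens concrete, the sketch below projects the rectangular parts of some hypothetical order documents into rows and runs an aggregate query over them. It uses Python's built-in `sqlite3` module purely as a stand-in; the table name `order_view`, its columns, and the document fields are assumptions for this example and say nothing about how MarkLogic itself exposes SQL.

```python
import sqlite3

# Hypothetical order documents; field names are illustrative assumptions.
ORDERS = [
    {"id": "o1", "customer": "c1",
     "lines": [{"sku": "A", "qty": 2, "price": 5.0},
               {"sku": "B", "qty": 1, "price": 10.0}]},
    {"id": "o2", "customer": "c1",
     "lines": [{"sku": "A", "qty": 1, "price": 5.0}]},
    {"id": "o3", "customer": "c2", "lines": []},
]


def rows_from_documents(docs):
    """Project each document onto a flat row: the 'rectangle' part of the data."""
    for d in docs:
        total = sum(line["qty"] * line["price"] for line in d["lines"])
        yield (d["id"], d["customer"], len(d["lines"]), float(total))


def totals_by_customer(docs):
    """Load the projected rows into an in-memory table and aggregate with SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE order_view ("
            "order_id TEXT, customer_id TEXT, line_count INTEGER, total REAL)"
        )
        conn.executemany(
            "INSERT INTO order_view VALUES (?, ?, ?, ?)",
            rows_from_documents(docs),
        )
        return conn.execute(
            "SELECT customer_id, COUNT(*), SUM(total) "
            "FROM order_view GROUP BY customer_id ORDER BY customer_id"
        ).fetchall()
    finally:
        conn.close()
```

The documents stay as they are; only the fields that fit neatly into rows and columns are exposed to SQL, which is also what a BI tool would see.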

Which trends and challenges do you see for linked data/semantic web, and why are they important for MarkLogic?

We see a couple of trends in the linked data/semantic web world.

First, more and more people want to use graphs to create Enterprise-grade applications over business-critical data. For that, they need a triple store with all the Enterprise features you’d expect in a relational database – transactions, backup/restore, replication, failover, security, and so on.

Second, people in the linked data/semantic web community are beginning to see that, just as not all data is rectangle-shaped, not all data is suitable for representing as RDF triples either. This has long been the downfall of the semantic web vision: the idea that just because graphs are very good at representing some of your data, you must go all-in and represent all your data as triples.

With an Enterprise-ready, Multi-model database (documents+triples+SQL), people can finally make use of the power of semantics by combining it with documents and SQL.

What are your expectations for Semantics 2017 in Amsterdam, and what makes it special for MarkLogic as a company?

MarkLogic has been busy getting the message out about this new way of using semantics – in a Multi-model database – for some time. Semantics 2017 gives us an opportunity to present this idea to a broad audience representing the cream of the semantics community.

Semantics 2017 is also about learning what others are doing with semantics: the latest research, the latest tools, and creative end-user deployments.