Chapter 8. Mapping graph models

Mappings can be used to apply transformations to RDF as it is imported into Neo4j, and to transform the vocabulary used in a Neo4j graph as it is exported through the different RDF export methods described in Chapter 7, Exporting RDF data. Mappings operate on terminology only; they will not modify the structure of the graph. In other words, as we will see in this section, you will be able to use them to rename a property, a relationship or a label, but not to change a property into a relationship.

8.1. Public Vocabularies/Ontologies

A public graph model is also called an Ontology (or a schema, or a vocabulary). We will not go into the details of the subtle differences between each flavour in this manual. All we need to know is that a graph model normally defines a set of terms (categories, properties, relationships…​) and how they relate to each other. Some common examples are FIBO or the Gene Ontology. Public vocabularies like the ones mentioned typically identify their terms uniquely by using namespaces, so roughly speaking, a namespace identifies a vocabulary (or at least a portion of it).

To create a mapping with NSMNTX we need to do two things: first, create a reference to a public schema, and then use that reference to create individual mappings from elements in the Neo4j schema to elements in the public schema. Here’s how to do it:

8.2. Defining mappings

Let’s say we want to map our movie database schema to the public vocabulary. First, we’ll create a reference to the vocabulary, passing the base URI and a prefix to be used in the RDF serialisation. You can use standard ones or a random acronym; just make sure they’re both unique in your mapping definition.

call semantics.mapping.addSchema("","sch")

The call to create a reference to a public vocabulary will produce as output the newly created reference, or alternatively an error message indicating what went wrong:

│"prefix"│"namespace"         │
│"sch"   │""│

We can create as many references to public vocabularies as needed, and there is also a useful method (mapping.addCommonSchemas) that can be used to include a set of the most common schemas in one go:

call semantics.mapping.addCommonSchemas()

References to schemas can be removed using the mapping.dropSchema method, passing as its single parameter the exact URI of the vocabulary we want to have deleted. Notice that this will remove both the schema and all element mappings defined on it.

call semantics.mapping.dropSchema("")

And we can list the currently existing schemas by running mapping.listSchemas. This method takes an optional string parameter that can be used to filter the list to the ones that match a particular search string in the schema URI or in the prefix. If we run the following after running mapping.addCommonSchemas:

call semantics.mapping.listSchemas("rdf")

We would get the following results:

│"prefix"│"namespace"                                  │
│"rdfs"  │""      │
│"rdf"   │""│
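The behaviour of these schema-reference procedures can be sketched with a small in-memory model. The following Python is purely illustrative (the class, method names and example URI are assumptions, not the NSMNTX implementation):

```python
class SchemaRegistry:
    """Toy in-memory model of NSMNTX schema references (namespace URI -> prefix)."""

    def __init__(self):
        self.schemas = {}  # namespace URI -> prefix

    def add_schema(self, namespace, prefix):
        # Both the namespace and the prefix must be unique in the registry.
        if namespace in self.schemas or prefix in self.schemas.values():
            raise ValueError("namespace or prefix already registered")
        self.schemas[namespace] = prefix
        return prefix, namespace

    def drop_schema(self, namespace):
        # Dropping a schema would also drop mappings defined on it (not modelled here).
        self.schemas.pop(namespace, None)

    def list_schemas(self, search=""):
        # Filter on a substring of either the namespace URI or the prefix.
        return [(p, ns) for ns, p in self.schemas.items()
                if search in ns or search in p]

reg = SchemaRegistry()
reg.add_schema("http://example.org/voc#", "sch")
print(reg.list_schemas("sch"))  # [('sch', 'http://example.org/voc#')]
```

The same substring filter applies to both columns, which is why searching for "rdf" above matched the rdf and rdfs prefixes at once.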

Once we have defined a reference to a public vocabulary/schema, we can create actual mappings for elements in our graph to elements in the public schemas, using the mapping.addMappingToSchema procedure. This method takes three parameters: the URI of a public schema previously added via mapping.addSchema, and a pair formed by the name of the element in our graph (a property name, a label or a relationship type) and the matching element in the public schema.

The following example shows how to define a map from a CHILD_CATEGORY relationship type in a Neo4j graph to the skos:narrower relationship (or ObjectProperty in RDF terminology).

call semantics.mapping.addMappingToSchema("", "CHILD_CATEGORY", "narrower")

Just like we did with schema references, we can list existing mappings using mapping.listMappings and filter the list with an optional search string parameter to return only mappings where either the graph element name or the schema element name match the search string.

call semantics.mapping.listMappings()

Producing a listing with the following structure:

│"schemaNs"                            │"schemaPrefix"│"schemaElement"│"elemName"      │
│""│"skos"        │"narrower"     │"CHILD_CATEGORY"│

It is also possible to remove individual mappings with mapping.dropMapping, passing as its single parameter the name of the graph model element on which the mapping is defined.

call semantics.mapping.dropMapping("CHILD_CATEGORY")
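The add/list/drop lifecycle of element mappings can be modelled in the same illustrative style (the function names and namespace URI below are assumptions for the sketch, not NSMNTX code):

```python
# Toy model of element mappings: graph element name -> (schema namespace, schema element).
mappings = {}

def add_mapping(schema_ns, graph_elem, schema_elem):
    mappings[graph_elem] = (schema_ns, schema_elem)

def list_mappings(search=""):
    # The optional search string matches either the graph element
    # name or the schema element name.
    return [(ns, elem, graph) for graph, (ns, elem) in mappings.items()
            if search in graph or search in elem]

def drop_mapping(graph_elem):
    # A mapping is removed by the graph element name it is defined on.
    mappings.pop(graph_elem, None)

add_mapping("http://example.org/skos#", "CHILD_CATEGORY", "narrower")
print(list_mappings("CHILD"))
drop_mapping("CHILD_CATEGORY")
print(list_mappings())  # []
```

Note that because a mapping is keyed by the graph element name, defining a second mapping on the same element simply replaces the first.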

8.3. Mappings for export

Let’s look at the case where we want to publish a graph in Neo4j but we want to map it to our organisation’s canonical model, our Enterprise Ontology or any public vocabulary. For this example we’re going to use the Northwind database in Neo4j (:play northwind-graph) and the public vocabulary.

Here’s the script that defines the reference to the public vocabulary and a few individual mappings for elements in the Northwind database in Neo4j.

//set parameter uri ->   :param uri: ""

CALL semantics.mapping.addSchema($uri,"sch");
CALL semantics.mapping.addMappingToSchema($uri,"Order","Order");
CALL semantics.mapping.addMappingToSchema($uri,"orderID","orderNumber");
CALL semantics.mapping.addMappingToSchema($uri,"orderDate","orderDate");

CALL semantics.mapping.addMappingToSchema($uri,"ORDERS","orderedItem");

CALL semantics.mapping.addMappingToSchema($uri,"Product","Product");
CALL semantics.mapping.addMappingToSchema($uri,"productID","productID");
CALL semantics.mapping.addMappingToSchema($uri,"productName","name");

CALL semantics.mapping.addMappingToSchema($uri,"PART_OF","category");

CALL semantics.mapping.addMappingToSchema($uri,"categoryName","name");

After running the previous script, we can check that the mappings have been correctly defined with:

call semantics.mapping.listMappings()

That should return:

│"schemaNs"          │"schemaPrefix"│"schemaElement"│"elemName"    │
│""│"sch"         │"Order"        │"Order"       │
│""│"sch"         │"orderNumber"  │"orderID"     │
│""│"sch"         │"orderDate"    │"orderDate"   │
│""│"sch"         │"orderedItem"  │"ORDERS"      │
│""│"sch"         │"Product"      │"Product"     │
│""│"sch"         │"productID"    │"productID"   │
│""│"sch"         │"name"         │"productName" │
│""│"sch"         │"category"     │"PART_OF"     │
│""│"sch"         │"name"         │"categoryName"│

Now we can see these mappings in action by running any of the RDF generating methods described in Chapter 7, Exporting RDF data (/describe/id, /describe/find/ or /cypher). Let’s use the /cypher method to serialise as RDF an order given its orderID.

:POST /rdf/cypher
{ "cypher" : "MATCH path = (n:Order { orderID : '10785'})-[:ORDERS]->()-[:PART_OF]->(:Category { categoryName : 'Beverages'}) RETURN path " , "format": "RDF/XML" , "mappedElemsOnly" : true }

The Cypher query uses the elements in the Neo4j graph, but the generated RDF uses vocabulary elements; the mapping we just defined bridges the two. Note that the mapping is completely dynamic, which means that any change to the mapping definition will be applied to all subsequent requests.

Elements for which no mapping has been defined will use the default Neo4j schema but we can specify that only mapped elements are to be exported by setting the mappedElemsOnly parameter to true in the request.
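The name-resolution rule just described can be sketched as follows (an illustrative model only; the mapping entries and the default namespace string are assumptions, not the actual URIs NSMNTX generates):

```python
# Export-time name resolution: mapped elements are serialised with the
# public-schema name; unmapped elements fall back to the default
# Neo4j-based vocabulary unless mappedElemsOnly is set.
mappings = {"ORDERS": "sch:orderedItem", "orderDate": "sch:orderDate"}

def export_name(elem, mapped_elems_only=False):
    if elem in mappings:
        return mappings[elem]
    if mapped_elems_only:
        return None  # element is skipped entirely
    return "neo4j://defaultvocabulary#" + elem  # default namespace is an assumption

print(export_name("ORDERS"))                            # sch:orderedItem
print(export_name("shipCity"))                          # falls back to the default vocabulary
print(export_name("shipCity", mapped_elems_only=True))  # None -> not exported
```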

Here’s the output generated by the previous request:

<?xml version="1.0" encoding="UTF-8"?>

<rdf:Description rdf:about="neo4j://com.neo4j/indiv#786">
	<rdf:type rdf:resource=""/>
	<sch:orderDate>1997-12-18 00:00:00.000</sch:orderDate>
</rdf:Description>

<rdf:Description rdf:about="neo4j://com.neo4j/indiv#74">
	<rdf:type rdf:resource=""/>
	<sch:name>Rhönbräu Klosterbier</sch:name>
</rdf:Description>

<rdf:Description rdf:about="neo4j://com.neo4j/indiv#80">
</rdf:Description>

<rdf:Description rdf:about="neo4j://com.neo4j/indiv#786">
	<sch:orderedItem rdf:resource="neo4j://com.neo4j/indiv#74"/>
</rdf:Description>

<rdf:Description rdf:about="neo4j://com.neo4j/indiv#74">
	<sch:category rdf:resource="neo4j://com.neo4j/indiv#80"/>
</rdf:Description>


There’s another example of using mappings for export in this blog post.

8.4. Mappings for import

In this section we’ll see how to use mappings to apply changes to an RDF dataset on ingestion using the RDF import procedures described in Chapter 3, Importing RDF data.

Let’s say we are importing into Neo4j the Open PermID dataset from Thomson Reuters. Here is a small fragment of the 'Person' file:

@prefix vcard: <> .
@prefix xsd: <> .
@prefix permid: <> .

  a vcard:Person ;
  vcard:given-name "Keith"^^xsd:string .

  vcard:family-name "Peltz"^^xsd:string ;
  vcard:given-name "Maxwell"^^xsd:string ;
  vcard:additional-name "S"^^xsd:string ;
  a vcard:Person .

  vcard:family-name "Benner"^^xsd:string ;
  vcard:given-name "Thomas"^^xsd:string ;
  a vcard:Person ;
  vcard:friend-of <> .

As part of the import process, we want to drop the namespaces (as described in Chapter 3, Importing RDF data, this can be done using the handleVocabUris: "IGNORE" configuration) but, in this case, we also want to create more Neo4j-friendly names for the properties. We want to get rid of the dashes in property names like given-name or additional-name and use 'camelCase' notation instead. The way to tell NSMNTX to do that is by defining a model mapping and setting the handleVocabUris parameter on import to 'MAP'.
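The dash-to-camelCase renaming has to be spelled out mapping by mapping, but the pairs are mechanical to generate. A small helper could produce them (hypothetical, not part of NSMNTX):

```python
def to_camel(name):
    """Turn a dash-separated RDF local name into camelCase ('given-name' -> 'givenName')."""
    head, *rest = name.split("-")
    return head + "".join(part.capitalize() for part in rest)

# Generate the (Neo4j name, public schema name) pairs for the mapping script.
properties = ["given-name", "family-name", "additional-name"]
pairs = [(to_camel(p), p) for p in properties]
print(pairs)
# [('givenName', 'given-name'), ('familyName', 'family-name'), ('additionalName', 'additional-name')]
```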

We’ll start by defining a mapping like the one we defined for exporting RDF. Note that the properties we want to map are all in the same vcard vocabulary. The following script should do the job:

WITH [{ neoSchemaElem: "givenName", publicSchemaElem: "given-name" },
      { neoSchemaElem: "familyName", publicSchemaElem: "family-name" },
      { neoSchemaElem: "additionalName", publicSchemaElem: "additional-name" },
      { neoSchemaElem: "FRIEND_OF", publicSchemaElem: "friend-of" }] AS mappings,
     "" AS vcardUri

CALL semantics.mapping.addSchema(vcardUri,"vcard") YIELD namespace
UNWIND mappings AS m
CALL semantics.mapping.addMappingToSchema(vcardUri, m.neoSchemaElem, m.publicSchemaElem) YIELD schemaElement
RETURN count(schemaElement) AS mappingsDefined

Just like we did in the previous section, we define a vocabulary with mapping.addSchema and then we add individual mappings for elements in the vocabulary with mapping.addMappingToSchema. If there were multiple vocabularies to map, we would just need to repeat the process for each of them.

Now we can check that the mappings are correctly defined by running:

CALL semantics.mapping.listMappings()

which produces:

│"schemaNs"                        │"schemaPrefix"│"schemaElement"  │"elemName"      │
│""│"vcard"       │"given-name"     │"givenName"     │
│""│"vcard"       │"family-name"    │"familyName"    │
│""│"vcard"       │"additional-name"│"additionalName"│
│""│"vcard"       │"friend-of"      │"FRIEND_OF"     │

It is important to note that when using the option handleVocabUris: "MAP", all non-mapped vocabulary elements will get the default treatment they get when the 'IGNORE' option is selected.
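This resolution rule on import can be sketched as follows (illustrative only; the URIs and names are assumptions): mapped URIs get the Neo4j name from the mapping, while unmapped URIs get the 'IGNORE' treatment, i.e. the namespace is dropped and the local name is kept as-is.

```python
# Toy model of property-name resolution with handleVocabUris: 'MAP'.
mappings = {"http://example.org/vcard#given-name": "givenName"}

def local_name(uri):
    # 'IGNORE' treatment: strip everything up to the last '#' or '/'.
    for sep in ("#", "/"):
        if sep in uri:
            return uri.rsplit(sep, 1)[1]
    return uri

def import_name(uri):
    # Mapped URIs use the mapping; everything else falls back to 'IGNORE'.
    return mappings.get(uri, local_name(uri))

print(import_name("http://example.org/vcard#given-name"))  # givenName (mapped)
print(import_name("http://example.org/vcard#hasTitle"))    # hasTitle (IGNORE fallback)
```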

Once the mappings are defined, we can run the import process as described in Chapter 3, Importing RDF data with the mentioned config param handleVocabUris: 'MAP' as follows:

CALL semantics.previewRDF("","Turtle", { handleVocabUris: 'MAP' })

After data load, we will be able to query the imported graph with much friendlier Cypher:

MATCH (n:Person) RETURN n.uri AS uri, n.familyName as familyName LIMIT 10

to get:

│"uri"                             │"familyName"  │
│""│null          │
│""│"Benner"      │
│""│"Peltz"       │

The combination of a mapping definition plus the use of the handleVocabUris: 'MAP' configuration can be applied not only to the semantics.importRDF procedure but also to the preview ones, semantics.previewRDF and semantics.previewRDFSnippet.