During my career I’ve been back and forth between software product companies (from the very large to startups), and end-user companies (from manufacturing to banking), always with a focus on transaction processing, data management and distributed computing. I was closely involved in the burst of thinking in the early-mid 2000s about flexible-outcome, controlled-visibility nested transactions (Business Transaction Protocol, WS-BA).
Neo4j attracted my attention while I was helping to design a data integration platform created by a 30-person “internal product” team that I had brought together at Barclays Bank with CIO sponsorship from 2011 onwards.
In the Barclays project we built an API for publishing standardised snapshot data for end-of-day processing, and a streaming query processor that kicked off as data began to be produced and spat out customised inputs for tens of downstream systems.
There were lots of hard technical problems (like inter-continental WAN streaming, parallel production and consumption, and multi-language support), but two areas stand out for me.
One was the querying of data that went beyond the relational model: the designers produced a rather clean, intuitive integration of SQL with XPath (and some early, prototype-level thinking about JSON querying and data extraction) to handle complex, nested document data. The other was the “project we never really got to”: how to describe, manage and track the dependency graph of processes and data elements involved in the interactions of hundreds of systems, with thousands of data feeds, at the end of every business day. That graph was just a part of the bigger graph of enterprise systems and business processes.
So: graph-like problems plus post-relational data querying were two key things which brought me to Neo4j. When I joined Neo Technology last year I was thrilled to have the opportunity to act as product manager for the openCypher project and also to be invited to join the Cypher Language Group.
In late 2015, Neo Technology announced an initiative to make Cypher – a declarative language for graph querying (both read and insert/update) – into a living and evolving industry standard. We want the graph technology market as a whole to grow, and we want to see the same gains for tooling, BI, graph visualization and uniform skillsets that SQL has brought to relational DBMS. We’d like to see Cypher become the “SQL of Graphs”.
Since then, more and more property graph databases and graph querying features have been appearing, from young startups like Dgraph and Bitnine to unicorn-grade companies like DataStax and MongoDB. In the last six months we’ve seen increasing numbers of plans or products from big established database vendors like SAP (HANA Graph), Microsoft, IBM and Oracle.
In the meantime, the openCypher engineers at Neo Technology have been producing grammar specifications in EBNF and ANTLR form, and a comprehensive Technology Compatibility Kit (TCK) using hundreds of Cucumber scenarios to define conformance with the Cypher language. Tools vendors (like Neueda, who have an increasingly popular IntelliJ IDEA Cypher language plugin), stimulating research projects and more and more commercial vendors are all beginning to use Cypher in earnest, aided by the openCypher tooling.
SAP HANA Graph’s December 2016 announcement of Cypher support is a good example of Cypher’s maturation as a well-known, powerful and productive declarative language in the graph data world. For business users, developers, technology buyers, graph database and graph analytics vendors, this is an important process.
Which is why we at Neo Technology are very pleased to have organised the first openCypher Implementers Meeting (oCIM), taking place next week, on 8 February (the day before the Linked Data Benchmark Council [LDBC] Technical User Committee [TUC]), at SAP’s global HQ in Walldorf, in western Germany. The meeting is for companies and developer or research groups working actively on, or thinking about using, Cypher in their running code and products.
You can find the first oCIM agenda right here. We will have attendance and speakers from quite a few organisations for a packed programme of talks, showcases and discussions. We’re anticipating a permanent openCypher Implementers Group coming out of the meeting to carry forward language evolution on a broad, industry-consensus basis, as a stepping stone to a more formal open standard language.
You can see the topics in the programme linked above: it’s a moment where some very important language expansions are being considered.
It’s hard to choose, but for me the two biggest oCIM highlights are the beginning of work on a formal, mathematical specification of Cypher, and the whole area of multiple graph processing.
When Cypher becomes composable (so queries can take in multiple, named graphs as inputs and produce graphs as outputs) then all kinds of possibilities open up, from views (which are key to fine-grained data access control) and multi-database/multi-tenancy right through to multi-engine graph processing pipelines (Neo4j OLTP to Apache Spark OLAP and back again, for example).
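To make the idea of composability a little more concrete, here is a minimal sketch in plain Python. It is not Cypher syntax and not anything the language group has agreed: the `Graph` class, the `union` and `view` functions and the `catalog` of named graphs are all hypothetical names invented for this illustration. The only point is that once every operation takes graphs in and hands a graph back, the results can be named, chained and restricted like any other graph.

```python
from dataclasses import dataclass


@dataclass
class Graph:
    """A tiny property graph (hypothetical, for illustration only)."""
    nodes: dict          # node id -> dict of properties
    edges: frozenset     # set of (source id, relationship type, target id)


def union(left, right):
    """Graphs in, graph out: combine the nodes and edges of two graphs."""
    return Graph({**left.nodes, **right.nodes}, left.edges | right.edges)


def view(graph, predicate):
    """Graph in, graph out: keep nodes passing the predicate and the edges between them."""
    nodes = {n: props for n, props in graph.nodes.items() if predicate(props)}
    edges = frozenset(e for e in graph.edges if e[0] in nodes and e[2] in nodes)
    return Graph(nodes, edges)


# A catalog of named graphs (assumed sample data).
catalog = {
    "hr": Graph(
        {"alice": {"dept": "risk"}, "bob": {"dept": "ops"}},
        frozenset({("alice", "KNOWS", "bob")}),
    ),
    "it": Graph(
        {"bob": {"dept": "ops"}, "db1": {"dept": "ops"}},
        frozenset({("bob", "OWNS", "db1")}),
    ),
}

# Compose: two named inputs -> one output graph, which is itself named
# and can feed further queries (or act as an access-controlled view).
combined = union(catalog["hr"], catalog["it"])
catalog["ops_view"] = view(combined, lambda props: props["dept"] == "ops")

print(sorted(catalog["ops_view"].nodes))
print(sorted(catalog["ops_view"].edges))
```

Because `ops_view` is just another graph in the catalog, a user could be granted access to it without ever seeing the `risk` data, which is the kind of fine-grained access control that views make possible.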
A big welcome to all the openCypher implementers who will be joining us at Walldorf next week! I’m looking forward to posts on the results of the Implementers Meeting in ten days or so, and there should be some interesting presentation slides emerging straight after the meeting has finished.