Introduction
LARUS Business Automation has been the first Neo4j Italian partner since 2013, but let’s first take a step back to winter 2010 to talk about our first major projects with Neo4j.
At the time, we were working for one of the biggest Italian retailers, and we had the honour of meeting Jim Webber to talk about his “Guerilla SOA” architecture. We were fascinated by his book REST in Practice (co-authored with Ian Robinson, Director of Customer Success at Neo Technology, and Savas Parastatidis) and by the idea of a lightweight architecture that could avoid the need for an ESB. A few months later, Jim took the position of Chief Scientist with Neo Technology, so we started experimenting with something that sounded like magic: a graph database!
Our first major involvement with Neo4j was modelling article hierarchies and clusters in a graph in order to speed up very slow recursive queries on the articles table of the existing relational database. Even though Neo4j was still in the 1.x series (i.e., without labels and all the great features that the current Neo4j version brings with it), the results were amazing.
We were entering the world of graphs, and a couple of years later LARUS officially became the first Neo4j partner in Italy.
Thanks to Stefan Armbruster, we were able to become a Neo4j certified trainer, and thanks to Michael Hunger, we gave our first talk on Neo4j at an international conference. We owe a debt of gratitude to both of these friends.
Last year, our Neo4j evangelization journey across Italy resulted in the implementation of three important proof-of-concept projects (POCs). The first was at Barclays around fraud detection, one of the most important use cases for Neo4j. The second was for Veneto Bank, where Neo4j has been adopted to discover, capture and make sense of complex communication channels between customer/internal applications and external systems, in order to clearly determine the relationships and dependencies between all these components and run IT governance effectively (Veneto Bank will also be presenting at GraphConnect Europe!).
Finally, the third POC was a project with the Catholic University of Milan for the digitization of the university library, which uses Neo4j as a recommendation and graph-based search engine.
These experiences have led us, at the beginning of 2016, to deliver our first international consulting services. More and more companies are adopting Neo4j, and we are happy to help them by validating the graph models they have designed, checking the configuration of their production environments and tuning the Cypher queries written by their teams.
As always, earlier this year we met to discuss what we would like to present at our favorite event: GraphConnect Europe. As LARUS is establishing itself as an authority on Neo4j integration topics, on April 26 we’ll be happy to introduce all the GraphConnect participants to the following two integration projects:
- A bidirectional integration API between Neo4j and Couchbase
- A brand-new Neo4j 3.x JDBC Driver with Bolt
The New Neo4j-Couchbase Connector
Moving from a pure relational database persistence solution to a polyglot persistence approach is the key to solving dedicated use cases effectively.
Each database is best at a particular job (e.g., handling data volume or data complexity), and when combined appropriately, several databases can provide the best of both worlds: application development productivity and efficient large-scale data management.
While working on many integration projects, we at LARUS have seen the need for well-engineered connectors, because combining different databases requires a working data-synchronization and communication solution. So we started implementing a brand-new Neo4j-Couchbase Connector: an API that provides sound and performant bidirectional integration between Neo4j and Couchbase. In one direction, it maps selected documents and their updates to normalized graph structures; in the other, it sends graph updates to be integrated into documents.
The Neo4j-Couchbase Connector consists of four components, sketched in the architectural diagram below:
Where:
- The “Neo4j JSON Loader” is responsible for storing JSON documents into a configurable normalized Neo4j graph structure, and it’s implemented as an independent Neo4j server extension;
- The “Couchbase Mutation Listener” is responsible for detecting newly created, updated and/or deleted documents in Couchbase and retrieving their JSON representation to be sent to the “Neo4j JSON Loader”;
- The “Couchbase JSON Loader” is responsible for storing and/or updating JSON documents in Couchbase;
- The “Neo4j Mutation Listener” is responsible for detecting newly created, updated and/or deleted nodes and relationships in Neo4j and retrieving their properties and connections to be sent to the “Couchbase JSON Loader”. This component is implemented as a Neo4j server extension too.
For example, a JSON document like this one:
Person:
{
  "firstname" : "Lorenzo",
  "lastname" : "Speranzoni",
  "age" : 41,
  "job" : {
    "role" : "CEO",
    "company" : {
      "name" : "LARUS Business Automation",
      "vat" : "03540680273",
      "address" : {
        "street" : "Via B. Maderna, 7",
        "zipCode" : 30174,
        "city" : "Mestre",
        "province" : "Venice",
        "country" : "Italy",
        "type" : "Address"
      },
      "type" : "Company"
    },
    "type" : "Job"
  },
  "type" : "Person"
}
combined with the following domain description:
JsonObjectDescriptor addressObjectDescriptor =
    new JsonObjectDescriptor("Address", Arrays.asList("street", "zipCode"), "type");
JsonObjectDescriptor companyObjectDescriptor =
    new JsonObjectDescriptor("Company", Arrays.asList("vat"), "type");
JsonObjectDescriptor jobObjectDescriptor =
    new JsonObjectDescriptor("Job", Arrays.asList("role"), "type");
JsonObjectDescriptor personObjectDescriptor =
    new JsonObjectDescriptor("Person", Arrays.asList("firstname", "lastname"), "type");
would be translated into this Cypher statement:
MERGE (person:Person { firstname: 'Lorenzo', lastname: 'Speranzoni' })
  ON CREATE SET person.couchbaseId = '1234567890QWERTY', person.age = 41
MERGE (job:Job { role: 'CEO' })
  ON CREATE SET job.couchbaseId = '1234567890QWERTY', job.type = 'Job'
MERGE (company:Company { vat: '03540680273' })
  ON CREATE SET company.couchbaseId = '1234567890QWERTY', company.name = 'LARUS Business Automation'
MERGE (address:Address { street: 'Via B. Maderna, 7', zipCode: 30174 })
  ON CREATE SET address.couchbaseId = '1234567890QWERTY', address.city = 'Mestre', address.province = 'Venice', address.country = 'Italy'
MERGE (company)-[:COMPANY_ADDRESS]->(address)
MERGE (job)-[:JOB_COMPANY]->(company)
MERGE (person)-[:PERSON_JOB]->(job)
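To make the mapping idea more concrete, here is a minimal sketch of how a nested document plus a set of unique keys per type could be turned into MERGE clauses of the same shape. It is not the connector's actual code: the class `JsonToCypherSketch`, its methods, and the use of a plain `Map` in place of `JsonObjectDescriptor` are all assumptions made for illustration. The document is passed in as nested Java maps to avoid depending on a JSON library, and values are inlined as literals only to mirror the statement above; real code would more likely use query parameters.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch: turns a nested JSON-like map into Cypher MERGE clauses. */
final class JsonToCypherSketch {

    /** Renders a Java value as a Cypher literal (strings are quoted and escaped). */
    static String literal(Object value) {
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }
        return "'" + value.toString().replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    /**
     * Emits one MERGE for the given object, then recurses into nested objects.
     * The Cypher variable is the lower-cased type, so this sketch assumes each
     * type occurs at most once per document. Arrays are not handled.
     */
    @SuppressWarnings("unchecked")
    static String walk(Map<String, Object> json, String docId, String typeAttribute,
                       Map<String, List<String>> uniqueKeys,
                       List<String> nodes, List<String> rels) {
        Object type = json.get(typeAttribute);
        if (type == null) {
            throw new IllegalArgumentException("Missing type attribute: " + typeAttribute);
        }
        String label = type.toString();
        String var = label.toLowerCase();
        List<String> keyNames = uniqueKeys.getOrDefault(label, Collections.emptyList());

        StringJoiner keys = new StringJoiner(", ", "{ ", " }");
        StringJoiner sets = new StringJoiner(", ");
        sets.add(var + ".couchbaseId = " + literal(docId));
        List<Map<String, Object>> children = new ArrayList<>();

        for (Map.Entry<String, Object> e : json.entrySet()) {
            Object value = e.getValue();
            if (e.getKey().equals(typeAttribute)) {
                continue; // the type attribute becomes the node label
            } else if (value instanceof Map) {
                children.add((Map<String, Object>) value); // nested object -> own node
            } else if (keyNames.contains(e.getKey())) {
                keys.add(e.getKey() + ": " + literal(value)); // identifies the node
            } else {
                sets.add(var + "." + e.getKey() + " = " + literal(value));
            }
        }
        nodes.add("MERGE (" + var + ":" + label + " " + keys + ") ON CREATE SET " + sets);

        // Relationships are collected after the recursion, so the deepest come first.
        for (Map<String, Object> child : children) {
            String childVar = walk(child, docId, typeAttribute, uniqueKeys, nodes, rels);
            rels.add("MERGE (" + var + ")-[:" + (var + "_" + childVar).toUpperCase()
                    + "]->(" + childVar + ")");
        }
        return var;
    }

    /** Returns all node clauses followed by all relationship clauses, one per line. */
    static String toCypher(Map<String, Object> json, String docId, String typeAttribute,
                           Map<String, List<String>> uniqueKeys) {
        List<String> nodes = new ArrayList<>();
        List<String> rels = new ArrayList<>();
        walk(json, docId, typeAttribute, uniqueKeys, nodes, rels);
        nodes.addAll(rels);
        return String.join("\n", nodes);
    }
}
```

Calling `toCypher` with the Person document built as nested `LinkedHashMap`s, the Couchbase document ID, `"type"` as the type attribute and the unique keys listed in the descriptors yields one MERGE per nested object, followed by the relationship clauses from the innermost object outwards.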
The New Neo4j-JDBC Driver
At LARUS, we first imagined the new version of the Neo4j JDBC Driver as a solid, efficient, multi-protocol driver. Something stable that people would love to use.
The current driver architecture provides different modules for different protocols. For example, if we want to connect to our database using the new JDBC driver through the brand-new Bolt protocol, we only need the Bolt module, not the whole package. This keeps the dependency light, with no need to include the other modules in our projects.
Of course, having a dedicated protocol for connecting to a graph database is great. One of our goals was to build a driver that delivers the expected performance without creating a bottleneck in the application. For example, the Bolt module was developed on top of the brand-new Java Bolt driver, reusing everything possible and trying to avoid useless cycles and mappings.
What really makes the difference between Neo4j and other NoSQL databases? ACID. At LARUS, we focused on it, developing transaction support to best fit common usage.
The driver will provide both the standard auto-commit mode and a manual commit mode. In manual commit mode, multiple statements can be executed in the same transaction. The new JDBC driver also supports new Neo4j 3.0 features, like procedures.
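As a rough illustration of manual commit mode, the sketch below groups several Cypher statements into one transaction using only the standard `java.sql` API (`setAutoCommit`, `commit`, `rollback`). The class and method names are hypothetical, and the sketch assumes the driver honours these standard calls as described above; it is not taken from the driver's own code or documentation.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

/** Hypothetical helper showing manual commit mode with plain JDBC calls. */
final class ManualCommitSketch {

    /**
     * Executes all statements in a single transaction.
     * Commits if every statement succeeds, rolls back if any of them fails.
     */
    static void runInOneTransaction(Connection con, List<String> cypherStatements)
            throws SQLException {
        boolean previousMode = con.getAutoCommit();
        con.setAutoCommit(false); // switch from auto-commit to manual commit mode
        try (Statement stmt = con.createStatement()) {
            for (String cypher : cypherStatements) {
                stmt.execute(cypher);
            }
            con.commit(); // make all changes visible at once
        } catch (SQLException e) {
            con.rollback(); // undo everything executed so far
            throw e;
        } finally {
            con.setAutoCommit(previousMode); // restore the caller's commit mode
        }
    }
}
```

With a connection obtained from `DriverManager`, a call such as `runInOneTransaction(con, Arrays.asList("CREATE (:User {name: 'A'})", "CREATE (:User {name: 'B'})"))` would create both nodes or neither.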
Using stable, working software should be standard when we develop our projects, and that includes the third-party dependencies that help us reach our goals. So we wanted a tested driver, with a suite of tests that makes us comfortable using it and makes it feel “solid”. Unit tests, integration tests and performance tests were designed to be fully automated.
At LARUS, we love Spring Data Neo4j (SDN), made by our friends at GraphAware. We use it in our Spring projects because it’s easy to use and provides a set of basic operations without our having to write a line of Java code.
But we know that we can’t always use Spring in our projects. And what if we would like to connect the new Neo4j 3.0, with its Bolt protocol, to Pentaho? Or create some Jasper reports, using Jasper Studio, with all of our data inside Neo4j?
The JDBC driver is the common integration tool for connecting our favourite graph database with all the common suites that use this standard to persist and retrieve data from any kind of database.
It’s easy to use, like every JDBC driver; here’s an example:
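import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Make sure the Neo4j driver is registered
// (assuming the Bolt module's driver class is org.neo4j.jdbc.bolt.BoltDriver)
Class.forName("org.neo4j.jdbc.bolt.BoltDriver");

// Connect; try-with-resources closes the connection even if the query fails
try (Connection con = DriverManager.getConnection("jdbc:bolt://localhost:7687")) {
    // Query
    try (Statement stmt = con.createStatement();
         ResultSet rs = stmt.executeQuery("MATCH (n:User) RETURN n.name")) {
        while (rs.next()) {
            System.out.println(rs.getString("n.name"));
        }
    }
}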
Source code is available here:
Conclusion
I will talk about these two projects in our lightning talk “Enterprise Data Integration with a new JDBC Driver for Neo4j 3.0” at 2:40 p.m. on 26 April 2016 at GraphConnect Europe.
Marco, Alberto, Riccardo and Mauro will also be at the lightning talk. They are the LARUS senior consultants who have been working on the development of these projects under the leadership of Neo Technology. Don’t miss the opportunity to meet and chat with them!
Editor’s Note: Larus Business Automation is a Bronze Sponsor of GraphConnect Europe. Click below to register for GraphConnect and meet Lorenzo and the rest of the Larus team in London on 26 April 2016.