Neo4j 2.2 Milestone Release


Better, Faster, and More Scalable Neo4j than ever before

Neo4j 2.2 aims to be our fastest and most scalable release ever. With Neo4j 2.2 our engineering team introduces massive enhancements to the internal architecture resulting in higher performance and scalability. This first milestone (or beta release) pulls all of these new elements together, so that you can “dial it up to 11” with your applications. You can download it here for your testing. Three of the key areas being tackled in this release are:

1. Highly Concurrent Performance

With Neo4j 2.2, we introduce a brand new page cache designed to deliver extreme performance and scalability under highly concurrent workloads. This new page cache helps overcome the limitations imposed by the current IO systems, so that Neo4j can support larger applications with hundreds of concurrent read and/or write IO requests. The new cache is auto-configured to match the available memory, so you no longer need to tune memory-mapped IO settings.

2. Transactional & Batch Write Performance

We have made several enhancements in Neo4j 2.2 to improve both transactional and batch write performance by orders of magnitude under highly concurrent load. Several things are changing to make this happen.
        • First, the 2.2 release improves coordination of commits between Lucene, the graph, and the transaction log, resulting in a much more efficient write channel.
        • Next, the database kernel is enhanced to optimize the flushing of transactions to disk for a high number of concurrent write threads. This allows throughput to improve significantly as write threads are added, since IO costs are spread across transactions. Applications with many small transactions being piped through large numbers (10-100+) of concurrent write threads will experience the greatest improvement.
        • Finally, we have improved and fully integrated the “Superfast Batch Loader”. Introduced in Neo4j 2.1, this utility now supports large scale non-transactional initial loads (of 10M to 10B+ elements) with sustained throughputs around 1M records (node or relationship or property) per second. This seriously fast utility is (unsurprisingly) called neo4j-import, and is accessible from the command line.
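
The second point above, spreading IO costs across transactions, can be illustrated with a toy model. This is not Neo4j's kernel code. It assumes a simplified scheme (often called group commit) in which all transactions that are ready at the same moment share a single disk flush. The function name, the notion of a "tick", and the workload numbers are hypothetical and exist only for illustration.

```python
def flushes_needed(arrivals, group_commit):
    """Count disk flushes for a toy workload.

    arrivals[i] is the number of transactions that become ready to commit
    during tick i (for example, from that many concurrent writer threads).
    """
    flushes = 0
    for ready in arrivals:
        if ready == 0:
            continue  # nothing to write, so no flush this tick
        # Grouped: one flush covers every transaction ready in this tick.
        # Ungrouped: each transaction pays for its own flush.
        flushes += 1 if group_commit else ready
    return flushes


# Hypothetical workload: five ticks with bursts of concurrent commits.
arrivals = [1, 40, 0, 100, 25]
per_transaction = flushes_needed(arrivals, group_commit=False)
grouped = flushes_needed(arrivals, group_commit=True)
print(per_transaction, grouped)
```

In this model, the benefit grows with the number of transactions ready at once, which is consistent with the claim that many small transactions on many write threads gain the most.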

3. Cypher Performance

We’re very excited to be releasing the first version of a new Cost-Based Optimizer for Cypher, under development for nearly a year. While Cypher is hands-down the most convenient way to formulate queries, it hasn’t always been as fast as we’d like. Starting with Neo4j 2.2, Cypher will determine the optimal query plan by using statistics about your particular data set. Both the cost-based query planner and the ability of the database to gather statistics are new, and we’re very interested in your feedback. Sample queries & data sets are welcome!

Despite the strong focus on performance & scalability, we delivered some functional improvements too:

Cypher Profiling

As part of the work on the Cypher planner, we extended the profiling output in the neo4j-shell. You can now choose to only EXPLAIN or fully PROFILE a query by just prefixing it with one of those keywords. You can also manually select a query planner with the CYPHER 2.2-cost and CYPHER 2.2-rule prefixes.
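
As a minimal sketch of how these prefixes are applied, the Python helpers below just build the query text you would paste into the neo4j-shell. The helper names and the sample query (with its Person label and KNOWS relationship) are hypothetical. Only the EXPLAIN, PROFILE, CYPHER 2.2-cost and CYPHER 2.2-rule prefixes come from the text above, and the helpers apply one prefix at a time because how prefixes combine is not covered here.

```python
# Prefixes named in the text above; everything else here is illustrative.
PROFILING_KEYWORDS = ("EXPLAIN", "PROFILE")
PLANNER_PREFIXES = {"cost": "CYPHER 2.2-cost", "rule": "CYPHER 2.2-rule"}


def with_profiling(query, keyword):
    """Prefix a query with EXPLAIN or PROFILE."""
    if keyword not in PROFILING_KEYWORDS:
        raise ValueError("keyword must be EXPLAIN or PROFILE")
    return f"{keyword} {query.strip()}"


def with_planner(query, planner):
    """Prefix a query with the planner selection for 'cost' or 'rule'."""
    if planner not in PLANNER_PREFIXES:
        raise ValueError("planner must be 'cost' or 'rule'")
    return f"{PLANNER_PREFIXES[planner]} {query.strip()}"


# Hypothetical sample query against an assumed data model.
query = "MATCH (p:Person)-[:KNOWS]->(f) RETURN f.name"
explained = with_profiling(query, "EXPLAIN")
rule_planned = with_planner(query, "rule")
print(explained)
print(rule_planned)
```

Running the same query once with each planner prefix is one way to compare the two planners on your own data set.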

Neo4j Browser UI

Many small improvements have been made to the UI, including panning, and the ability to kill a running Cypher query. (Query killing is also supported in the Neo4j Shell using CTRL-C.) Please explore these and tell us what you think of them. The Neo4j Browser also handles long-running queries more reliably, which is especially important for imports with LOAD CSV.

Basic Authentication

We’ve received requests for a variety of authentication features, and while these are largely planned for the next (Neo4j 2.3) release, we are very pleased to introduce token-based authentication in Neo4j 2.2. This is enabled by default in Neo4j 2.2 M, which means you must either (a) explicitly disable it in conf/neo4j-server.properties (if you have ensured security by other means), or (b) change your app to use the authentication token. This default setting in the milestone release is by design, to ensure we get your indispensable feedback!

Some things to be aware of

As with all milestones: do not run this in production. This is an early access version to help you prepare for the upcoming production release and provide feedback on the new features. Make sure you refer to the manual for information about new features (especially token-based authentication, neo4j-import, and Cypher optimization and statistics gathering). We are eager to hear your feedback. Please post it to the Neo4j Google Group, or send us a direct email at feedback@neotechnology.com. Enjoy, and please tell us what you discover!

Philip Rathle, VP Product, on behalf of the Neo4j Team