We are excited about a host of great new features, all ready to be used. Let’s get to it.
Highlights
What features have been included in this release?
- Cloud – Public beta on Heroku of the Neo4j Add-on
- Cypher – Supports older Cypher versions, better pattern matching, better performance, improved API
- Web admin – Full Neo4j Shell commands, including versioned Cypher syntax.
- Kernel – Improvements, for instance the ability to ensure that key-value pairs for entities are unique.
- Lucene upgrade – Now version 3.5.
Also, there have been many improvements behind-the-scenes:
- Infrastructure – Our library repositories have moved to Amazon, providing significantly faster download times.
- Quality – High availability now has better logging and operational support.
- Process – Better handling of breaking changes in our API and of deprecated features.
If you want more info on all of this – sure you do – please keep reading. Here is a rundown of the major new features in Neo4j 1.6.
Heroku Public Beta
The public beta of the Neo4j Add-on for Heroku is available. We’re taking a careful approach with our cloud services, evaluating the best supporting infrastructure and user experience in preparation for a general release in the coming months. Already, we’ve been pleased with the positive response.
Documentation on how to get started with the Heroku Neo4j Add-on can be found at the Heroku DevCenter. We’ll be posting additional guides for getting started on Heroku with Neo4j.
For pioneering adopters, we welcome you to join our Neo4j Heroku Challenge. You can win fabulous prizes while proudly blazing a path into the cloud for our community.
Latest on Cypher
Most of the work in Cypher for this release has been internal changes that are not immediately visible to an end user. The type system has been rebuilt and revamped, and a second, simpler pattern matcher has been added. The first change makes the Cypher code base faster to work with, and the second makes your queries faster.
End-user-facing changes include the ability to get all shortest paths, the COALESCE function, column aliasing, and the ability for variable-length relationships to introduce an iterable of the relationships.
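To make "all shortest paths" concrete, here is a minimal plain-Java sketch of the idea. It is not Cypher and not how Neo4j implements the feature; the class name, the `edge` helper, and the sample graph are all made up for illustration. It runs a breadth-first search over a small undirected graph, remembers every predecessor that reaches a node along a shortest route, and then walks those predecessors back to list every minimum-length path instead of only the first one found.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: plain Java, not Cypher and not Neo4j's implementation.
class AllShortestPathsSketch {

    // Adds an undirected edge to a simple adjacency-list graph (hypothetical helper).
    static void edge(Map<String, List<String>> graph, String a, String b) {
        graph.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        graph.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Returns every minimum-length path from start to end in an unweighted graph.
    static List<List<String>> allShortestPaths(Map<String, List<String>> graph,
                                               String start, String end) {
        Map<String, Integer> depth = new HashMap<>();
        Map<String, List<String>> parents = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depth.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            int d = depth.get(node);
            for (String next : graph.getOrDefault(node, Collections.emptyList())) {
                if (!depth.containsKey(next)) {
                    depth.put(next, d + 1);
                    queue.add(next);
                }
                // Keep every predecessor that reaches 'next' along a shortest route.
                if (depth.get(next) == d + 1) {
                    parents.computeIfAbsent(next, k -> new ArrayList<>()).add(node);
                }
            }
        }
        List<List<String>> paths = new ArrayList<>();
        if (depth.containsKey(end)) {
            collect(end, start, parents, new ArrayDeque<>(), paths);
        }
        return paths;
    }

    // Walks predecessor links back from 'node' to 'start', emitting each full path.
    private static void collect(String node, String start,
                                Map<String, List<String>> parents,
                                Deque<String> suffix, List<List<String>> out) {
        suffix.addFirst(node);
        if (node.equals(start)) {
            out.add(new ArrayList<>(suffix));
        } else {
            for (String parent : parents.getOrDefault(node, Collections.emptyList())) {
                collect(parent, start, parents, suffix, out);
            }
        }
        suffix.removeFirst();
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        edge(graph, "A", "B");
        edge(graph, "A", "C");
        edge(graph, "B", "D");
        edge(graph, "C", "D");
        // A longer detour that should not show up in the result.
        edge(graph, "A", "E");
        edge(graph, "E", "F");
        edge(graph, "F", "D");
        for (List<String> path : allShortestPaths(graph, "A", "D")) {
            System.out.println(path);
        }
    }
}
```

In the sample graph there are two equally short routes from A to D (through B and through C), so both are returned, while the longer detour through E and F is left out.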
Finally, there are two breaking changes – the syntax for the ALL/NONE/ANY/SINGLE predicates has changed, and the ExecutionResult is now a read-once, forward-only iterable.
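What "read-once, forward-only" means in practice can be shown with a small plain-Java sketch. The `ReadOnce` wrapper below is a hypothetical stand-in written for this example; it is not the real ExecutionResult class, and the row values are invented. It only illustrates the general behaviour of an iterable backed by a single cursor: a second loop over the same object finds nothing left to read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustration only: a stand-in for a read-once result, not Neo4j's ExecutionResult.
class ReadOnceSketch {

    // Hypothetical wrapper around a single underlying cursor.
    static final class ReadOnce<T> implements Iterable<T> {
        private final Iterator<T> source;

        ReadOnce(Iterator<T> source) {
            this.source = source;
        }

        // Every call returns the same cursor, so iteration never restarts.
        @Override
        public Iterator<T> iterator() {
            return source;
        }
    }

    // Copies whatever rows are still unread into a list that can be reused freely.
    static <T> List<T> drain(Iterable<T> rows) {
        List<T> out = new ArrayList<>();
        for (T row : rows) {
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        ReadOnce<String> result = new ReadOnce<>(Arrays.asList("row1", "row2").iterator());
        List<String> first = drain(result);   // consumes both rows
        List<String> second = drain(result);  // nothing left to read
        System.out.println("first pass:  " + first);
        System.out.println("second pass: " + second);
    }
}
```

If your code walks a result more than once, copying the rows into a list on the first pass, as `drain` does here, is one simple way to adapt.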
New on the web admin
I’m quite happy to announce that the web admin interface has initial support for Cypher calls directly in the data browser. It’s so sweet to be able to query your way around the node space! And the Cypher console now supports full Neo4j Shell commands.
Moreover, Gremlin has been updated to version 1.4, with major improvements and bug fixes.
Kernel changes
This release includes a popular feature request: the ability to ensure that key-value pairs for entities are unique!
If you look up entities (nodes or relationships) using an external key, you’ll want exactly one entity to correspond to each value of the key. For example, if you have nodes representing people, and you look these up using Social Security Number (SSN), you’ll want exactly one node for each SSN. This is easily achieved if you load all your data sequentially, because you can add a new node each time you meet a value of the key (a new SSN). However, up to now, it has been awkward to maintain this uniqueness when multiple processes are adding data simultaneously (via web requests for example).
Since this is a common use case, we’ve improved the API to make it easy to enforce entity uniqueness for a given key-value pair. At the index level, we’ve added a new method, putIfAbsent, which ensures that only one entity will be indexed for the key-value pair, even if lots of threads are using the same key-value pair at the same time. Alternatively, if you’d prefer to work with nodes or relationships rather than with the underlying indexes, there’s a higher-level API provided by UniqueFactory. This makes it easy to retrieve an entity using get-or-create semantics, i.e. it returns a matching entity if one exists, otherwise it creates one. Again, this mechanism is thread-safe: no matter how many threads call getOrCreate simultaneously, only one entity will be created for each key-value pair. This functionality is also exposed through the REST API, via a
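The get-or-create behaviour described above can be sketched in plain Java. This is only an analogy built on the JDK's `ConcurrentHashMap.putIfAbsent`, not the Neo4j index or UniqueFactory API; the `Person` class, the SSN value, and the class name are assumptions made for the example. The point it shows is the same one the text makes: when several threads ask for the same key at once, exactly one entity ends up registered and every caller gets that same entity back.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Illustration only: a JDK map standing in for a unique index, not the Neo4j API.
class UniqueLookupSketch {

    // Hypothetical stand-in for a node keyed by SSN.
    static final class Person {
        final long id;
        final String ssn;

        Person(long id, String ssn) {
            this.id = id;
            this.ssn = ssn;
        }
    }

    private final ConcurrentMap<String, Person> bySsn = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    // Get-or-create: return the entity already registered for the key, or register a new one.
    Person getOrCreate(String ssn) {
        Person existing = bySsn.get(ssn);
        if (existing != null) {
            return existing;
        }
        Person candidate = new Person(nextId.getAndIncrement(), ssn);
        // putIfAbsent is atomic: only the first caller's candidate is stored.
        // A losing candidate is simply discarded in this sketch.
        Person winner = bySsn.putIfAbsent(ssn, candidate);
        return winner != null ? winner : candidate;
    }

    int size() {
        return bySsn.size();
    }

    public static void main(String[] args) throws InterruptedException {
        UniqueLookupSketch index = new UniqueLookupSketch();
        int threads = 8;
        Person[] results = new Person[threads];
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int slot = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // line all threads up so they race for the same key
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                results[slot] = index.getOrCreate("123-45-6789");
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            worker.join();
        }
        boolean allSame = true;
        for (Person p : results) {
            allSame &= (p == results[0]);
        }
        System.out.println("same entity for every thread: " + allSame);
        System.out.println("entities stored: " + index.size());
    }
}
```

The naive alternative, checking for the key and then inserting in two separate steps, is what leaves room for two threads to both create an entity; making the check and the insert a single atomic operation closes that gap.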
Lucene upgrade
Neo4j uses Apache Lucene as the default implementation for its indexing features – this allows you to find “entry points” into the graph before starting graph-based queries. Lucene is an actively developed project in its own right, and is constantly being enhanced and improved. In this Neo4j release, we’re taking the opportunity to upgrade to a newer stable release of Apache Lucene, so that all users get the benefits of recent enhancements in Lucene. We’ve moved to Lucene 3.5; for details on all the changes, have a look at their changelog.
Breaking changes and deprecation
We’re introducing a new way to handle breaking changes. They will be flagged in the change logs as “BREAKING CHANGE.”
Where we do introduce a breaking change, we will continue to support the older functionality for two GA releases. This would typically mean six months’ notice, and will allow you to adopt new GA releases quickly while giving plenty of time to develop against the new API. This policy applies to published and stable APIs, including Cypher.
In the same vein, we now have a deprecated feature: Cypher execution is now part of the core REST API, so the Cypher plugin is deprecated.
This policy does not cover third-party add-ons (like Gremlin from Tinkerpop) which have their own release strategy.
Looking Forward
Community member Pablo Pareja Tobes organized a poll around feature requests, which really helps us prioritize our development focus. Thanks to everyone for making their voice heard!
Here are the results:
Filter relationships natively by their name (supernodes issue)
Sharding and horizontal scalability
Mandatory node types
Node insertion with checking of uniq external (get_or_create)
Sharding
The write-scaling complement to high-availability, sharding distributes a graph across multiple machines in a cluster. We (and many others) have researched the general graph sharding problem for years. This year, we’re embarking upon a pragmatic approach to sharding, providing the benefit without obsessing about academic perfection.
Supernodes
In Twitter-culture, you’d call these the “Ashton Kutcher” nodes, the nodes in a graph with an extreme number of connections. We’ve been working on a branch that has a promising approach for mitigating the performance challenge of traversing these supernodes.
Node types
In Neo4j, there is no schema, only structure. Relationships indicate the effective type of the connected Nodes, and Indexes imply membership in a set. Often, though, it would be helpful to know the designated type of a Node. So, we’re considering the appropriate way to introduce just enough schema. If you have any thoughts or desires to share, please chime in on the issue page.
Unique indexing
Indexes provide a quick look-up for sets of Nodes or Relationships. With unique indexes, Neo4j will guarantee that only one Node is mapped to a given key-value pair, providing support for domain-specific identifiers. This new feature is available now with 1.6 GA.
N-ary relationships
Neo4j’s property graph model restricts a relationship to connecting two nodes. In some domains, it is useful to consider relationships having multiple end-points. For now, we think this is best solved with domain-specific solutions.
Fixes and details
Of course, this release includes a slew of bug fixes. For details about all the fixes and additions, please read the various CHANGES.txt files included in the packaging.
Also, an impressive array of community-contributed development has been included in this release. Thank you all for the good ideas and pull requests – we really appreciate it!
Go for it
Your feedback is of great value and we would love for you to join our community mailing list.
Neo4j 1.6 is ready – download now and get involved!
Björn Granvik et al
Director of Engineering @ Neo Technology