A Big Step Forward: Spring Data Neo4j 5.0 Release


Learn all about the new Spring Data Neo4j 5.0 release and the Object Graph Mapping (OGM) 3.0 release. This post introduces what’s been happening in Spring Data Neo4j recently.

You will learn about the new features and the simplified programming model, and find out what has changed under the covers, such as smarter querying for better performance. You will also read about upcoming features. Don’t miss the opportunity to fill in our survey and bring your ideas to the roadmap!

For over half a year the SDN/OGM team has been working on new features in the Neo4j Object Graph Mapping library (OGM) and the Spring Data Neo4j (SDN) framework. We are happy to announce the release of OGM 3.0 and SDN 5.0.

We’re also welcoming Gerrit Meier to the Spring Data Neo4j team within Neo4j engineering, who besides developing and improving the libraries will also work closely with the Pivotal team and our users, customers and field engineers to provide an even better experience for you.

If you attended GraphConnect New York you might have seen the “Building Microservices with Spring Cloud & Spring Boot” training (with a section on Spring Data Neo4j 5.0) by Kenny Bastani.

The “Spring Data Neo4j 5.0” talk by Nicolas and Gerrit was repeated as a Neo4j Online Meetup.

For those Java developers who are not familiar with it, Neo4j-OGM is an object-graph mapping library for Neo4j, optimised for server-based installations utilising Cypher. It aims to simplify development with Neo4j, and similar to JPA or Hibernate, it uses annotations on simple POJO domain objects to provide mapping metadata.

Spring Data Neo4j is a Spring Data project for Neo4j. It uses Neo4j-OGM under the hood (very much like Spring Data JPA uses JPA) and provides functionality known from the Spring Data world, like repositories, derived finders or auditing.

New in Neo4j-OGM 3.0


There are many new features in both OGM and SDN. Note that all new OGM features are automatically available for SDN users as well, so even if you use SDN directly, or through Spring Boot, you should read this section.

Dynamic properties

Dynamic properties were one of the most requested features. They let you take advantage of the fact that Neo4j is a database with an optional schema: arbitrary properties on entities in the graph can be mapped to a Map field in your node or relationship entity (and vice versa).

To use this feature, simply annotate a Map field with the new @Properties annotation. The allowed keys in the map are either String or Enum; the values may be any type supported by the Cypher query language. The property keys are constructed by concatenating the field name (or the prefix, if one is set), the delimiter (“.” by default) and the key in the map.

@NodeEntity
public class User {

   @Properties(prefix = "custom")
   private Map<String, Object> userProperties;
  
}

A Map with a key of "note" and a value of "A note about user" would be mapped to a node property custom.note with the value "A note about user". See the documentation for more details.
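
To make the key construction concrete, here is a minimal sketch in plain Java. It is not the OGM's implementation; the class PropertyKeySketch and its flatten method are hypothetical names that only mimic the rule described above (prefix, then delimiter, then map key).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: PropertyKeySketch and flatten are hypothetical names,
// not part of Neo4j-OGM.
class PropertyKeySketch {

    // Builds "<prefix><delimiter><key>" for every entry of the map.
    static Map<String, Object> flatten(String prefix, String delimiter, Map<String, ?> properties) {
        Map<String, Object> nodeProperties = new LinkedHashMap<>();
        for (Map.Entry<String, ?> entry : properties.entrySet()) {
            nodeProperties.put(prefix + delimiter + entry.getKey(), entry.getValue());
        }
        return nodeProperties;
    }

    public static void main(String[] args) {
        Map<String, Object> userProperties = new LinkedHashMap<>();
        userProperties.put("note", "A note about user");
        // Prints the flattened view of the map, keyed by node property name.
        System.out.println(flatten("custom", ".", userProperties));
    }
}
```

With the prefix "custom" and the default delimiter, the single entry ends up under the key custom.note, matching the example above.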

Schema-Based Loading

One of the fundamental design decisions in OGM, aimed at reducing the number of queries, is that an entity and all its relationships up to a certain depth are loaded using a single query. This is very cheap in Neo4j thanks to its index-free adjacency (compared to relational databases, where each level would mean another JOIN). In previous versions, starting with SDN 4, an entity loaded up to a certain depth was queried using the following pattern:

...
MATCH p=(n)-[*0..1]-()
RETURN p

This works very well when the domain classes match your graph 1:1, but inefficiencies may arise in certain situations, resulting in loading more data than needed:
    • Direction of relationships is ignored
    • Relationships and nodes not in the class model are loaded anyway
    • Nodes at the beginning of the path are returned multiple times
That was especially problematic with higher depths and when a node had a large number of irrelevant (for our POJO) relationships that were fetched and then thrown away.

We have now switched to a new load strategy based on a schema derived automatically from class metadata. It uses nested pattern comprehensions generated from the schema. The final part of a load query now usually looks like this:

...
RETURN n,[ 
[ (n)-[r_f1:`FOUNDED`]->(o1:`Organisation`) | [ r_f1, o1 ] ], 
[ (n)-[r_e1:`EMPLOYED_BY`]->(o1:`Organisation`) | [ r_e1, o1 ] ]
]

Querying a Person node with two different relationships to Organisation.
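
As a rough illustration of how such a clause can be derived from class metadata, the following plain Java sketch builds one pattern comprehension per declared relationship. It is a simplification under stated assumptions: SchemaLoadSketch and Rel are hypothetical names, the variable names r0, m0 and so on are made up, and only depth 1 is handled, whereas the real generator nests over multiple levels.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: SchemaLoadSketch and Rel are hypothetical names, not OGM classes.
class SchemaLoadSketch {

    // One relationship as declared in the class model.
    static final class Rel {
        final String type;
        final String targetLabel;
        final boolean outgoing;

        Rel(String type, String targetLabel, boolean outgoing) {
            this.type = type;
            this.targetLabel = targetLabel;
            this.outgoing = outgoing;
        }
    }

    // Builds one pattern comprehension per declared relationship (depth 1 only),
    // respecting the relationship type, target label and direction.
    static String returnClause(List<Rel> rels) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < rels.size(); i++) {
            Rel rel = rels.get(i);
            String r = "r" + i;
            String m = "m" + i;
            String left = rel.outgoing ? "-" : "<-";
            String right = rel.outgoing ? "->" : "-";
            parts.add("[ (n)" + left + "[" + r + ":`" + rel.type + "`]" + right
                    + "(" + m + ":`" + rel.targetLabel + "`) | [ " + r + ", " + m + " ] ]");
        }
        return "RETURN n,[ " + String.join(", ", parts) + " ]";
    }

    public static void main(String[] args) {
        List<Rel> rels = new ArrayList<>();
        rels.add(new Rel("FOUNDED", "Organisation", true));
        rels.add(new Rel("EMPLOYED_BY", "Organisation", true));
        System.out.println(returnClause(rels));
    }
}
```

Because each comprehension names a type, a label and a direction, relationships and nodes outside the class model never appear in the result.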

This has several advantages:
    • Only data which will be mapped are fetched
    • It respects direction and type of relationship in the class metadata
    • Avoids fetching duplicates (it doesn’t completely eliminate them when there are cycles though)
    • It nests over multiple levels of fetching
While we have extensive test coverage for various entity models (inheritance hierarchies, generic entities, etc.), as a precaution against any issues there is a way to fall back to the old style of querying, either for a single session:

session.setLoadStrategy(LoadStrategy.PATH_LOAD_STRATEGY);

or globally:

sessionFactory.setLoadStrategy(LoadStrategy.PATH_LOAD_STRATEGY);

Also note that it is not possible to query for unlimited depth (depth = -1) with the new load strategy (which was a dangerous option in the first place).

This is a first step, which makes further enhancements possible – like loading only relationships you are interested in, loading only some, not all node/relationship properties – these will come in future versions.

New Id Management

In Spring Data Neo4j 4, a dedicated field for the entity id was mandatory (of type java.lang.Long, either named id, or annotated with the @GraphId annotation). If an application wanted to look up entities by another attribute, it had to add a primary index via @Index(primary = true, unique = true).

While an id is still required on all entities, the behavior has been simplified by introducing the new @Id annotation. It replaces both @GraphId and the primary attribute and can be placed on any attribute with a simple type.

The generation of this id attribute is also now customizable. Out of the box, two strategies are provided: UUID generation and ids generated by Neo4j. Applications can also provide a custom generation strategy if needed.

These ids will be used for looking up entities and for MERGE on the id field on save.
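
The idea of a pluggable generation strategy can be sketched in plain Java as follows. Everything here is a stand-in for illustration: IdStrategySketch is a hypothetical interface, not the OGM's own type, and the counter-based variant is only safe within a single JVM.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a pluggable id generation strategy.
interface IdStrategySketch {
    Object generateId(Object entity);
}

// Generates a random UUID string for every entity.
class UuidIdSketch implements IdStrategySketch {
    @Override
    public Object generateId(Object entity) {
        return UUID.randomUUID().toString();
    }
}

// A custom strategy: entity type name plus an in-memory counter.
// The counter is for illustration only; it is not unique across JVMs or restarts.
class PrefixedIdSketch implements IdStrategySketch {
    private final AtomicLong counter = new AtomicLong();

    @Override
    public Object generateId(Object entity) {
        return entity.getClass().getSimpleName() + "-" + counter.incrementAndGet();
    }
}
```

Whichever strategy is chosen, the generated value lives in an ordinary attribute of the entity, which is what makes it usable for lookups and for MERGE on save.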

Field Access Only

Entity fields are now read and written directly by the object mapper, not through annotated or derived setters.

Method-based access has been a major source of confusion in the past, and we hope this change makes the mapping easier to reason about. See the migration guide for more details.
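
The practical difference shows up when a setter contains logic of its own. The sketch below uses standard Java reflection to write a field directly; FieldAccessSketch, its nested User class and writeField are hypothetical names, and the code only illustrates the principle rather than the mapper's actual internals.

```java
import java.lang.reflect.Field;

// Illustrative only: shows direct field access bypassing a setter.
class FieldAccessSketch {

    static class User {
        private String name;

        // A setter with extra logic that direct field access does not trigger.
        public void setName(String name) {
            this.name = name.trim().toUpperCase();
        }

        public String getName() {
            return name;
        }
    }

    // Writes the value straight into the declared field, bypassing any setter.
    static void writeField(Object target, String fieldName, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot write field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        User user = new User();
        writeField(user, "name", " alice ");
        // The stored value is untouched by the setter's trim/upper-case logic.
        System.out.println("[" + user.getName() + "]");
    }
}
```

With field access, what is stored in the graph is exactly what ends up in the field, regardless of what the setters do.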

Configuration Changes and Improvements

A new builder configuration class was introduced as a single place for configuring the OGM via properties. New constructors for SessionFactory and all drivers were added to allow the creation of fully customised drivers. In addition, the driver class name is now derived automatically from the provided URI.

This provides two main benefits:
    • If you need to configure a setting not handled by OGM, you don’t need to wait for us to make it available
    • Custom configuration of embedded database for tests (with procedures or other settings, like enabling shell, etc.)
Example configuration using builder:

Configuration configuration = new Configuration.Builder()
        .uri("bolt://neo4j:password@localhost")
        .setConnectionPoolSize(150)
        .build();

Example configuration providing custom driver:

EmbeddedDriver embeddedDriver = new EmbeddedDriver(graphDatabaseService);
sessionFactory = new SessionFactory(embeddedDriver, packages);
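
Deriving the driver from the URI essentially means looking at the URI scheme. The following plain Java sketch shows the idea; the scheme-to-driver table is an assumption made for illustration (only EmbeddedDriver is named in this post), and DriverFromUriSketch is a hypothetical class.

```java
import java.net.URI;

// Illustrative only: the mapping below is an assumed table, not the OGM's code.
class DriverFromUriSketch {

    // Picks a driver name based on the scheme of the connection URI.
    static String driverFor(String uri) {
        String scheme = URI.create(uri).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("URI has no scheme: " + uri);
        }
        switch (scheme) {
            case "bolt":
                return "BoltDriver";
            case "http":
            case "https":
                return "HttpDriver";
            case "file":
                return "EmbeddedDriver";
            default:
                throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
    }

    public static void main(String[] args) {
        System.out.println(driverFor("bolt://neo4j:password@localhost"));
    }
}
```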

Internal Changes

Classpath scanning

Our own custom classpath scanning and bytecode parsing has been dropped in favour of FastClasspathScanner and standard reflection. Performance tests show roughly the same performance, and this should provide better compatibility with environments that have an unusual classloader hierarchy.

New in Spring Data Neo4j 5.0


Spring Framework 5.0 and Java 8

Spring Data Neo4j 5.0 is built on top of Java 8, the new Spring Framework 5.0 and Spring Data 2.0. Using these foundations allows us to benefit from the latest enhancements in the Spring world.

Better Causal Cluster Support

In Neo4j, bookmarks are used to achieve causal consistency. Bookmark management in Spring Data Neo4j automates handling of transaction-bookmarks for Neo4j clusters (read your own writes) within an application.

Once enabled through @EnableBookmarkManagement, it stores the bookmarks of all finished transactions in an instance of BookmarkManager (which needs to be provided by the user) and looks for methods annotated with @UseBookmark. This annotation is similar to @Transactional, and the two are intended to be used together.

When this annotation is present, it uses the current set of bookmarks to start a new transaction, achieving causal consistency for that transaction. So you can choose for which methods you need to see your own changes (e.g., user profile management) and for which it doesn’t matter (e.g., a listing or recommendation).

Example of bookmark management configuration:

@EnableBookmarkManagement
public class ExampleConfiguration {

	// Provide BookmarkManager bean
	@Bean
	public BookmarkManager bookmarkManager() {
		return new CaffeineBookmarkManager();
	}
}

Example of @UseBookmark on a service method that should use the bookmarks to be able to read a previously created user:

public class UserService {

	@UseBookmark
	@Transactional(readOnly=true)
	User findUser(String uuid) { … }
}
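
To give a feel for what a bookmark store has to do, here is a minimal in-memory sketch in plain Java. It is not the SDN BookmarkManager interface or the CaffeineBookmarkManager implementation; InMemoryBookmarksSketch and its method names are hypothetical, and the sketch assumes that a bookmark returned by a finished transaction supersedes the bookmarks that transaction was started with.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: a tiny in-memory bookmark store with hypothetical names.
class InMemoryBookmarksSketch {

    private final Set<String> bookmarks = new HashSet<>();

    // Called after a transaction finishes. Assumption: the new bookmark
    // supersedes the bookmarks the transaction was started with.
    synchronized void storeBookmark(String bookmark, Collection<String> previous) {
        bookmarks.removeAll(previous);
        bookmarks.add(bookmark);
    }

    // Returns a snapshot of the bookmarks a new transaction should start from.
    synchronized Set<String> getBookmarks() {
        return Collections.unmodifiableSet(new HashSet<>(bookmarks));
    }
}
```

A method that needs to read its own writes would start its transaction with the snapshot from getBookmarks(), while a method that can tolerate slightly stale data would simply skip it.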

Query methods

Query methods with derived queries (a.k.a. derived finders) are one of the flagship features of any Spring Data project. Support for derived finders has been improved in this release. See the relevant documentation for a comprehensive list of supported keywords. We are planning to enhance this even further, so if you miss a keyword and think others would find it useful as well, please file a feature request!

Projections

Sometimes you want to return a customized shape of your data, for performance or functional reasons. Spring Data 2.0 added new support for projections which extends our existing means to project result data. Spring Data Neo4j now provides an easy way to do that by defining a projection interface like this:

public interface UserWithNameOnly {  
	String getName(); 
}

And then using it as a regular return type in your UserRepository like this:

interface UserRepository extends Neo4jRepository<User, Long> {
	UserWithNameOnly findByEmail(String email);
}

This won’t change anything on the query side, but it will reshape the data returned by your repositories into customized representations. See the documentation for more details.
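
One way to picture this reshaping is a dynamic proxy that exposes only the getters declared on the projection interface while delegating to the fully loaded entity. The sketch below uses java.lang.reflect.Proxy for that purpose; ProjectionSketch, its nested User class and the project method are hypothetical names, and the code is an illustration of the idea rather than Spring Data's own implementation.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Illustrative only: a projection realised as a delegating proxy.
class ProjectionSketch {

    public interface UserWithNameOnly {
        String getName();
    }

    public static class User {
        private final String name;
        private final String email;

        public User(String name, String email) {
            this.name = name;
            this.email = email;
        }

        public String getName() {
            return name;
        }

        public String getEmail() {
            return email;
        }
    }

    // Wraps the entity in a proxy that implements only the projection interface
    // and forwards each call to the method of the same name on the entity.
    static <T> T project(Object source, Class<T> projectionType) {
        InvocationHandler handler = (proxy, method, args) ->
                source.getClass()
                        .getMethod(method.getName(), method.getParameterTypes())
                        .invoke(source, args);
        return projectionType.cast(Proxy.newProxyInstance(
                projectionType.getClassLoader(), new Class<?>[] {projectionType}, handler));
    }

    public static void main(String[] args) {
        User user = new User("Alice", "alice@example.com");
        UserWithNameOnly view = project(user, UserWithNameOnly.class);
        // Only getName() is reachable through the projection type.
        System.out.println(view.getName());
    }
}
```

The entity is still loaded in full; callers of the repository method simply cannot reach anything beyond what the projection interface declares.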

Migration


The migration from OGM 2.1 / SDN 4.2 to the new versions should be straightforward for most of you. There are detailed migration guides in the documentation.

The Future of Spring Data Neo4j


We’re working on some exciting things for the next release.

One is reactive repositories built upon the upcoming async Bolt Driver (v1.5). Auditing support has been added to the core of Spring Data Neo4j. There is still some testing to do but it should be ready for prime time in a few weeks.

Some improvements for modeling and querying will be added soon, like removing the need for a default constructor on entity classes, support for persistence constructors, or the ability to use named queries. We are also looking to add better Kotlin support.

Full Java 9 support is on the way. Even if SDN technically works on Java 9, we are working on making it fully compatible with the Jigsaw module system.

Feedback Survey


The team would love to hear your feedback on the new features and the current state of the two libraries. Please take a minute to fill in this short survey if you’re using SDN or OGM or plan to use them.

Thank You!


We want to thank the amazing team from our partner GraphAware for doing all the work around Neo4j-OGM and Spring Data Neo4j. Frantisek Hartman and Nicolas Mervaillie in particular did an outstanding job. Immediately after joining the team, Gerrit Meier started to work on issues, examples and documentation and became the communication link to the Pivotal team.

Thanks a lot to Oliver Gierke – the Spring Data project lead from Pivotal – for his feedback, support and understanding. And of course many thanks to all our users for using the libraries and especially for reporting feedback, issues and suggestions.