Write safer user-defined functions & procedures with the Neo4j Procedure Compiler

Senior Software Engineer at Neo4j

March 2, 2017


Check your user-defined functions & user-defined procedures with the Neo4j Procedure Compiler

[As community content, this post reflects the views and opinions of the particular author and does not necessarily reflect the official stance of Neo4j.]

As you may know, Neo4j 3.0 introduced a concept familiar to database users: user-defined procedures. Neo4j 3.1 followed with another familiar addition: user-defined functions. Neo4j 3.2 will even allow custom aggregate functions.

This means that you can now extend Cypher directly! With these new mechanisms built into Neo4j, one can easily interact with other data stores, enrich import/export functionality and implement shiny graph algorithms. One striking example is the community repository called APOC, a goldmine of procedure and function examples.

Not only are these additions very useful and most welcome, but the process of writing and publishing them is also very smooth. Indeed, when you deploy a new procedure or function, you get detailed error feedback if it is invalid (in its annotations, parameter or return types, or injection points).

The first time I played with Neo4j procedures was around May or June of 2016. While I was quite impressed by how simple they were to use, I instantly noticed a possible improvement (which also applies to functions). What if most of the common errors could be caught before deploying the new code? I ended up creating a small project for that: the Neo4j Procedure Compiler.

How the Neo4j procedure compiler helps you write safer code

Thanks to a collaboration with Tobias from the Neo4j team, the Procedure Compiler even made it into Neo4j 3.1.0 (and later versions)! This is the nice part of Neo4j being open source: anyone in the community can contribute and make a difference!

Let’s now see how it works and how we can use it to write safer code.

Neo4j is written in Java and Scala and therefore requires procedures and functions to be written in a language supported by the Java Virtual Machine (JVM). The most idiomatic way is to write them in Java and rely on the @Procedure/@UserFunction annotations. This is a prerequisite of the Neo4j Procedure Compiler: the project relies on javac annotation processing to detect and analyze the procedure and function methods under compilation.
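To make the annotation-processing mechanism concrete, here is a minimal sketch of a javac annotation processor that reports a compilation error for any @Procedure method parameter lacking @Name. It is not the Procedure Compiler's actual source: the class name MissingNameProcessor is hypothetical, and the annotations are matched by their fully qualified names (org.neo4j.procedure.Procedure and org.neo4j.procedure.Name) so that the sketch compiles without Neo4j on the classpath.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.AnnotationMirror;
import javax.lang.model.element.Element;
import javax.lang.model.element.ElementKind;
import javax.lang.model.element.ExecutableElement;
import javax.lang.model.element.TypeElement;
import javax.lang.model.element.VariableElement;
import javax.tools.Diagnostic;

// Hypothetical processor for illustration only; it checks a single rule.
@SupportedAnnotationTypes("org.neo4j.procedure.Procedure")
public class MissingNameProcessor extends AbstractProcessor {

    private static final String NAME_ANNOTATION = "org.neo4j.procedure.Name";

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (TypeElement annotation : annotations) {
            for (Element element : roundEnv.getElementsAnnotatedWith(annotation)) {
                if (element.getKind() != ElementKind.METHOD) {
                    continue;
                }
                ExecutableElement method = (ExecutableElement) element;
                for (VariableElement parameter : method.getParameters()) {
                    if (!hasNameAnnotation(parameter)) {
                        // Reported against the parameter, so javac points at the offending source line.
                        processingEnv.getMessager().printMessage(
                                Diagnostic.Kind.ERROR,
                                "Parameter '" + parameter.getSimpleName() + "' must be annotated with @Name",
                                parameter);
                    }
                }
            }
        }
        return false; // do not claim the annotation; other processors may still inspect it
    }

    // Compares annotation types by qualified name to avoid a compile-time dependency on Neo4j.
    private static boolean hasNameAnnotation(VariableElement parameter) {
        for (AnnotationMirror mirror : parameter.getAnnotationMirrors()) {
            TypeElement type = (TypeElement) mirror.getAnnotationType().asElement();
            if (type.getQualifiedName().contentEquals(NAME_ANNOTATION)) {
                return true;
            }
        }
        return false;
    }
}
```

javac discovers processors on the classpath through a META-INF/services/javax.annotation.processing.Processor file that names the processor class, which is why adding a jar can be enough to enable one.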

Once the compilation of your project starts and the Java compiler enables the Neo4j Procedure Compiler, the latter will start analyzing a representation of your source code and detect procedures and functions. The Procedure Compiler will then check for some common rule violations:

Each procedure or function parameter must be annotated with @Name.
The return type of a procedure method must be java.util.stream.Stream<T>.
Fields annotated with @Context must be public and non-final.
All other fields in the class must be static.
If the type of @Context fields is not GraphDatabaseService or Log, a warning is emitted by default.
The class enclosing a @Procedure/@UserFunction method must have a public no-arg constructor.
A @UserFunction method cannot belong to a class in the root (empty) package.
Map parameters (and record fields) must have String as key type¹.
The types used for parameters and record fields must be supported. (See procedure and function documentation.)
@PerformsWrites cannot be used in conjunction with @Procedure#mode.
Procedure and function names must be unique².
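To illustrate five of these rules (@Name on parameters, the Stream return type, public non-final @Context fields, static other fields, and the public no-arg constructor), here is a sketch that contrasts a compliant class with a violating one. It rests on loud assumptions: the annotations are local stand-ins for the real org.neo4j.procedure ones so that the snippet runs without Neo4j, FakeLog and Greeting are hypothetical types, and the checks use runtime reflection, whereas the Procedure Compiler performs its analysis at compile time.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

public class ProcedureRulesSketch {

    // Local stand-ins for org.neo4j.procedure.Procedure, Name and Context (assumption:
    // real code imports the Neo4j annotations instead of declaring its own).
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    public @interface Procedure {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.PARAMETER)
    public @interface Name { String value(); }

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    public @interface Context {}

    // Hypothetical stand-in for an injectable type such as Log.
    public static class FakeLog {}

    // Hypothetical record type returned by the procedures below.
    public static class Greeting {
        public final String text;
        public Greeting(String text) { this.text = text; }
    }

    // Follows the rules: public non-final @Context field, static other field,
    // @Name on the parameter, Stream return type, implicit public no-arg constructor.
    public static class GoodProcedures {
        @Context public FakeLog log;
        private static final String PREFIX = "Hello, ";

        @Procedure
        public Stream<Greeting> greet(@Name("who") String who) {
            return Stream.of(new Greeting(PREFIX + who));
        }
    }

    // Breaks four rules; each offending line is marked.
    public static class BadProcedures {
        @Context private final FakeLog log = new FakeLog(); // not public, and final
        private int calls;                                   // neither static nor @Context

        @Procedure
        public List<Greeting> greet(String who) {            // not a Stream; parameter lacks @Name
            return Collections.singletonList(new Greeting("Hello, " + who));
        }
    }

    // Returns one message per rule violation found in the given class.
    public static List<String> violations(Class<?> type) {
        List<String> found = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) continue; // skip compiler- or tool-generated fields
            int mods = field.getModifiers();
            if (field.isAnnotationPresent(Context.class)) {
                if (!Modifier.isPublic(mods) || Modifier.isFinal(mods)) {
                    found.add("@Context field '" + field.getName() + "' must be public and non-final");
                }
            } else if (!Modifier.isStatic(mods)) {
                found.add("Field '" + field.getName() + "' must be static");
            }
        }
        for (Method method : type.getDeclaredMethods()) {
            if (!method.isAnnotationPresent(Procedure.class)) continue;
            if (method.getReturnType() != Stream.class) {
                found.add("Procedure '" + method.getName() + "' must return java.util.stream.Stream");
            }
            for (Parameter parameter : method.getParameters()) {
                if (!parameter.isAnnotationPresent(Name.class)) {
                    found.add("A parameter of '" + method.getName() + "' is missing @Name");
                }
            }
        }
        try {
            type.getConstructor(); // getConstructor only finds public constructors
        } catch (NoSuchMethodException e) {
            found.add("Class '" + type.getSimpleName() + "' needs a public no-arg constructor");
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println("GoodProcedures: " + violations(GoodProcedures.class));
        violations(BadProcedures.class).forEach(System.out::println);
    }
}
```

The remaining rules (supported types, Map key types, name uniqueness and so on) are left out to keep the sketch short.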

As you can see, many rules surround procedures and functions, so it is easy to make mistakes. The key takeaway is that these kinds of common mistakes are detected at compilation time, well before the deployment to a Neo4j instance, so you can get the earliest possible feedback.

Getting started with the Neo4j procedure compiler

Now that you are convinced to use the Procedure Compiler, how do you get started?

Enabling the Neo4j Procedure Compiler is quite straightforward: you just need to add it to your classpath, and it will be detected and used by the Java compiler!

With Maven:

<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>procedure-compiler</artifactId>
    <version>${neo4j.version}</version>
    <scope>compile</scope>
    <optional>true</optional>
</dependency>

With Gradle:

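compileOnly(group: 'org.neo4j', name: 'procedure-compiler', version: neo4jVersion) {
    // neo4jVersion is assumed to be a property defined elsewhere in your build (e.g. in gradle.properties)
    transitive = false
}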

If you are a Gradle 3.4 user, there is now a better way to include annotation processors, as described here.

Let’s see it in action!

Next time you write a user-defined procedure or function, start using the Neo4j Procedure Compiler! While it cannot catch every possible error, it will definitely save you valuable time. It is already used in the aforementioned APOC library, where it has saved the contributors a lot of time and improved the feedback cycle.

¹ The nice part about compile-time annotation processing is that generic types are not erased yet, so stricter checks can be performed! The Procedure Compiler forbids raw types, whereas the Neo4j runtime allows them (as long as the runtime type is compatible).

² This is only a partial check. Functions and procedures deployed to a single Neo4j instance may come from different projects, and the Procedure Compiler can only detect name collisions within a single project.