Considerations and planning
This chapter describes what to consider when planning your Neo4j DBMS upgrade or migration.
Planning an upgrade or migration is crucial to ensure that everything goes smoothly, and you are covered in case of failure. This section aims to highlight some important items to keep an eye on.
General considerations
Duration
It is difficult to predict how long an upgrade or a migration will take, due to the number of moving parts involved, such as (but not restricted to):
- The size of your data store.
- Do you need to migrate the store format?
- Do indexes need to be migrated?
- How big are your indexes?
- How fast is your hardware?
Because of this, it is recommended to prepare an environment where you can test the upgrade or migration process end to end. This gives you more accurate timings than any estimate, as well as an idea of what to expect from the process. As a rule of thumb, the data migration part of an upgrade or a migration process (not the entire process) takes around 50 to 55 minutes for a 600 GB store.
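To turn the rule of thumb into a first planning number, the following minimal sketch scales the 50 to 55 minute figure with the store size. Linear scaling is an assumption made here for illustration, not something the rule of thumb guarantees, and the function name is hypothetical. A test run in a staging environment remains the only reliable source of timings.

```python
# Rule of thumb from the text: roughly 50 to 55 minutes of data migration for a 600 GB store.
REFERENCE_STORE_GB = 600
REFERENCE_MINUTES = (50, 55)


def estimate_migration_minutes(store_gb):
    """Return a (low, high) estimate in minutes.

    ASSUMPTION: duration scales linearly with store size. Index size,
    store format changes, and hardware speed are not taken into account.
    """
    if store_gb < 0:
        raise ValueError("store size must not be negative")
    factor = store_gb / REFERENCE_STORE_GB
    low, high = REFERENCE_MINUTES
    return (low * factor, high * factor)


low, high = estimate_migration_minutes(1200)
print(f"Estimated data migration time: {low:.0f} to {high:.0f} minutes")
```

Treat the result as a starting point for sizing the test run, not as the downtime window itself.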
Configuration changes
In PATCH and MINOR upgrades, you do not need to change the configuration settings.
If you leave the neo4j.conf file unchanged, everything should work as in the old version.
However, some versions may include new features, so it is always good to review the changes to the Configuration settings and Procedures and update the settings if needed.
New MAJOR versions introduce changes to configuration settings, such as new configuration names and deprecated or removed features.
It is advised to check the changes to the configuration settings to understand all changes required.
At this stage, a good practice is to go through your current neo4j.conf and capture any non-default configurations you may have. You need to add these configurations to the neo4j.conf file of your Neo4j 4.x installation.
Stripping all comments and empty lines from the neo4j.conf file makes the remaining settings easier to review.
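One way to strip comments and empty lines is sketched below. It assumes that comment lines in neo4j.conf start with `#`; the function name and the file path are hypothetical. Note that it lists every uncommented setting, which is not necessarily the same as every non-default setting, so the output still needs a manual review.

```python
from pathlib import Path


def active_settings(conf_text):
    """Return the non-empty, non-comment lines of a neo4j.conf-style text, sorted."""
    lines = (line.strip() for line in conf_text.splitlines())
    return sorted(line for line in lines if line and not line.startswith("#"))


if __name__ == "__main__":
    # Hypothetical path; adjust it to where your neo4j.conf lives.
    conf_path = Path("conf/neo4j.conf")
    if conf_path.exists():
        for setting in active_settings(conf_path.read_text()):
            print(setting)
```

Sorting the output makes it easier to compare the settings of two installations side by side.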
Logs
All log formats/layouts have changed in Neo4j 4.x. Some changes are minor, others more significant, so take your time to familiarize yourself with the new formats and the information they contain. If you are consuming these logs and generating alarms based on them, it is worth checking the impact of the log format changes.
For more information about the available logs and how to configure them, including log rotation and retention, see the Operations Manual → Logging of the version you are upgrading to.
Metrics
There are three important things to note on the metrics front:
- The metrics that are enabled by default have been changed in version 4.2. Any specific metrics that you want to be enabled must be specified in the configuration.
- From 4.0 onwards, the rules for naming metrics have changed to include the database name prefix.
- Neo4j 4.0 also introduces namespaces for metrics, which are disabled by default.
If you are migrating from 3.5 to 4.x and you have set up a monitoring dashboard, either built in-house or from a third-party vendor, you should understand how these changes will impact the dashboards after the migration. The Operations Manual contains two useful sections on this topic.
Embedded deployments
If you have an embedded deployment (that is, you include Neo4j in your project), note that from version 4.2 onwards, Neo4j supports running a Causal Cluster in embedded mode. However, if you are coming from Neo4j HA embedded, a small number of changes are required. For more information, see Java Reference → Using Neo4j embedded in Java applications and Operations Manual → Embedded usage.
Migration considerations
Downtime
When migrating to a newer Neo4j MAJOR version, only offline migrations are supported.
Therefore, you need to plan for some downtime.
Because each case is unique, it is recommended to run one or several test migrations to understand how long the process will take end to end.
That information will allow you to better plan for the downtime window you will need.
Disk space considerations
A migration requires substantial free disk space, as it makes an entire copy of the database and creates temporary files. It is also recommended to take a backup of your production store in case something goes wrong. For these reasons, it is good to reserve two times the size of the database directory (three times if you take a backup beforehand) for migration purposes.
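The sizing guidance above can be checked before the migration starts. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes that the copy, the temporary files, and the optional backup all land on the volume that holds `target_dir`.

```python
import os
import shutil


def directory_size_bytes(path):
    """Sum the sizes of all regular files below `path`, skipping symbolic links."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def required_free_bytes(db_size_bytes, with_backup=True):
    """Two times the database size, or three times when a backup is taken as well."""
    return db_size_bytes * (3 if with_backup else 2)


def has_enough_space(db_dir, target_dir, with_backup=True):
    """Compare the free space on the volume holding `target_dir` with the requirement.

    Returns a tuple (enough, needed_bytes, free_bytes).
    """
    needed = required_free_bytes(directory_size_bytes(db_dir), with_backup)
    free = shutil.disk_usage(target_dir).free
    return free >= needed, needed, free
```

Calling `has_enough_space` with your database directory and the directory where the migration will write its files returns whether the reserve is met, together with both byte counts for logging.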
Java version
Neo4j 4.x runs on Java 11. The following table shows the compatible Java Virtual Machine (JVM) to run a Neo4j instance.
Neo4j version | Java version |
---|---|
4.x | Java SE 11 |
3.x | Java SE 8 |
If you have other Java applications running on the machine hosting Neo4j, make sure those applications are compatible with the Java version your Neo4j is running on. Alternatively, configure the machine to run multiple JDKs side by side.
Database naming rules
With the introduction of multiple databases, the rules for naming a database have changed. For example, it is no longer possible to use an underscore in a database name. For a full list of naming rules, see Operations Manual → Administrative commands.
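Existing names can be checked ahead of the migration with a small script. In the sketch below, only the underscore restriction comes from the text above. The remaining rules (3 to 63 characters, a leading ASCII letter, then letters, digits, dots, or dashes, and a reserved `system` prefix) are assumptions about the 4.x naming rules, and the function name is hypothetical. Verify them against the Operations Manual of the version you are migrating to.

```python
import re

# ASSUMED pattern: a leading ASCII letter followed by 2 to 62 letters, digits, dots, or dashes.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9.-]{2,62}")


def database_name_problems(name):
    """Return a list of reasons why `name` may be rejected; an empty list means no problem was found."""
    problems = []
    normalized = name.lower()  # ASSUMPTION: names are case-insensitive
    if "_" in normalized:
        # The only rule stated in the text: underscores are no longer allowed.
        problems.append("contains an underscore")
    if not 3 <= len(normalized) <= 63:
        problems.append("length is not between 3 and 63 characters")
    if normalized.startswith("system"):
        problems.append("starts with the reserved prefix 'system'")
    if not problems and not _NAME_PATTERN.fullmatch(normalized):
        problems.append("must start with a letter and contain only letters, digits, '.', or '-'")
    return problems


for candidate in ["my_graph", "sales-2020"]:
    print(candidate, database_name_problems(candidate))
```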
Application code
Depending on how your application interacts with Neo4j, you should be prepared to review your application code, because Neo4j 4.x has changes that may impact your application.
All breaking changes that may affect your application and Neo4j can be found in Breaking changes between Neo4j 3.5 and Neo4j 4.x.
Drivers
Neo4j’s official drivers have some significant and breaking changes you need to be aware of. The Breaking changes between Neo4j 1.7 drivers and Neo4j 4.x drivers section contains all information and lists all the breaking changes.
The following are some key changes that may cause confusion:
- Starting with Neo4j 4.0, the versioning scheme for the database, driver, and protocol are all aligned. For supported drivers, this means that the version number goes from 1.7 to 4.0. This is merely a cosmetic change and version 4.0 of the drivers is in fact only one release ahead of 1.7.
  The version was bumped from 1.7 to 4.0 for each official Neo4j driver. There are no driver versions 1.8, 2.x, or 3.x.
- The driver's default configuration for encrypted is now false. A 4.x driver only attempts plain text connections by default.
- When encryption is explicitly enabled, connections to servers holding self-signed certificates fail on certificate verification by default. In Neo4j 4.x, the default trust mode is to trust the CAs that are trusted by the operating system.
- v1 is removed from the drivers' package names.
- The neo4j:// scheme replaces bolt+routing:// and can be used for both clustered and single-instance configurations.
- With 4.0 servers, session instances should now be acquired against a specific database.
- Bookmark has changed from a string, and a list of strings, to a Bookmark object.
- Several language-specific driver changes.
The list above does not reflect the entirety of changes in the drivers, so it is imperative to read through the Breaking changes between Neo4j 1.7 drivers and Neo4j 4.x drivers section. Failing to do so may result in unwanted problems when your application tries to connect to Neo4j after the migration. As a tip, you can also check the Neo4j Drivers documentation (for all officially supported languages), which guides you through the drivers' features and configuration, including a deeper dive on Authentication, Asynchronous sessions, and Reactive sessions (client-side back-pressure).
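Of the changes listed above, the connection URI scheme is one that often lives in configuration files rather than in code. The following minimal sketch rewrites a `bolt+routing://` URI to the `neo4j://` scheme and leaves every other URI untouched. The function name and the host name are hypothetical, and the sketch does not touch encryption or trust settings, which need a separate review.

```python
def to_neo4j_scheme(uri):
    """Rewrite a bolt+routing:// URI to neo4j://; return any other URI unchanged."""
    old_prefix = "bolt+routing://"
    if uri.startswith(old_prefix):
        return "neo4j://" + uri[len(old_prefix):]
    return uri


# Hypothetical host name, used for illustration only.
print(to_neo4j_scheme("bolt+routing://graph.example.com:7687"))
```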
Plugins (including custom plugins)
Take note of the plugins you are using and make sure they are compatible with Neo4j 4.x.
The following are the most commonly used plugins/plugin types:
- One of the most used plugins is APOC, a procedure library that extends the Neo4j functionality. If you are using APOC, check the Version Compatibility Matrix and make sure you plan to upgrade APOC alongside your Neo4j migration.
  To quickly check your APOC version, you can run:
  RETURN apoc.version();
- If you are using Neo4j Bloom or Graph Data Science Library (GDSL), you can find the most recent versions for these products in the Neo4j Deployment Center. Be aware that some previous versions of Bloom and GDSL were not initially compatible with Neo4j 4.0. This has been fully rectified, and all of the product suite is compatible with the 4.x series. Therefore, it is recommended to simply upgrade to the latest versions of these products.
- If you have developed any custom plugins, you should review them as you would your application code. The changes in Neo4j 4.x can impact the behaviour of these custom plugins, so it is highly advised to plan some time for this as well.
Other 3rd-party software and tools
Be mindful of any other 3rd-party software and tools you are using alongside Neo4j. You may rely on operational scripts to install, manage, back up, or monitor your Neo4j deployment. You might also have set alarms and built complete monitoring dashboards. You need to revise these, as there have been changes in metrics (as mentioned in section Metrics) and the Neo4j operational tools now account for multiple databases. Therefore, it is recommended to review all scripts, tools, and 3rd-party software and make sure they are compatible with Neo4j 4.x.