Migrate a Causal Cluster
This chapter describes the necessary steps to migrate a Causal Cluster from Neo4j 3.5 directly to 4.x.
The migration of a Causal Cluster from Neo4j 3.5 to 4.x requires downtime. Therefore, it is recommended to perform a test migration in a production-like environment to estimate the duration of the downtime.
The prerequisites and the migration steps must be completed for each cluster member.
Prerequisites
Ensure that you have completed all tasks on the Migration checklist.
Prepare for the migration
The strategy for migrating a cluster deployment is to complete an offline copy from one cluster instance, and then use the copied store to seed the new cluster.
Remember, a migration is a single event. Do not perform independent migrations on each of your instances! There should be a single migration event, and the migrated store is the source of truth for all the other instances of the cluster. This is important because, when migrating, Neo4j generates random store IDs. If migrations are done independently, your cluster ends up with as many store IDs as it has instances, and Neo4j fails to start. For this reason, some of the cluster migration steps are performed on a single instance while others are performed on all instances. Each step tells you where to perform the necessary actions.
At this stage, you should elect one instance to work on. This is the instance where the migration actually happens. The next steps tell you whether to perform the step on the elected instance, on the remaining instances, or on all instances.
- On each cluster member
-
Verify that you have shut down all the cluster members (Cores and Read Replicas). You can check the neo4j.log.
-
Install the Neo4j version that you want to migrate to on each instance. For more information on how to install the distribution that you are using, see Operations Manual → Installation.
-
Replace the neo4j.conf file with the one that you have prepared for each instance in section Prepare a new neo4j.conf file to be used by the new installation.
-
Copy all the files used for encryption, such as private key, public certificate, and the contents of the trusted and revoked directories (located in <neo4j-home>/certificates/).
If your old installation has modified configurations starting with dbms.directories.* or the setting dbms.active_database, verify that the new neo4j.conf file is configured properly to find these directories.
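To spot the settings mentioned in the last step, the old configuration file can be scanned for keys that start with dbms.directories. or equal dbms.active_database. The following is a minimal sketch, not an official tool; the function name find_path_settings is hypothetical, and the parser assumes plain key=value lines with # comments.

```python
# Prefixes taken from the step above; everything else in this sketch is illustrative.
WATCHED_PREFIXES = ("dbms.directories.", "dbms.active_database")


def find_path_settings(conf_text: str) -> dict[str, str]:
    """Return the settings in a neo4j.conf text whose key matches a watched prefix."""
    found = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator and key.strip().startswith(WATCHED_PREFIXES):
            found[key.strip()] = value.strip()
    return found
```

Reading the old neo4j.conf into a string and passing it to find_path_settings gives a short list of entries to compare by hand against the new neo4j.conf.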
Migrate the data
- On the elected instance
-
Using the 4.x Neo4j Admin tool, migrate the data store of your 3.5 Neo4j. The neo4j-admin copy command also removes any inconsistent nodes, properties, and relationships and does not copy them to the newly created store.
-
From the <neo4j-home> folder, run the following command to copy the data store. You need to specify the old store location and the name for the target updated database:
bin/neo4j-admin copy --from-path=/path/to/3.5.x/graph.db --to-database=<db_name>
Starting to copy store, output will be saved to: $neo4j_home/logs/neo4j-admin-copy-2020-11-26.16.07.19.log
2020-11-26 16:07:19.939+0000 INFO [StoreCopy] ### Copy Data ###
2020-11-26 16:07:19.940+0000 INFO [StoreCopy] Source: /path/to/3.5.x/graph.db (page cache 8m)
2020-11-26 16:07:19.940+0000 INFO [StoreCopy] Target: $neo4j_home/data/databases/db_name (page cache 8m)
2020-11-26 16:07:19.940+0000 INFO [StoreCopy] Empty database created, will start importing readable data from the source.
2020-11-26 16:07:21.661+0000 INFO [o.n.i.b.ImportLogic] Import starting
Import starting 2020-11-26 16:07:21.699+0000
Estimated number of nodes: 50.00 k
Estimated number of node properties: 50.00 k
Estimated number of relationships: 0.00
Estimated number of relationship properties: 50.00 k
Estimated disk space usage: 2.680MiB
Estimated required memory usage: 8.598MiB
(1/4) Node import 2020-11-26 16:07:22.220+0000
Estimated number of nodes: 50.00 k
Estimated disk space usage: 1.698MiB
Estimated required memory usage: 8.598MiB
.......... .......... .......... .......... .......... 5% ∆239ms
.......... .......... .......... .......... .......... 10% ∆1ms
.......... .......... .......... .......... .......... 15% ∆1ms
.......... .......... .......... .......... .......... 20% ∆0ms
.......... .......... .......... .......... .......... 25% ∆1ms
.......... .......... .......... .......... .......... 30% ∆0ms
.......... .......... .......... .......... .......... 35% ∆0ms
.......... .......... .......... .......... .......... 40% ∆1ms
.......... .......... .......... .......... .......... 45% ∆0ms
.......... .......... .......... .......... .......... 50% ∆1ms
.......... .......... .......... .......... .......... 55% ∆0ms
.......... .......... .......... .......... .........- 60% ∆51ms
.......... .......... .......... .......... .......... 65% ∆0ms
.......... .......... .......... .......... .......... 70% ∆0ms
.......... .......... .......... .......... .......... 75% ∆1ms
.......... .......... .......... .......... .......... 80% ∆0ms
.......... .......... .......... .......... .......... 85% ∆0ms
.......... .......... .......... .......... .......... 90% ∆1ms
.......... .......... .......... .......... .......... 95% ∆0ms
.......... .......... .......... .......... .......... 100% ∆0ms
(2/4) Relationship import 2020-11-26 16:07:22.543+0000
Estimated number of relationships: 0.00
Estimated disk space usage: 1006KiB
Estimated required memory usage: 15.60MiB
(3/4) Relationship linking 2020-11-26 16:07:22.879+0000
Estimated required memory usage: 7.969MiB
(4/4) Post processing 2020-11-26 16:07:23.272+0000
Estimated required memory usage: 7.969MiB
-......... .......... .......... .......... .......... 5% ∆356ms
.......... .......... .......... .......... .......... 10% ∆0ms
.......... .......... .......... .......... .......... 15% ∆1ms
.......... .......... .......... .......... .......... 20% ∆0ms
.......... .......... .......... .......... .......... 25% ∆0ms
.......... .......... .......... .......... .......... 30% ∆1ms
.......... .......... .......... .......... .......... 35% ∆0ms
.......... .......... .......... .......... .......... 40% ∆0ms
.......... .......... .......... .......... .......... 45% ∆1ms
.......... .......... .......... .......... .......... 50% ∆0ms
.......... .......... .......... .......... .......... 55% ∆0ms
.......... .......... .......... .......... .......... 60% ∆0ms
.......... .......... .......... .......... .......... 65% ∆1ms
.......... .......... .......... .......... .......... 70% ∆0ms
.......... .......... .......... .......... .......... 75% ∆0ms
.......... .......... .......... .......... .......... 80% ∆0ms
.......... .......... .......... .......... .......... 85% ∆0ms
.......... .......... .......... .......... .......... 90% ∆0ms
.......... .......... .......... .......... .......... 95% ∆1ms
.......... .......... .......... .......... .......... 100% ∆0ms
IMPORT DONE in 2s 473ms.
Imported:
  1 nodes
  0 relationships
  1 properties
Peak memory usage: 15.60MiB
2020-11-26 16:07:24.140+0000 INFO [o.n.i.b.ImportLogic] Import completed successfully, took 2s 473ms. Imported:
  1 nodes
  0 relationships
  1 properties
2020-11-26 16:07:24.668+0000 INFO [StoreCopy] Import summary: Copying of 100704 records took 4 seconds (25176 rec/s). Unused Records 100703 (99%) Removed Records 0 (0%)
2020-11-26 16:07:24.669+0000 INFO [StoreCopy] ### Extracting schema ###
2020-11-26 16:07:24.669+0000 INFO [StoreCopy] Trying to extract schema...
2020-11-26 16:07:24.920+0000 INFO [StoreCopy] ... found 1 schema definitions. The following can be used to recreate the schema:
2020-11-26 16:07:24.922+0000 INFO [StoreCopy] CALL db.createIndex('index_5c0607ad', ['Person'], ['name'], 'native-btree-1.0', {`spatial.cartesian-3d.min`: [-1000000.0, -1000000.0, -1000000.0],`spatial.cartesian.min`: [-1000000.0, -1000000.0],`spatial.wgs-84.min`: [-180.0, -90.0],`spatial.cartesian-3d.max`: [1000000.0, 1000000.0, 1000000.0],`spatial.cartesian.max`: [1000000.0, 1000000.0],`spatial.wgs-84-3d.min`: [-180.0, -90.0, -1000000.0],`spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0],`spatial.wgs-84.max`: [180.0, 90.0]})
2020-11-26 16:07:24.923+0000 INFO [StoreCopy] You have to manually apply the above commands to the database when it is stared to recreate the indexes and constraints. The commands are saved to $neo4j_home/logs/neo4j-admin-copy-2020-11-26.16.07.19.log as well for reference.
When using the direct path, indexes are not automatically migrated so you have to recreate them. After running the store migration, the neo4j-admin copy command extracts the schema and generates a list of commands you can later use to recreate your schema on the new 4.x store. The recreate schema commands are also saved in the migration log file, located in the /logs directory.
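Because the schema commands are buried in the copy log, it can help to pull them out into a separate list before the cluster is started. The sketch below assumes the log keeps one entry per line and that each command follows the [StoreCopy] tag and begins with CALL db., as in the sample output; schema definitions emitted in any other form would need an extended pattern. The function name extract_schema_commands is hypothetical.

```python
import re

# Assumption: commands look like "<timestamp> INFO [StoreCopy] CALL db.createIndex(...)".
COMMAND_PATTERN = re.compile(r"\[StoreCopy\]\s+(CALL db\..*)$")


def extract_schema_commands(log_text: str) -> list[str]:
    """Collect the recreate-schema commands found in a neo4j-admin copy log."""
    commands = []
    for line in log_text.splitlines():
        match = COMMAND_PATTERN.search(line)
        if match:
            commands.append(match.group(1).strip())
    return commands
```

The returned strings can be saved to a file and reviewed before they are run against the migrated database.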
Prepare for seeding the cluster
- On the elected instance
-
Use neo4j-admin dump to make a dump of your newly migrated database and transactions:
bin/neo4j-admin dump --database=neo4j --to=$BACKUP_DESTINATION/neo4j.dump
Be aware that after you migrate, Neo4j Admin commands can differ slightly because Neo4j now supports multiple databases.
Do not yet start the server.
Seed the cluster
If you are migrating to a version of Neo4j prior to 4.3, or your migrated database is set as the default database in neo4j.conf, you should copy the migrated database from the elected instance to all other instances to seed the cluster.
This step is required so that all instances have the same copy of the database when the database is started.
If the migrated database is not the default database and the Neo4j version is 4.3+, this step is not required.
-
Copy the dump to the remaining instances.
-
Use neo4j-admin load --from=<archive-path> --database=<db_name> --force to replace each of your databases with the one migrated on the elected instance:
bin/neo4j-admin load --from=$BACKUP_DESTINATION/neo4j.dump --database=neo4j --force
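Since every instance must load an identical copy, one optional safeguard is to compare a checksum of the dump file on each instance with the checksum taken on the elected instance before running the load. This is a sketch of that idea using SHA-256 and is not part of the Neo4j tooling; the function names and the instance labels are hypothetical.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_instances(reference_digest: str, digests_by_instance: dict[str, str]) -> list[str]:
    """Return the labels of instances whose dump digest differs from the reference."""
    return [
        label
        for label, digest in digests_by_instance.items()
        if digest != reference_digest
    ]
```

An empty list from mismatched_instances suggests that the copies are byte-identical; any label it returns points to an instance where the dump should be copied again.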
Start the cluster
- On each cluster member, including the elected instance
Before continuing, make sure the following activities happened and were completed successfully:
-
If everything on the list was successful, you can go ahead and start all instances of the cluster.
bin/neo4j start
or
systemctl start neo4j
-
If the migrated database is the default database, it should have been started automatically on instance startup and this step is not required. If the migrated database is not the default database, it is still in the STOPPED state. You now need to start the database. On one of the cluster members, run the following command in Neo4j Browser or Cypher® Shell:
Neo4j 4.0/4.1/4.2
CREATE DATABASE <db_name>;
Neo4j 4.3+
CREATE DATABASE <db_name> OPTIONS { existingData : 'use', existingDataSeedInstance: '<seedInstanceId>'};
Where <seedInstanceId> is the ID of the elected instance, which can be found by calling CALL dbms.cluster.overview().
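The choice between the two statements depends only on the target version, so it can be captured in a small helper when the migration is scripted. The following sketch only builds the statement text shown above; the function names are hypothetical, it assumes a 4.x version string such as 4.2.11, and it does not quote database names that would need backticks.

```python
from typing import Optional


def major_minor(version: str) -> tuple[int, int]:
    """Split a version string such as '4.3.2' into (4, 3)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def create_database_statement(
    version: str, db_name: str, seed_instance_id: Optional[str] = None
) -> str:
    """Build the CREATE DATABASE statement for the given Neo4j 4.x version."""
    if major_minor(version) >= (4, 3):
        # 4.3+ needs the ID of the elected instance as the seed.
        if not seed_instance_id:
            raise ValueError("a seed instance ID is required for Neo4j 4.3+")
        return (
            f"CREATE DATABASE {db_name} OPTIONS "
            f"{{ existingData : 'use', existingDataSeedInstance: '{seed_instance_id}'}};"
        )
    # 4.0, 4.1, and 4.2 use the plain form.
    return f"CREATE DATABASE {db_name};"
```

The resulting string is meant to be pasted into Neo4j Browser or Cypher Shell on one cluster member, as described above.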
Recreate indexes
The final step is to recreate any indexes or constraints that were output by the neo4j-admin copy command.
On one of the cluster members, change the active database to the newly migrated one and run the following procedure:
CALL db.createIndex('index_5c0607ad', ['Person'], ['name'], 'native-btree-1.0', {`spatial.cartesian-3d.min`: [-1000000.0, -1000000.0, -1000000.0],`spatial.cartesian.min`: [-1000000.0, -1000000.0],`spatial.wgs-84.min`: [-180.0, -90.0],`spatial.cartesian-3d.max`: [1000000.0, 1000000.0, 1000000.0],`spatial.cartesian.max`: [1000000.0, 1000000.0],`spatial.wgs-84-3d.min`: [-180.0, -90.0, -1000000.0],`spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0],`spatial.wgs-84.max`: [180.0, 90.0]})
Post-migration
Recreate user data
Neo4j 3.5.x stores user and role information in flat files located under the $NEO4J_HOME/data/dbms directory.
Starting with Neo4j 4.0, this information is stored in the system database instead.
If you were using native users, you need to recreate them.
Go to the backed-up content of your old $NEO4J_HOME/data/dbms directory.
The authentication data is found in the auth file, which is a colon-separated file looking like this:
neo4j:SHA256,1066956C2D4E46C810CA39AE218AAD128854F2C08E9E831C379958CBFA6FF17D,899F9D67F296746766848D92B325B29EAFD9AC93940257713BA7CF4CF2B166FF:
The first column contains the username, the second column the password information.
Users can be recreated using the CREATE USER statement against the system database, for example:
CREATE USER neo4j SET ENCRYPTED PASSWORD '0,1066956C2D4E46C810CA39AE218AAD128854F2C08E9E831C379958CBFA6FF17D,899F9D67F296746766848D92B325B29EAFD9AC93940257713BA7CF4CF2B166FF' CHANGE NOT REQUIRED
Here, the string SHA256 from the auth file is replaced by the character 0 (zero).
The role data is found in the roles file, looking like this:
admin:neo4j
This can be recreated by running the following, again against the system database:
GRANT ROLE admin TO neo4j
You can use Neo4j to parse the auth and roles files.
This will process the files and generate all CREATE USER and GRANT ROLE commands required to recreate all users and roles.
To do this, you simply need to move both your backed-up auth and roles files to Neo4j’s /import directory.
After that you can use the following two queries, one for users and the other for roles:
LOAD CSV FROM 'file:///auth' as line
with split(line[0], ":")[0] as user, split(line[2], ":") as hash
with user, hash[0] as pwd, CASE hash[1] WHEN "" THEN "NOT" ELSE "" END as pwdChange
with "CREATE OR REPLACE USER "+user+" SET ENCRYPTED PASSWORD '0,"+pwd+"' CHANGE "+pwdChange+" REQUIRED" as cypher
return *
LOAD CSV FROM 'file:///roles' as line FIELDTERMINATOR ':'
WITH line[0] as role, split(line[1],",") as users
UNWIND users as user
with "GRANT ROLE "+role+" TO "+user as cypher
return *
Each of these queries returns a list of Cypher commands which, when executed against the system database, recreates all users and roles previously used in the Neo4j 3.5.x deployment.
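If the auth and roles files cannot be moved into the import directory, the same transformation can be sketched outside the database. The Python below mirrors the rules described above (username before the first colon, SHA256 replaced by 0, one GRANT ROLE per user); it assumes the trailing field of an auth line is empty when no password change is required, as in the sample line, and the function names are hypothetical.

```python
def user_statement(auth_line: str) -> str:
    """Turn one line of the 3.5 auth file into a CREATE OR REPLACE USER statement."""
    fields = auth_line.strip().split(":")
    user, credential = fields[0], fields[1]
    # Assumption: a non-empty third field means a password change is required.
    change_flag = fields[2] if len(fields) > 2 else ""
    _algorithm, password_hash, salt = credential.split(",")
    change = "CHANGE REQUIRED" if change_flag else "CHANGE NOT REQUIRED"
    return (
        f"CREATE OR REPLACE USER {user} "
        f"SET ENCRYPTED PASSWORD '0,{password_hash},{salt}' {change}"
    )


def role_statements(roles_line: str) -> list[str]:
    """Turn one line of the 3.5 roles file into one GRANT ROLE statement per user."""
    role, _, members = roles_line.strip().partition(":")
    return [f"GRANT ROLE {role} TO {user}" for user in members.split(",") if user]
```

As with the Cypher queries, the generated statements still have to be executed against the system database.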
Review the logs and metrics
It is advisable to review the logs and metrics to make sure everything looks good. If all went well, the logs should be free of errors and the metrics should be reported correctly.
Restart the server/cluster
It is advisable to restart the server/cluster one last time to clear any leftover state and make sure the latest configuration changes are applied.
Reactivate external applications connecting to Neo4j
After the restart and confirmation that everything was successfully migrated and is healthy, you can proceed to reactivate any applications that connect to Neo4j. At this point, the Neo4j store migration is complete, and you need to focus on the application side, making sure that all your requests are being served and your application is in a healthy state.