2.5. Docker

This article covers running Neo4j in a Docker container.

Docker does not run natively on OS X or Windows. To run Docker on either platform, consult the Docker documentation.

2.5.1. Overview

By default the Docker image exposes three ports for remote access:

  • 7474 for HTTP.
  • 7473 for HTTPS.
  • 7687 for Bolt.

It also exposes two volumes:

  • /data to allow the database to be persisted outside its container.
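  • /logs to allow access to Neo4j log files.

The following command starts a container that publishes the HTTP and Bolt ports and mounts both volumes:

docker run \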
    --publish=7474:7474 --publish=7687:7687 \
    --volume=$HOME/neo4j/data:/data \
    --volume=$HOME/neo4j/logs:/logs \
    neo4j:3.1

Point your browser at http://localhost:7474 on Linux or http://$(docker-machine ip default):7474 on OS X.

All the volumes in this documentation are stored under $HOME in order to work on OS X where $HOME is automatically mounted into the machine VM. On Linux the volumes can be stored anywhere.

By default, Neo4j requires authentication: at the first connection you must log in with neo4j/neo4j and set a new password. You can set the password for the Docker container directly by specifying --env NEO4J_AUTH=neo4j/<password> in your run directive. Alternatively, you can disable authentication by specifying --env NEO4J_AUTH=none instead.
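
As an illustration, the two NEO4J_AUTH forms described above can be produced by a small shell helper. This is only a sketch: neo4j_auth_flag is a hypothetical function name, not something shipped with the image, and it treats the literal word none as a request to disable authentication.

```shell
# Hypothetical helper: print the --env flag for NEO4J_AUTH.
#   $1: the password for the neo4j user, or the literal word "none"
neo4j_auth_flag() {
    if [ -z "${1:-}" ]; then
        echo "usage: neo4j_auth_flag <password>|none" >&2
        return 1
    fi
    case "$1" in
        none) printf '%s\n' '--env=NEO4J_AUTH=none' ;;
        *)    printf '%s\n' "--env=NEO4J_AUTH=neo4j/$1" ;;
    esac
}

# Example (not executed here): quote the substitution so that the
# flag is passed to docker as a single argument.
#   docker run --publish=7474:7474 --publish=7687:7687 \
#       "$(neo4j_auth_flag my-password)" neo4j:3.1
```

Refusing an empty argument is a deliberate choice in this sketch, so that authentication is never disabled by accident.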

2.5.2. Neo4j editions

Tags are available for both Neo4j Community and Enterprise editions. Version-specific Enterprise Edition tags have an -enterprise suffix, for example neo4j:3.1.0-enterprise. Community Edition tags have no suffix, for example neo4j:3.1.0. The latest Neo4j Enterprise Edition release is available as neo4j:enterprise.
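
The tag naming rule above can be sketched as a shell function. The name neo4j_image_tag and the edition keywords community and enterprise are assumptions made for this example; only the resulting tag format comes from the text.

```shell
# Hypothetical helper: build a version-specific image tag.
#   $1: Neo4j version, for example 3.1.0
#   $2: edition, "community" (default) or "enterprise"
neo4j_image_tag() {
    case "${2:-community}" in
        enterprise) printf 'neo4j:%s-enterprise\n' "$1" ;;  # Enterprise tags carry a suffix
        community)  printf 'neo4j:%s\n' "$1" ;;             # Community tags have none
        *)          echo "unknown edition: $2" >&2; return 1 ;;
    esac
}

neo4j_image_tag 3.1.0 enterprise   # prints neo4j:3.1.0-enterprise
neo4j_image_tag 3.1.0              # prints neo4j:3.1.0
```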

2.5.3. Docker configuration

2.5.3.1. File descriptor limit

Neo4j may use a large number of file descriptors if many indexes are in use or there is a large number of simultaneous database connections.

Docker controls the number of open file descriptors in a container; the limit depends on the configuration of your system. We recommend a limit of at least 40000 for running Neo4j.

To check the limit on your system, run this command:

docker run neo4j:3.1 \
    bash -c 'echo Soft limit: $(ulimit -Sn); echo Hard limit: $(ulimit -Hn)'
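
The values printed by the command above can be compared against the recommended minimum with a few lines of shell. This is a sketch: nofile_limit_ok is a hypothetical helper, and it assumes the limit is either a number or the word unlimited, as ulimit prints it.

```shell
# Hypothetical helper: succeed if a file descriptor limit is high enough.
#   $1: a limit as printed by `ulimit -Sn` or `ulimit -Hn`
#   $2: required minimum (defaults to the recommended 40000)
nofile_limit_ok() {
    limit=$1
    minimum=${2:-40000}
    if [ "$limit" = "unlimited" ]; then
        return 0
    fi
    [ "$limit" -ge "$minimum" ]
}

# Example: check the soft limit of the current shell.
if nofile_limit_ok "$(ulimit -Sn)"; then
    echo "soft limit is sufficient"
else
    echo "soft limit is below the recommended value"
fi
```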

To override the default configuration for a single container, use the --ulimit option like this:

docker run \
    --detach \
    --publish=7474:7474 --publish=7687:7687 \
    --volume=$HOME/neo4j/data:/data \
    --volume=$HOME/neo4j/logs:/logs \
    --ulimit=nofile=40000:40000 \
    neo4j:3.1

2.5.4. Neo4j configuration

The default configuration provided by this image is intended for learning about Neo4j and must be modified before production use. In particular, the memory assigned to Neo4j is very limited (see NEO4J_dbms_memory_pagecache_size and NEO4J_dbms_memory_heap_maxSize below) so that multiple containers can run on the same server. You can read more about configuring Neo4j in Section A.1, “Configuration settings”.

There are three ways to modify the configuration:

  • Set environment variables.
  • Mount a /conf volume.
  • Build a new image.

Which one to choose depends on how much you need to customize the image.

2.5.4.1. Environment variables

Pass environment variables to the container when you run it.

docker run \
    --detach \
    --publish=7474:7474 --publish=7687:7687 \
    --volume=$HOME/neo4j/data:/data \
    --volume=$HOME/neo4j/logs:/logs \
    --env=NEO4J_dbms_memory_pagecache_size=4G \
    neo4j:3.1

Any configuration value (see Section A.1, “Configuration settings”) can be passed using the following naming scheme:

  • Prefix with NEO4J_.
  • Underscores must be written twice: _ is written as __.
  • Periods are converted to underscores: . is written as _.

As an example, dbms.tx_log.rotation.size could be set by specifying the following argument to docker:

--env=NEO4J_dbms_tx__log_rotation_size=<size>
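
The naming scheme can be expressed as a short shell function. This is a sketch of the three rules as stated above, not the image's own conversion code; neo4j_env_name is a hypothetical name.

```shell
# Hypothetical helper: convert a configuration key to its environment
# variable name.
#   $1: a configuration key such as dbms.tx_log.rotation.size
neo4j_env_name() {
    # Underscores are doubled first. If periods were converted first, the
    # underscores they produce would be doubled as well.
    printf 'NEO4J_%s\n' "$(printf '%s\n' "$1" | sed -e 's/_/__/g' -e 's/\./_/g')"
}

neo4j_env_name dbms.tx_log.rotation.size   # prints NEO4J_dbms_tx__log_rotation_size
```

The order of the two substitutions matters, which is why both are done in a single sed call with the underscore rule first.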

The following environment variables are also available and will be converted to the general form above:

  • NEO4J_AUTH: controls authentication. Set to none to disable authentication, or to neo4j/<password> to override the default password (see Chapter 7, Security for details).
  • NEO4J_dbms_memory_pagecache_size: the size of Neo4j’s native-memory cache; defaults to 512M.
  • NEO4J_dbms_memory_heap_maxSize: the size of Neo4j’s heap; defaults to 512M.
  • NEO4J_dbms_txLog_rotation_retentionPolicy: the retention policy for logical logs; defaults to 100M size.
  • NEO4J_dbms_allowFormatMigration: set to true to enable upgrades; defaults to false (see Section 5.2, “Single-instance upgrade” for details).

Neo4j Enterprise Edition

The following settings control features that are only available in the Enterprise Edition of Neo4j.

  • NEO4J_dbms_mode: the database mode; defaults to SINGLE. Set to CORE or READ_REPLICA for Causal Clustering, or to HA for Highly Available clusters.

Causal Cluster settings
  • NEO4J_causalClustering_expectedCoreClusterSize: the initial cluster size (number of Core instances) at startup.
  • NEO4J_causalClustering_initialDiscoveryMembers: the network addresses of an initial set of Core cluster members.
  • NEO4J_causalClustering_discoveryAdvertisedAddress: hostname/ip address and port to advertise for member discovery management communication.
  • NEO4J_causalClustering_transactionAdvertisedAddress: hostname/ip address and port to advertise for transaction handling.
  • NEO4J_causalClustering_raftAdvertisedAddress: hostname/ip address and port to advertise for cluster communication.

See below for examples of how to configure Causal Clustering.

Highly Available cluster settings
  • NEO4J_ha_serverId: the id of the server; must be unique within a cluster.
  • NEO4J_ha_host_coordination: the address (including port) used for cluster coordination in HA mode; must be resolvable by all cluster members.
  • NEO4J_ha_host_data: the address (including port) used for data transfer in HA mode; must be resolvable by all cluster members.
  • NEO4J_ha_initialHosts: comma-separated list of other members of the cluster.

See below for an example of how to configure HA clusters.

2.5.4.2. /conf volume

To make arbitrary modifications to the Neo4j configuration, provide the container with a /conf volume.

docker run \
    --detach \
    --publish=7474:7474 --publish=7687:7687 \
    --volume=$HOME/neo4j/data:/data \
    --volume=$HOME/neo4j/logs:/logs \
    --volume=$HOME/neo4j/conf:/conf \
    neo4j:3.1

Any configuration files in the /conf volume will override files provided by the image. This includes values that may have been set in response to environment variables passed to the container by Docker. So if you want to change one value in a file you must ensure that the rest of the file is complete and correct.

If you use a configuration volume you must make sure to listen on all network interfaces. This can be done by setting dbms.connectors.default_listen_address=0.0.0.0.
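
Before starting a container with a configuration volume, it can be useful to confirm that the setting is present. The check below is a sketch: listens_on_all_interfaces is a hypothetical helper, and the file path in the usage comment ($HOME/neo4j/conf/neo4j.conf) is an assumption about where the configuration file lives on the host.

```shell
# Hypothetical helper: succeed if the file contains the exact,
# uncommented line that makes Neo4j listen on all interfaces.
#   $1: path to a Neo4j configuration file
listens_on_all_interfaces() {
    # -x matches whole lines only, -F treats the pattern as a fixed string
    grep -qxF 'dbms.connectors.default_listen_address=0.0.0.0' "$1"
}

# Example (the path is an assumption):
#   listens_on_all_interfaces "$HOME/neo4j/conf/neo4j.conf" \
#       || echo "default_listen_address is not set to 0.0.0.0"
```

Because the match is on the whole line, a commented-out setting or one with trailing whitespace is reported as missing.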

To dump an initial set of configuration files, run the image with the dump-config command.

docker run --rm \
    --volume=$HOME/neo4j/conf:/conf \
    neo4j:3.1 dump-config

2.5.4.3. Build a new image

For more complex customization of the image you can create a new image based on this one.

FROM neo4j:3.1

If you need to make your own configuration changes, we provide a hook so you can do that in a script:

COPY extra_conf.sh /extra_conf.sh

Then pass the EXTENSION_SCRIPT environment variable at runtime to source the script (cafe12345678 stands for the ID of the image you built):

docker run -e "EXTENSION_SCRIPT=/extra_conf.sh" cafe12345678

When the extension script is sourced, the current working directory will be the root of the Neo4j installation.

2.5.5. Neo4j Causal Cluster mode

This feature is available in Neo4j Enterprise Edition.

To run Neo4j in Causal Cluster mode under Docker, you need to wire up the containers in the cluster so that they can talk to each other. Each container must have a network route to each of the others, and the NEO4J_causalClustering_expectedCoreClusterSize and NEO4J_causalClustering_initialDiscoveryMembers environment variables must be set for Core instances. Read replicas only need to define NEO4J_causalClustering_initialDiscoveryMembers.

Within a single Docker host, this can be achieved as follows.

docker network create --driver=bridge cluster

docker run --name=core1 --detach --network=cluster \
         --publish=7474:7474 --publish=7687:7687 \
         --env=NEO4J_dbms_mode=CORE \
         --env=NEO4J_causalClustering_expectedCoreClusterSize=3 \
         --env=NEO4J_causalClustering_initialDiscoveryMembers=core1:5000,core2:5000,core3:5000 \
         neo4j:3.1-enterprise

docker run --name=core2 --detach --network=cluster \
         --env=NEO4J_dbms_mode=CORE \
         --env=NEO4J_causalClustering_expectedCoreClusterSize=3 \
         --env=NEO4J_causalClustering_initialDiscoveryMembers=core1:5000,core2:5000,core3:5000 \
         neo4j:3.1-enterprise

docker run --name=core3 --detach --network=cluster \
         --env=NEO4J_dbms_mode=CORE \
         --env=NEO4J_causalClustering_expectedCoreClusterSize=3 \
         --env=NEO4J_causalClustering_initialDiscoveryMembers=core1:5000,core2:5000,core3:5000 \
         neo4j:3.1-enterprise
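
The three commands above repeat the same member list. As a sketch, the list can be generated rather than typed by hand; core_discovery_members is a hypothetical helper, and it assumes the containers are named core1, core2, and so on, as in the commands above.

```shell
# Hypothetical helper: build the value for
# NEO4J_causalClustering_initialDiscoveryMembers.
#   $1: number of Core instances
#   $2: discovery port (defaults to 5000, as used above)
core_discovery_members() {
    count=$1
    port=${2:-5000}
    members=''
    i=1
    while [ "$i" -le "$count" ]; do
        # Prepend a comma only when the list is not empty
        members="${members:+$members,}core$i:$port"
        i=$((i + 1))
    done
    printf '%s\n' "$members"
}

core_discovery_members 3   # prints core1:5000,core2:5000,core3:5000
```

The result can then be passed as --env=NEO4J_causalClustering_initialDiscoveryMembers="$(core_discovery_members 3)" in each docker run command.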

Additional instances can be added to the cluster in an ad-hoc fashion. A read replica can for example be added with:

docker run --name=read_replica1 --detach --network=cluster \
         --env=NEO4J_dbms_mode=READ_REPLICA \
         --env=NEO4J_causalClustering_initialDiscoveryMembers=core1:5000,core2:5000,core3:5000 \
         neo4j:3.1-enterprise

When each container runs on its own physical machine and a Docker network is not used, the advertised addresses must be defined to enable communication between the machines. Each instance would then be invoked similarly to:

docker run --name=neo4j-core --detach \
         --publish=7474:7474 --publish=7687:7687 \
         --publish=5000:5000 --publish=6000:6000 --publish=7000:7000 \
         --env=NEO4J_dbms_mode=CORE \
         --env=NEO4J_causalClustering_expectedCoreClusterSize=3 \
         --env=NEO4J_causalClustering_initialDiscoveryMembers=<core1-public-address>:5000,<core2-public-address>:5000,<core3-public-address>:5000 \
         --env=NEO4J_causalClustering_discoveryAdvertisedAddress=<public-address>:5000 \
         --env=NEO4J_causalClustering_transactionAdvertisedAddress=<public-address>:6000 \
         --env=NEO4J_causalClustering_raftAdvertisedAddress=<public-address>:7000 \
         neo4j:3.1-enterprise

Where <public-address> is the public hostname or ip-address of the machine.
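
Since the three advertised addresses differ only in their port, they can be derived from the public address. The function below is a sketch: advertised_address_flags is a hypothetical helper, and the ports 5000, 6000 and 7000 are the ones used in the command above.

```shell
# Hypothetical helper: print the three advertised-address flags,
# one per line.
#   $1: public hostname or IP address of this machine
advertised_address_flags() {
    addr=$1
    # printf repeats the format once for each remaining argument
    printf '%s\n' \
        "--env=NEO4J_causalClustering_discoveryAdvertisedAddress=$addr:5000" \
        "--env=NEO4J_causalClustering_transactionAdvertisedAddress=$addr:6000" \
        "--env=NEO4J_causalClustering_raftAdvertisedAddress=$addr:7000"
}

# Example with a documentation-only address:
advertised_address_flags 192.0.2.10
```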

See Section 4.1.3, “Create a new Causal Cluster” for more details of Neo4j Causal Clustering.

2.5.6. Neo4j Highly Available mode

This feature is available in Neo4j Enterprise Edition.

In order to run Neo4j in HA mode under Docker you need to wire up the containers in the cluster so that they can talk to each other. Each container must have a network route to each of the others and the NEO4J_ha_host_coordination, NEO4J_ha_host_data and NEO4J_ha_initialHosts environment variables must be set accordingly (see above).

Within a single Docker host, this can be achieved as follows.

docker network create --driver=bridge cluster

docker run --name=instance1 --detach --publish=7474:7474 --publish=7687:7687 --net=cluster --hostname=instance1 \
    --volume=$HOME/neo4j/logs1:/logs \
    --env=NEO4J_dbms_mode=HA --env=NEO4J_ha_serverId=1 \
    --env=NEO4J_ha_host_coordination=instance1:5001 --env=NEO4J_ha_host_data=instance1:6001 \
    --env=NEO4J_ha_initialHosts=instance1:5001,instance2:5001,instance3:5001 \
    neo4j:3.1-enterprise

docker run --name=instance2 --detach --publish 7475:7474 --publish=7688:7687 --net=cluster --hostname=instance2 \
    --volume=$HOME/neo4j/logs2:/logs \
    --env=NEO4J_dbms_mode=HA --env=NEO4J_ha_serverId=2 \
    --env=NEO4J_ha_host_coordination=instance2:5001 --env=NEO4J_ha_host_data=instance2:6001 \
    --env=NEO4J_ha_initialHosts=instance1:5001,instance2:5001,instance3:5001 \
    neo4j:3.1-enterprise

docker run --name=instance3 --detach --publish 7476:7474 --publish=7689:7687 --net=cluster --hostname=instance3 \
    --volume=$HOME/neo4j/logs3:/logs \
    --env=NEO4J_dbms_mode=HA --env=NEO4J_ha_serverId=3 \
    --env=NEO4J_ha_host_coordination=instance3:5001 --env=NEO4J_ha_host_data=instance3:6001 \
    --env=NEO4J_ha_initialHosts=instance1:5001,instance2:5001,instance3:5001 \
    neo4j:3.1-enterprise
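
The per-instance settings in the commands above follow a regular pattern, which can be sketched as a function that prints the HA flags for one instance. ha_env_flags is a hypothetical helper; it assumes hosts named instance1, instance2, and so on, with coordination on port 5001 and data transfer on port 6001, as in the commands above.

```shell
# Hypothetical helper: print the HA-related --env flags for one instance.
#   $1: this instance's server id (1, 2, 3, ...)
#   $2: total number of instances in the cluster
ha_env_flags() {
    id=$1
    total=$2
    hosts=''
    i=1
    while [ "$i" -le "$total" ]; do
        hosts="${hosts:+$hosts,}instance$i:5001"
        i=$((i + 1))
    done
    printf '%s\n' \
        "--env=NEO4J_dbms_mode=HA" \
        "--env=NEO4J_ha_serverId=$id" \
        "--env=NEO4J_ha_host_coordination=instance$id:5001" \
        "--env=NEO4J_ha_host_data=instance$id:6001" \
        "--env=NEO4J_ha_initialHosts=$hosts"
}

# Print the flags for the second of three instances.
ha_env_flags 2 3
```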

See Section B.2, “Set up a Highly Available cluster” for more details of Neo4j Highly Available mode.

2.5.7. User-defined procedures

To install user-defined procedures, provide a /plugins volume containing the jars.

docker run --publish 7474:7474 --publish=7687:7687 --volume=$HOME/neo4j/plugins:/plugins neo4j:3.1

See Developer Manual → Procedures for more details on procedures.

2.5.8. Cypher shell

Cypher shell can be run locally within a container using a command like this:

docker exec --interactive --tty <container> bin/cypher-shell

2.5.9. Encryption

The Docker image can expose Neo4j’s native TLS support. To use your own key and certificate, provide an /ssl volume with the key and certificate inside. The files must be called neo4j.key and neo4j.cert. You must also publish port 7473 to access the HTTPS endpoint.

docker run --publish 7473:7473 --publish=7687:7687 --volume $HOME/neo4j/ssl:/ssl neo4j:3.1
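
Because the file names are fixed, a quick check of the host directory before starting the container can catch a misnamed file. This is a sketch: ssl_dir_ready is a hypothetical helper, and it only verifies that both files exist, not that they form a valid key and certificate pair.

```shell
# Hypothetical helper: succeed if a directory holds both files that
# are expected in the /ssl volume.
#   $1: host directory that will be mounted at /ssl
ssl_dir_ready() {
    [ -f "$1/neo4j.key" ] && [ -f "$1/neo4j.cert" ]
}

# Example, using the host directory from the command above:
if ssl_dir_ready "$HOME/neo4j/ssl"; then
    echo "key and certificate found"
else
    echo "neo4j.key or neo4j.cert is missing"
fi
```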