External Exposure of Neo4j Clusters

This chapter describes how to route traffic from the outside world or Internet to a Neo4j cluster on version 4.3.0 or above running in Kubernetes.

If you are running a version of Neo4j earlier than 4.3.0 these instructions do not apply. Look at external exposure instructions for client routing

Overview

By default when you install Neo4j using Neo4j Labs Helm chart none of the services that are created are accessible from outside the Kubernetes cluster. To access Neo4j from outside the Kubernetes cluster we will need an externally valid DNS name or IP address that clients can connect to. That external address needs to direct queries to a Kubernetes LoadBalancer service that is balanced over Neo4j cores.

A Service with type: LoadBalancer is the only suitable option for Neo4j. Ingress cannot be used because Neo4j’s driver protocol communicates at the TCP level and most Ingress only support HTTP communication. Node Port services cannot be associated with a static IP address which makes setting up DNS and SSL very difficult. Using NodePort can additionally create configuration challenges for some Neo4j applications that expect to use port 7687 specifically for communication with Neo4j. Because of that we recommend only using LoadBalancer services.

Create Static IP addresses for inbound cluster traffic

I’m using GCP, so it is done like this. Important notes here, on GCP the region must match your GKE region, and the network tier must be premium. On other clouds, the conceptual step here is the same, but the details will differ: you only need to allocate 1 static IP address for your entire Neo4j cluster.

# Customize these next 2 for the region of your GKE cluster,
# and your GCP project ID
REGION=us-central1
PROJECT=my-gcp-project-id

gcloud compute addresses create \
  neo4j-static-ip --project=$PROJECT \
  --network-tier=PREMIUM --region=$REGION

echo "IP:"
gcloud compute addresses describe neo4j-static-ip \
  --region=$REGION --project=$PROJECT --format=json | jq -r '.address'

If you are doing this with Azure please note that the static IP address must be in the same resource group as your Kubernetes cluster, and can be created with az network public-ip create like this (just one single sample): az network public-ip create -g resource_group_name -n core01 --sku standard --dns-name neo4jcluster --allocation-method Static. The Azure SKU used must be standard, and the resource group you need can be found in the Kubernetes Load Balancer that [following the Azure Tutorial](https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough) sets up for you.

For the remainder of this tutorial, let’s assume that the core IP address I’ve allocated is as follows:

export IP=35.202.123.82

Installing the Helm Chart

From the root of this repo, navigate to stable/neo4j and issue this command to install the helm chart with a deployment name of "graph".

export DEPLOYMENT=graph
helm install $DEPLOYMENT . \
  --set core.numberOfServers=3 \
  --set acceptLicenseAgreement=yes \
  --set neo4jPassword=mySecretPassword

External Exposure

After a few minutes you’ll have a fully-formed cluster whose pods show ready, and which you can connect to from inside Kubernetes, but it will not yet be accessible from outside the Kubernetes cluster. So what we need to do next is to create a load balancer and set the loadBalancerIP to be the static IP address we reserved in the earlier step.

A load-balancer.yaml file has been provided as a template, here’s how to make one for given static IP address:

export DEPLOYMENT=graph

# Reuse IP from the earlier step here.
# These *must be IP addresses* and not hostnames, because we're
# assigning load balancer IP addresses to bind to.

echo $DEPLOYMENT with IP $IP ;
cat tools/external-exposure/load-balancer.yaml | envsubst | kubectl apply -f -

Inside of these services, we use externalTrafficPolicy: Local. To avoid adding extra latency from additional network hops inside Kubernetes. Refer to the kubernetes docs for more information on this topic.

There are other options, such as the nginx-ingress controller which can be configured to support TCP connections but in this guide we’re shooting for something as simple as possible that you can do with standard Kubernetes resource types.

Potential Trip-up point: On GKE, the only thing needed to associate the static IP to the load balancer is this loadBalancerIP field in the YAML. On other clouds, there may be additional steps to allocate the static IP to the Kubernetes cluster. Consult your local cloud documentation.

Putting it All Together

We can verify our service is running nicely like this:

$ kubectl get service | grep neo4j-external
zeke-neo4j-external   LoadBalancer   10.0.5.183   35.202.123.82     7687:30529/TCP,74.3.140843/TCP,7473:30325/TCP   115s

After all of these steps, you should end up with a cluster properly exposed. We can recover our password like so, and connect to the static IP.

export NEO4J_PASSWORD=$(kubectl get secrets graph-neo4j-secrets -o yaml | grep password | sed 's/.*: //' | base64 -d)
cypher-shell -a neo4j://34.66.183.174:7687 -u neo4j -p "$NEO4J_PASSWORD"

Additionally, since we exposed port 7474, you can go to any of the static IPs on port 7474 and end up with Neo4j browser and be able to connect.

Security: These methods send your Neo4j credentials without encryption. To secure communication with Neo4j you must also set up SSL.

Where to Go Next

If you have static IPs, you can of course associate DNS with them
Once you have DNS set up you can obtain signed SSL certificates. SSL certificates can be used with Neo4j directly (see the SSL Framework section of the Neo4j Operations manual). Or, in many situations, SSL certificates can be registered with the Load Balancer instead.
This in turn will let you use https and neo4j+s protocols to communicate with Neo4j.