Docker image and Artifact Registry
This feature is experimental and not ready for production use. It is available only through an Early Access Program and may undergo breaking changes before general availability.
Graph Data Science for BigQuery is based on Stored Procedures for Apache Spark, which require a container image published in Artifact Registry in the same Google Cloud project where your stored procedure will run. You therefore need to pull or build a Docker image and push it to your Artifact Registry repository.
Configure Artifact Registry
Create a new standard repository in Artifact Registry to push the Docker images to.
The new repository must use the Docker format.
Select the rest of the properties based on your environment; ideally, the repository's region should be close to where your BigQuery resources will reside.
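If you prefer the command line to the Cloud Console, the repository can also be created with gcloud. The following is a minimal sketch, not an official procedure: the region, project ID and repository name are hypothetical example values, and the `run` helper is an illustrative wrapper that only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
set -eu

# Hypothetical example values; replace them with your own.
REGION="europe-west1"
PROJECT_ID="my-gcp-project"
REPOSITORY="neo4j-bigquery"

# Illustrative helper: prints the command by default,
# executes it only when DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Create a standard repository in Docker format in the chosen region.
run gcloud artifacts repositories create "$REPOSITORY" \
  --repository-format=docker \
  --location="$REGION" \
  --project="$PROJECT_ID"
```

Run the script once as-is to review the printed command, then again with `DRY_RUN=0` to actually create the repository.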
Once the new repository is created, follow the Setup Instructions on the repository details page to prepare your local Docker tooling for pushing images.
gcloud auth login
gcloud auth configure-docker <region>-docker.pkg.dev
The new repository's path will be similar to <region>-docker.pkg.dev/<gcp-project-id>/<repository-name>; the rest of the documentation uses these placeholders.
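To avoid retyping the placeholders, you could keep them in shell variables and derive the repository path and the full image reference from them. This is only a sketch: all values are hypothetical examples, and `x.y.z` stands in for whichever connector version you intend to use.

```shell
#!/bin/sh
set -eu

# Hypothetical example values; replace them with your own.
REGION="europe-west1"
PROJECT_ID="my-gcp-project"
REPOSITORY="neo4j-bigquery"
VERSION="x.y.z"  # placeholder for the connector version

# Repository path, following <region>-docker.pkg.dev/<gcp-project-id>/<repository-name>.
REPO_PATH="${REGION}-docker.pkg.dev/${PROJECT_ID}/${REPOSITORY}"

# Full image reference used when tagging and pushing the connector image.
IMAGE="${REPO_PATH}/neo4j-bigquery-connector:${VERSION}"

echo "Repository path: ${REPO_PATH}"
echo "Image reference: ${IMAGE}"
```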
Getting the Docker image
The source code for this connector resides in the GitHub repository at https://github.com/neo4j-field/bigquery-connector. To get the Docker image, you can either pull one of the pre-built images or build the image from source in your own environment.
Stored Procedures for Apache Spark spawn a serverless Spark environment in
Pulling pre-built images
For easier onboarding, we publish pre-built Docker images that can be pulled from either Docker Hub or GitHub Container Registry and then tagged for your Artifact Registry repository.
docker image pull --platform linux/amd64 neo4j/bigquery-connector:<version>
docker image tag neo4j/bigquery-connector:<version> <region>-docker.pkg.dev/<gcp-project-id>/<repository-name>/neo4j-bigquery-connector:<version>
docker image pull --platform linux/amd64 ghcr.io/neo4j-field/bigquery-connector:<version>
docker image tag ghcr.io/neo4j-field/bigquery-connector:<version> <region>-docker.pkg.dev/<gcp-project-id>/<repository-name>/neo4j-bigquery-connector:<version>
Building it yourself
If you prefer to build the image from source in your own environment, run the following commands in order.
git clone -b <version> https://github.com/neo4j-field/bigquery-connector
cd bigquery-connector
docker build --tag <region>-docker.pkg.dev/<gcp-project-id>/<repository-name>/neo4j-bigquery-connector:<version> --platform linux/amd64 .
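Whether the image was pulled or built, it still has to be pushed to your Artifact Registry repository before a stored procedure can use it. A minimal sketch of that last step follows, assuming the image has already been tagged with the repository path as shown above. The values are hypothetical examples, and the `run` helper is an illustrative wrapper that only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
set -eu

# Hypothetical example values; replace them with your own.
REPO_PATH="europe-west1-docker.pkg.dev/my-gcp-project/neo4j-bigquery"
IMAGE="${REPO_PATH}/neo4j-bigquery-connector:x.y.z"

# Illustrative helper: prints the command by default,
# executes it only when DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Push the tagged image to Artifact Registry.
run docker image push "$IMAGE"

# List the images in the repository, with tags, to confirm the push.
run gcloud artifacts docker images list "$REPO_PATH" --include-tags
```

The listing step is optional; it is included only as a quick check that the expected tag is present in the repository.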