Create procedure to stream graph into BigQuery dataset

This feature is experimental and not ready for production use. It is only available as part of an Early Access Program and may undergo breaking changes until general availability.

The next stored procedure streams graph projection entities into a given BigQuery dataset. Execute the following SQL script to create it.

CREATE OR REPLACE PROCEDURE
  `<gcp-project-id>.<bigquery-dataset>.neo4j_gds_stream_graph`(
    graph_name STRING, (1)
    neo4j_secret STRING, (2)
    bq_project STRING, (3)
    bq_dataset STRING, (4)
    bq_node_table STRING, (5)
    bq_edge_table STRING, (6)
    neo4j_patterns ARRAY<STRING>) (7)
WITH CONNECTION `<gcp-project-id>.<external-connection-id>` OPTIONS (
    engine='SPARK',
    runtime_version='2.1',
    container_image='<region>-docker.pkg.dev/<gcp-project-id>/<repository-name>/neo4j-bigquery-connector:<version>',
    properties=[ (8)
      ("spark.driver.cores", "8"),
      ("spark.driver.maxResultSize", "4g"),
      ("spark.driver.memory", "16g")],
    description="Stream graph entities from Neo4j GDS/AuraDS to BigQuery")
  LANGUAGE python AS R"""
from pyspark.sql import SparkSession
from templates import Neo4jGDSToBigQueryTemplate

spark = (
	SparkSession
	.builder
	.appName("Neo4j -> BigQuery Connector")
	.getOrCreate()
)

template = Neo4jGDSToBigQueryTemplate()
args = template.parse_args()
template.run(spark, args)
""";
1 Name of the graph projection to be streamed. The procedure will fail if the given graph projection does not exist.
2 Identifier of the secret that contains the Neo4j connection parameters.
3 Google Cloud Project ID that contains the BigQuery dataset.
4 Name of the BigQuery dataset.
5 Name of the BigQuery table to store streamed nodes into, for example out_nodes.
6 Name of the BigQuery table to store streamed edges into, for example out_edges.
7 List of label and relationship patterns to stream from Neo4j GDS/AuraDS into BigQuery. Patterns follow Cypher node and relationship pattern syntax. To stream nodes with the User label along with their id and votes properties, the pattern would be (:User {id, votes}). To stream relationships of type KNOWS along with their since property, the pattern would be [:KNOWS {since}]. Node and relationship patterns can be used at the same time (see the example call after this list). Patterns without any properties specified return all properties attached to the graph entities.
8 List of properties to be passed to the Dataproc Serverless environment. Refer to Spark properties for a list of valid properties.
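
Once created, the procedure can be invoked like any other BigQuery stored procedure. The following call is a minimal sketch: the graph name, secret identifier, and table names are placeholder example values, and the patterns stream User nodes with their id and votes properties together with KNOWS relationships and their since property.

CALL `<gcp-project-id>.<bigquery-dataset>.neo4j_gds_stream_graph`(
  'my-graph',              -- example name of an existing graph projection
  '<neo4j-secret-id>',     -- identifier of the secret holding the Neo4j connection parameters
  '<gcp-project-id>',      -- project that contains the target BigQuery dataset
  '<bigquery-dataset>',    -- target BigQuery dataset
  'out_nodes',             -- table to receive streamed nodes
  'out_edges',             -- table to receive streamed edges
  ['(:User {id, votes})', '[:KNOWS {since}]']  -- node and relationship patterns to stream
);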