Writing results back to BigQuery
This feature is experimental and not ready for production use. It is available only as part of an Early Access Program and may undergo breaking changes before general availability.
Once you have projected your data and run GDS algorithms on it, the next step is to write your findings back to BigQuery.
Create tables for nodes and edges
Streaming the nodes and edges of interest from Neo4j GDS/AuraDS into BigQuery requires dedicated, fixed-schema BigQuery tables that are created beforehand. You can pick any names for your tables, but the column names and types must match the following script. Remember to replace the placeholders to match your environment.
CREATE TABLE `<gcp-project-id>.<bigquery-dataset>.out_nodes` (
node_id INT64 NOT NULL,
labels ARRAY<STRING>,
properties JSON
); (1)
CREATE TABLE `<gcp-project-id>.<bigquery-dataset>.out_edges` (
source_node_id INT64 NOT NULL,
target_node_id INT64 NOT NULL,
type STRING NOT NULL,
properties JSON
); (2)
(1) Table for node records.
(2) Table for relationship records.
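Because the column names and types are fixed, it can help to confirm that both tables match the script before streaming into them. The query below is a minimal sketch that reads BigQuery's `INFORMATION_SCHEMA.COLUMNS` view. It assumes you kept the table names `out_nodes` and `out_edges`; adjust the filter if you chose other names.

```sql
-- List the columns of the two target tables so they can be compared
-- against the CREATE TABLE script above.
-- Assumption: the tables are named out_nodes and out_edges.
SELECT
  table_name,
  column_name,
  data_type,
  is_nullable
FROM `<gcp-project-id>.<bigquery-dataset>.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name IN ('out_nodes', 'out_edges')
ORDER BY table_name, ordinal_position;
```

Each row is one column, so any mismatch in name, type, or nullability is visible at a glance.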
Run the stored procedure
Once the required target tables exist, you can execute the stored procedure. In the BigQuery query editor, run the following script, replacing the placeholders according to your environment.
DECLARE graph_name STRING DEFAULT "example"; (1)
DECLARE neo4j_secret STRING DEFAULT "<secret-resource-name>"; (2)
DECLARE bq_project STRING DEFAULT "<gcp-project-id>"; (3)
DECLARE bq_dataset STRING DEFAULT "<bigquery-dataset>"; (4)
DECLARE bq_node_table STRING DEFAULT "out_nodes"; (5)
DECLARE bq_edge_table STRING DEFAULT "out_edges"; (6)
DECLARE neo4j_patterns ARRAY<STRING> DEFAULT ["(:User{id})", "(:Question{id})", "[:ASKED]"]; (7)
CALL `<gcp-project-id>.<bigquery-dataset>.neo4j_gds_stream_graph`(
graph_name,
neo4j_secret,
bq_project,
bq_dataset,
bq_node_table,
bq_edge_table,
neo4j_patterns);
(1) Name of the graph projection to query from.
(2) Resource name of the secret created in Create secrets for Neo4j connection.
(3) Google Cloud Project ID of the BigQuery dataset.
(4) Name of the BigQuery dataset.
(5) Name of the BigQuery table to insert nodes into.
(6) Name of the BigQuery table to insert edges into.
(7) List of Cypher patterns to query from the graph projection.
Once the stored procedure finishes, verify that your nodes and edges were written as records to the BigQuery tables you specified.
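One way to check the result is to count the written records and look at a few of them. The queries below are a sketch rather than an official validation procedure. They assume the table names `out_nodes` and `out_edges`, and the last query assumes that the `properties` JSON column holds a key named `id`, as suggested by the `(:User{id})` pattern; adapt the key to the properties you streamed.

```sql
-- Count node records per label.
-- Nodes with a NULL or empty labels array do not appear in this result.
SELECT label, COUNT(*) AS node_count
FROM `<gcp-project-id>.<bigquery-dataset>.out_nodes`,
  UNNEST(labels) AS label
GROUP BY label
ORDER BY label;

-- Count relationship records per type.
SELECT type, COUNT(*) AS edge_count
FROM `<gcp-project-id>.<bigquery-dataset>.out_edges`
GROUP BY type
ORDER BY type;

-- Inspect a few node records.
-- Assumption: properties contains an "id" key; JSON_VALUE returns it as a STRING.
SELECT
  node_id,
  labels,
  JSON_VALUE(properties, '$.id') AS id_property
FROM `<gcp-project-id>.<bigquery-dataset>.out_nodes`
LIMIT 10;
```

If the counts are zero, check that the patterns passed in `neo4j_patterns` match the labels, relationship types, and properties present in the graph projection.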
