Visualization

For visualizing a graph projection, we recommend using the Graph Visualization for Python (neo4j-viz) library. The library is available in the official Snowflake Anaconda package repository and can be used in stored procedures, functions and notebooks.

The visualization is interactive, and supports features such as zooming, panning, moving nodes, and hovering over nodes and relationships to see their properties.

Figure 1. A visualization of the Cora dataset

Syntax

This section covers the syntax used to generate an interactive graph visualization from Snowflake tables.

The neo4j_viz library exposes a method called from_snowflake to create a VisualizationGraph object from Snowflake tables. This object can then be rendered using its render method, which produces HTML/JavaScript.

The from_snowflake method takes two mandatory positional parameters:

  • A snowflake.snowpark.Session object for the connection to Snowflake, and

  • A project configuration as defined by the Neo4j Snowflake Graph Analytics application.

Creating a visualization graph from Snowflake tables
from neo4j_viz.snowflake import from_snowflake

from snowflake.snowpark.context import get_active_session
session = get_active_session()

viz_graph = from_snowflake(
    session,
    {
        'nodeTables': [...],
        'relationshipTables': {...}
    }
)
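
As a minimal sketch of how such a project configuration can be assembled, the helper below builds the dictionary from a list of node tables and a mapping of relationship tables to their endpoints. The name build_project_config is hypothetical and not part of neo4j_viz. The check that each endpoint is also listed as a node table is an assumption made for this sketch, not a documented rule of the application. The table names are the ones used in the example later on this page.

```python
def build_project_config(node_tables, relationships):
    """Build a project configuration dict in the shape shown above.

    `relationships` maps a relationship table name to a
    (source_table, target_table) pair.
    Hypothetical helper, not part of neo4j_viz.
    """
    listed = set(node_tables)
    relationship_tables = {}
    for rel_table, (source, target) in relationships.items():
        # Sanity check (an assumption of this sketch): endpoints should be node tables.
        for endpoint in (source, target):
            if endpoint not in listed:
                raise ValueError(f"{endpoint} is not listed in node_tables")
        relationship_tables[rel_table] = {
            "sourceTable": source,
            "targetTable": target,
        }
    return {
        "nodeTables": list(node_tables),
        "relationshipTables": relationship_tables,
    }


project_config = build_project_config(
    ["EXAMPLE_DB.DATA_SCHEMA.PERSONS", "EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS"],
    {
        "EXAMPLE_DB.DATA_SCHEMA.LIKES": (
            "EXAMPLE_DB.DATA_SCHEMA.PERSONS",
            "EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS",
        )
    },
)
```

The resulting project_config can be passed as the second argument to from_snowflake in place of a literal dictionary.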

The VisualizationGraph object returned by from_snowflake has several methods for configuring the visualization before rendering. For example, VisualizationGraph.color_nodes changes the coloring of nodes based on a specific property. Other configuration methods change, for example, the size of nodes or pin nodes in place. Please refer to the documentation for all available options.

Once the VisualizationGraph object is configured as desired, it can be rendered using the render method. The method takes a visualization configuration as its only parameter. It can be used to configure the dimensions as well as the layout of the rendered graph. Please refer to the documentation for all available options.

Rendering a visualization graph
html_object = viz_graph.render(
    {...}  # Visualization configuration
)

The html_object variable now holds the rendered visualization, and its data attribute contains the HTML/JavaScript as a string. In a Streamlit app inside a Snowflake notebook, this string can be rendered using the components.html method from the streamlit library.

Rendering the HTML/JavaScript string using streamlit
import streamlit.components.v1 as components
components.html(
    html_object.data,  # The HTML/JavaScript string
    height=600  # Height in pixels
)

Example

In this example, we will use a Snowflake notebook to visualize a small graph of people and the musical instruments they like.

Setting up the graph

We will start by creating the three tables we need, using a SQL notebook cell: one node table for the persons, one node table for the musical instruments, and one relationship table to represent the "LIKES" relationship from people to instruments.

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.PERSONS (NODEID VARCHAR);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.PERSONS VALUES
  ('Alice'),
  ('Bob'),
  ('Carol'),
  ('Dave'),
  ('Eve');

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS (NODEID VARCHAR);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS VALUES
  ('Guitar'),
  ('Synthesizer'),
  ('Bongos'),
  ('Trumpet');

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.LIKES (SOURCENODEID VARCHAR, TARGETNODEID VARCHAR);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.LIKES VALUES
  ('Alice', 'Guitar'),
  ('Alice', 'Synthesizer'),
  ('Alice', 'Bongos'),
  ('Bob',   'Guitar'),
  ('Bob',   'Synthesizer'),
  ('Carol', 'Bongos'),
  ('Dave',  'Guitar'),
  ('Dave',  'Trumpet'),
  ('Dave',  'Bongos');
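
Before projecting, it can be useful to confirm that every relationship row points at an existing node. The following is a minimal sketch in plain Python that mirrors the rows inserted above. The function name dangling_endpoints is hypothetical, and in practice the same check would more likely be done with a SQL join.

```python
# In-memory copies of the rows inserted into the three tables above.
persons = ["Alice", "Bob", "Carol", "Dave", "Eve"]
instruments = ["Guitar", "Synthesizer", "Bongos", "Trumpet"]
likes = [
    ("Alice", "Guitar"), ("Alice", "Synthesizer"), ("Alice", "Bongos"),
    ("Bob", "Guitar"), ("Bob", "Synthesizer"),
    ("Carol", "Bongos"),
    ("Dave", "Guitar"), ("Dave", "Trumpet"), ("Dave", "Bongos"),
]


def dangling_endpoints(source_ids, target_ids, relationships):
    """Return relationship rows whose source or target id is missing."""
    sources, targets = set(source_ids), set(target_ids)
    return [
        (source, target)
        for source, target in relationships
        if source not in sources or target not in targets
    ]


# Every LIKES row refers to a known person and a known instrument.
missing = dangling_endpoints(persons, instruments, likes)

# Persons that never appear as a source have no relationships at all.
people_without_likes = [p for p in persons if p not in {s for s, _ in likes}]
```

For this data, missing is empty and people_without_likes contains only Eve, who will therefore appear as an unconnected node.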

Visualizing the graph projection

Now that we have our tables, we can proceed to visualize them. We use a Python notebook cell to call the from_snowflake method. Note that the neo4j_viz library must be selected in the notebook environment for this to work.

from neo4j_viz.snowflake import from_snowflake
from snowflake.snowpark.context import get_active_session

session = get_active_session()

viz_graph = from_snowflake(
    session,
    {
        'nodeTables': ['EXAMPLE_DB.DATA_SCHEMA.PERSONS', 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS'],
        'relationshipTables': {
            'EXAMPLE_DB.DATA_SCHEMA.LIKES': {
                'sourceTable': 'EXAMPLE_DB.DATA_SCHEMA.PERSONS',
                'targetTable': 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS'
            }
        }
    }
)

html_object = viz_graph.render()

import streamlit.components.v1 as components

components.html(html_object.data, height=600)

Output:

Rendering the HTML

The graph renders nicely, and we see that our two node types, PERSONS and INSTRUMENTS, are colored and captioned differently (according to the default settings). The relationships are rendered as arrows with the "LIKES" caption.

We can zoom in and out, pan around, move nodes, and hover over nodes and relationships to see their properties. The buttons on the top right also allow us to zoom, in addition to taking PNG snapshots of the graph.

Next, we want to compute some graph algorithms on the graph, and visualize the results. For that we need to make sure that the application has access to the tables we created earlier.

Giving the application access to the tables

-- Use a role with granting privileges
USE ROLE ACCOUNTADMIN;
USE DATABASE EXAMPLE_DB;

CREATE OR REPLACE DATABASE ROLE MY_DB_ROLE;
GRANT USAGE ON DATABASE EXAMPLE_DB TO DATABASE ROLE MY_DB_ROLE;
GRANT USAGE ON SCHEMA EXAMPLE_DB.DATA_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA EXAMPLE_DB.DATA_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT SELECT ON ALL VIEWS IN SCHEMA EXAMPLE_DB.DATA_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT CREATE TABLE ON SCHEMA EXAMPLE_DB.DATA_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
-- Change the app name below if you don't have the default one
GRANT DATABASE ROLE MY_DB_ROLE TO Neo4j_Graph_Analytics;

Now, we can run graph algorithms on our graph using the Neo4j Graph Analytics application.

Customizing node colors by component

First, we run the Weakly Connected Components (WCC) algorithm to identify connected components in the graph. We want to use the result to color the nodes according to their component in the visualization. If you are using a notebook, execute the following in a SQL notebook cell.

-- Change the app name here if you don't have the default one
CALL Neo4j_Graph_Analytics.graph.wcc('CPU_X64_XS', {
    'project': {
        'nodeTables': ['EXAMPLE_DB.DATA_SCHEMA.PERSONS', 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS'],
        'relationshipTables': {
            'EXAMPLE_DB.DATA_SCHEMA.LIKES': {
                'sourceTable': 'EXAMPLE_DB.DATA_SCHEMA.PERSONS',
                'targetTable': 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS',
                'orientation': 'NATURAL'
            }
        }
    },
    'compute': {},
    'write': [{
        'nodeLabel': 'PERSONS',
        'outputTable': 'EXAMPLE_DB.DATA_SCHEMA.PERSONS_COMPONENTS'
    },
    {
        'nodeLabel': 'INSTRUMENTS',
        'outputTable': 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS_COMPONENTS'
    }]
});

Now that we have the output tables from the WCC algorithm, we can visualize the graph again, this time using the new tables as node tables. We will also configure the visualization to color the nodes based on the "COMPONENT" property.

from neo4j_viz.snowflake import from_snowflake
from snowflake.snowpark.context import get_active_session

session = get_active_session()

project_config = {
    'nodeTables': ['EXAMPLE_DB.DATA_SCHEMA.PERSONS_COMPONENTS', 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS_COMPONENTS'],
    'relationshipTables': {
        'EXAMPLE_DB.DATA_SCHEMA.LIKES': {
            'sourceTable': 'EXAMPLE_DB.DATA_SCHEMA.PERSONS_COMPONENTS',
            'targetTable': 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS_COMPONENTS',
            'orientation': 'NATURAL'
        }
    }
}

viz_graph = from_snowflake(session, project_config)

# Color nodes according to their component
viz_graph.color_nodes(property='COMPONENT', override=True)

rendered = viz_graph.render()

import streamlit.components.v1 as components

components.html(rendered.data, height=600)

Output:

Rendering the HTML with component coloring

We can see that the nodes are now colored according to their connected component in the graph.
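
To see which components to expect for this small graph, the grouping can be reproduced locally. The following is a minimal union-find sketch in plain Python. It is not the application's implementation, and the component identifiers written by the application will differ from the sets shown here.

```python
from collections import defaultdict

nodes = ["Alice", "Bob", "Carol", "Dave", "Eve",
         "Guitar", "Synthesizer", "Bongos", "Trumpet"]
likes = [
    ("Alice", "Guitar"), ("Alice", "Synthesizer"), ("Alice", "Bongos"),
    ("Bob", "Guitar"), ("Bob", "Synthesizer"),
    ("Carol", "Bongos"),
    ("Dave", "Guitar"), ("Dave", "Trumpet"), ("Dave", "Bongos"),
]


def weakly_connected_components(nodes, relationships):
    """Group nodes into components, ignoring relationship direction."""
    parent = {node: node for node in nodes}

    def find(node):
        # Walk up to the root, shortening the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for source, target in relationships:
        root_source, root_target = find(source), find(target)
        if root_source != root_target:
            parent[root_source] = root_target  # merge the two components

    groups = defaultdict(set)
    for node in nodes:
        groups[find(node)].add(node)
    # Largest component first.
    return sorted(groups.values(), key=len, reverse=True)


components = weakly_connected_components(nodes, likes)
```

Here components holds two sets: one with the eight connected people and instruments, and one containing only Eve, who likes no instrument. This matches a visualization with two colors.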

Customizing node sizes by PageRank

Next, we want to compute the PageRank of the nodes in the graph, and use that to size the nodes in the visualization. We start by running the PageRank algorithm using a SQL notebook cell.

-- Change the app name here if you don't have the default one
CALL Neo4j_Graph_Analytics.graph.pagerank('CPU_X64_XS', {
    'project': {
        'nodeTables': ['EXAMPLE_DB.DATA_SCHEMA.PERSONS', 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS'],
        'relationshipTables': {
            'EXAMPLE_DB.DATA_SCHEMA.LIKES': {
                'sourceTable': 'EXAMPLE_DB.DATA_SCHEMA.PERSONS',
                'targetTable': 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS',
                'orientation': 'NATURAL'
            }
        }
    },
    'compute': {},
    'write': [{
        'nodeLabel': 'PERSONS',
        'outputTable': 'EXAMPLE_DB.DATA_SCHEMA.PERSONS_CENTRALITY'
    },
    {
        'nodeLabel': 'INSTRUMENTS',
        'outputTable': 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS_CENTRALITY'
    }]
});

Now that we have the PageRank results, we can visualize the graph again, this time using the new tables as node tables. We will also configure the visualization to size the nodes based on the "PAGERANK" property.

from neo4j_viz.snowflake import from_snowflake
from snowflake.snowpark.context import get_active_session
session = get_active_session()

project_config = {
    'nodeTables': ['EXAMPLE_DB.DATA_SCHEMA.PERSONS_CENTRALITY', 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS_CENTRALITY'],
    'relationshipTables': {
        'EXAMPLE_DB.DATA_SCHEMA.LIKES': {
            'sourceTable': 'EXAMPLE_DB.DATA_SCHEMA.PERSONS_CENTRALITY',
            'targetTable': 'EXAMPLE_DB.DATA_SCHEMA.INSTRUMENTS_CENTRALITY',
            'orientation': 'NATURAL'
        }
    }
}

viz_graph = from_snowflake(session, project_config)

# Map each node id to its PAGERANK value as input for node resizing
sizes = {node.id: node.properties['PAGERANK'] for node in viz_graph.nodes}

# Resize nodes according to their page rank
viz_graph.resize_nodes(sizes=sizes, node_radius_min_max=(10, 100))

# Caption nodes with their original id property
for node in viz_graph.nodes:
    node.caption = node.properties["SNOWFLAKEID"]

# Render the graph
rendered = viz_graph.render()

import streamlit.components.v1 as components

components.html(rendered.data, height=600)

Output:

Rendering the HTML with PageRank sizing

We can see that the nodes are now sized according to their PageRank in the graph. Bongos are more central than guitars. Who would have guessed that? :)
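
The ranking can be sanity-checked locally with a textbook power iteration. The sketch below is plain Python and is not the application's implementation: the damping factor of 0.85, the fixed iteration count, and the uniform redistribution of rank from nodes without outgoing relationships are assumptions of this sketch, so absolute scores are not expected to match the PAGERANK column. Only the relative ordering is of interest.

```python
nodes = ["Alice", "Bob", "Carol", "Dave", "Eve",
         "Guitar", "Synthesizer", "Bongos", "Trumpet"]
likes = [
    ("Alice", "Guitar"), ("Alice", "Synthesizer"), ("Alice", "Bongos"),
    ("Bob", "Guitar"), ("Bob", "Synthesizer"),
    ("Carol", "Bongos"),
    ("Dave", "Guitar"), ("Dave", "Trumpet"), ("Dave", "Bongos"),
]


def pagerank(nodes, relationships, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration; scores sum to 1."""
    count = len(nodes)
    out_links = {node: [] for node in nodes}
    for source, target in relationships:
        out_links[source].append(target)

    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for node, targets in out_links.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


scores = pagerank(nodes, likes)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this sketch, Bongos ranks above Guitar even though both are liked by three people. Carol likes only Bongos and so passes on her full share, whereas Bob splits his share between Guitar and Synthesizer.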

Performance considerations

The performance of the from_snowflake method depends on the size and complexity of the graph being visualized, as well as the machine it runs on.

The configuration of the render method takes a parameter max_allowed_nodes, which defaults to 10,000, because rendering large graphs can be slow and unresponsive. Choosing a different renderer can also help; see the documentation for more details.
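
One way to stay under such a limit is to visualize only part of the graph. The following is a minimal sketch in plain Python that keeps relationships in their given order for as long as the number of distinct nodes stays within a budget. The function limit_relationships is hypothetical and not part of neo4j_viz. In practice, the equivalent filtering would more likely be done in SQL, writing the reduced rows to smaller tables before calling from_snowflake.

```python
def limit_relationships(relationships, max_nodes):
    """Keep relationships while the set of touched nodes fits in max_nodes.

    Returns the kept relationships and the set of node ids they touch.
    Hypothetical helper for illustration only.
    """
    kept, seen = [], set()
    for source, target in relationships:
        new_nodes = {source, target} - seen
        if len(seen) + len(new_nodes) > max_nodes:
            continue  # this relationship would exceed the node budget
        seen |= new_nodes
        kept.append((source, target))
    return kept, seen


likes = [
    ("Alice", "Guitar"), ("Alice", "Synthesizer"), ("Alice", "Bongos"),
    ("Bob", "Guitar"), ("Bob", "Synthesizer"),
    ("Carol", "Bongos"),
    ("Dave", "Guitar"), ("Dave", "Trumpet"), ("Dave", "Bongos"),
]

# With a budget of four nodes, only Alice's three relationships fit.
kept, kept_nodes = limit_relationships(likes, max_nodes=4)
```

This greedy selection depends on row order and is not a representative sample. For real data, a filter that matches the question being asked (for example, a single component or the highest-ranked nodes) is usually a better choice.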