Administration

Compute pools

The application contains a number of compute pools, which can be used to run the algorithms. These compute pools are created automatically by the application when the application is activated. A compute pool is identified by its compute pool selector, which maps to the instance family for that pool. For more information on instance families, see the Snowflake documentation on CREATE COMPUTE POOL.

In this section we assume that the application name is the default Neo4j_Graph_Analytics. If you chose a different app name during installation, please replace it with that.

You can see the available compute pool selectors by running the following command:

CALL Neo4j_Graph_Analytics.graph.show_available_compute_pools();

The following compute pool selectors are available whenever the corresponding instance family is supported in the consumer region:

'CPU_X64_XS'
'CPU_X64_M'
'CPU_X64_L'
'HIGHMEM_X64_S'
'HIGHMEM_X64_M'
'HIGHMEM_X64_L'
'GPU_NV_S'
'GPU_NV_XS'

Selecting a compute pool

When you invoke an algorithm, you pass a compute pool selector as the first argument to specify the pool in which the algorithm is executed. For example, the call

CALL Neo4j_Graph_Analytics.graph.wcc('CPU_X64_XS', <configuration>);

will run the WCC algorithm using the CPU_X64_XS pool.

Multiple algorithms can be run in parallel on the same compute pool or on different compute pools.

Internally, the application starts a job service to run the algorithm. The job service is executed on a compute node within the specified compute pool. Once the algorithm is finished, the job service is stopped and, if no other job is running in the pool, the compute pool is suspended. If multiple algorithms are supposed to run at the same time on the same compute pool, it is recommended to adjust the minimum and maximum number of nodes in the pool.

Managing compute pools

By default, compute pools are created with the following settings:

min_nodes = 1
max_nodes = 1
auto_resume = true
auto_suspend_secs = 180
initially_suspended = true

Hence, the compute pools are created in a suspended state, resume automatically when a job is submitted, and auto-suspend after 180 seconds without a running algorithm to reduce cost. This aligns with typical run-on-demand usage patterns. If several jobs are run in sequence on the same compute pool, the next job may start before the auto-suspend timeout elapses; in that case the pool stays active, the job reuses the already running node, and it starts faster.

The MONITOR and OPERATE privileges on the compute pools are granted to the app_admin role. These privileges are required to inspect the current state of the internal pools.
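As a minimal sketch of such an inspection, the standard Snowflake commands SHOW COMPUTE POOLS and DESCRIBE COMPUTE POOL can be used. The placeholder <app_admin_role> stands for a consumer role that has been granted the app_admin application role, and <compute_pool_name> is a placeholder for a pool name taken from the SHOW output; which pools are listed depends on the privileges of the active role.

```sql
-- Use a role that is granted the Neo4j_Graph_Analytics.app_admin role (placeholder)
USE ROLE <app_admin_role>;

-- List the compute pools visible to the current role, including their state,
-- min_nodes, max_nodes and auto_suspend_secs settings
SHOW COMPUTE POOLS;

-- Show the details of a single pool; take the name from the SHOW output above
DESCRIBE COMPUTE POOL <compute_pool_name>;
```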

Altering a compute pool directly is not allowed. Instead, the application exposes a set of procedures to the administrator role to manage the compute pool settings:

-- Use a role that is granted the Neo4j_Graph_Analytics.app_admin role
USE ROLE <app_admin_role>;
-- Get the current min_nodes setting of the given pool selector
CALL Neo4j_Graph_Analytics.admin.get_min_nodes(<pool_selector>);
-- Set the min_nodes setting of the given pool selector
CALL Neo4j_Graph_Analytics.admin.set_min_nodes(<pool_selector>, <min_nodes>);
-- Get the current max_nodes setting of the given pool selector
CALL Neo4j_Graph_Analytics.admin.get_max_nodes(<pool_selector>);
-- Set the max_nodes setting of the given pool selector
CALL Neo4j_Graph_Analytics.admin.set_max_nodes(<pool_selector>, <max_nodes>);
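The following sketch shows how these procedures might be used to let a pool scale out when several algorithms are run at the same time. It assumes that the pool selector is passed as a quoted string, as in the algorithm calls, and that the node count is a plain integer; the values 3 and 1 are only illustrative.

```sql
-- Use a role that is granted the Neo4j_Graph_Analytics.app_admin role (placeholder)
USE ROLE <app_admin_role>;

-- Allow the CPU_X64_M pool to scale out to three nodes (illustrative value)
CALL Neo4j_Graph_Analytics.admin.set_max_nodes('CPU_X64_M', 3);

-- Confirm the new setting
CALL Neo4j_Graph_Analytics.admin.get_max_nodes('CPU_X64_M');

-- Once the parallel workload is done, return to the default of a single node
CALL Neo4j_Graph_Analytics.admin.set_max_nodes('CPU_X64_M', 1);
```

Raising max_nodes while leaving min_nodes at 1 lets additional nodes be started only when needed, so an idle pool costs no more than with the default settings.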

Application warehouse

The application creates a warehouse for reading and writing data from and to consumer databases. Specifically, that warehouse is used to read Snowflake tables when projecting graphs, and it is used for writing algorithm results. It is also used for administrative queries and logging.

Similar to the compute pools, the warehouse is configured with a short auto-suspend timeout to reduce costs. All privileges on the warehouse are granted to the app_admin role, so that administrators have full control of the warehouse. Administrators are therefore expected to adjust the warehouse, for example its size, whenever the workload requires it.

The name of the warehouse is the name of the application followed by _app_warehouse, i.e. by default it is Neo4j_Graph_Analytics_app_warehouse. To inspect the warehouse, its query history and performance, you can find it in Snowsight under Admin → Warehouses. To configure it, please see https://docs.snowflake.com/en/sql-reference/sql/alter-warehouse.
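As an illustration, the warehouse could be inspected and resized as follows. This is a sketch using the standard SHOW WAREHOUSES and ALTER WAREHOUSE commands with the default warehouse name; the size and timeout values are assumptions chosen for the example, not recommendations.

```sql
-- Use a role that is granted the Neo4j_Graph_Analytics.app_admin role (placeholder)
USE ROLE <app_admin_role>;

-- Show the current size, state and auto-suspend setting of the application warehouse
SHOW WAREHOUSES LIKE 'Neo4j_Graph_Analytics_app_warehouse';

-- Example values only: use a larger warehouse and suspend it after 120 idle seconds
ALTER WAREHOUSE Neo4j_Graph_Analytics_app_warehouse SET
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND = 120;
```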

Table access privileges

The application needs to be given access to read from tables on which algorithms are to be run. If the tables or the schemas the tables belong to are known up-front, the required privileges can be granted once, during setup. Otherwise, granting read access has to be repeated at a later point for additional tables or schemas that you may wish to run algorithms on.

In the example below, we grant the application read access to all tables in a single schema and write access to a possibly different schema. If you need multiple schemas, repeat the grants for each of them. The grants make it possible to run jobs that read and write using these schemas. Insert names of roles and database objects as necessary. If you are not using the default application name Neo4j_Graph_Analytics, replace it with the name you used during installation.

-- Use a role with the required privileges, like 'ACCOUNTADMIN'
USE ROLE <privileged_role>;

-- Grant access to consumer data
-- The application reads consumer data to build a graph object, and it also writes results into new tables.
-- We therefore need to grant the right permissions to give the application access.
GRANT USAGE ON DATABASE <database_name> TO APPLICATION Neo4j_Graph_Analytics;
GRANT USAGE ON SCHEMA <database_name>.<schema_name> TO APPLICATION Neo4j_Graph_Analytics;

-- Required to read tabular data into a graph
GRANT SELECT ON ALL TABLES IN SCHEMA <database_name>.<schema_name> TO APPLICATION Neo4j_Graph_Analytics;
-- Required to write computation results into a table
GRANT CREATE TABLE ON SCHEMA <database_name>.<write_schema_name> TO APPLICATION Neo4j_Graph_Analytics;
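To make the repetition for multiple schemas concrete, here is a sketch with hypothetical object names: a database my_db, two schemas sales and marketing to read from, and a separate schema results to write to. None of these names come from the application; replace them with your own. Note that the write schema also receives a USAGE grant, since Snowflake requires USAGE on a schema before objects can be created in it.

```sql
-- Use a role with the required privileges, like 'ACCOUNTADMIN'
USE ROLE <privileged_role>;

-- Hypothetical names: my_db, sales, marketing, results
GRANT USAGE ON DATABASE my_db TO APPLICATION Neo4j_Graph_Analytics;

-- Read access, repeated for each schema that holds input tables
GRANT USAGE ON SCHEMA my_db.sales TO APPLICATION Neo4j_Graph_Analytics;
GRANT SELECT ON ALL TABLES IN SCHEMA my_db.sales TO APPLICATION Neo4j_Graph_Analytics;
GRANT USAGE ON SCHEMA my_db.marketing TO APPLICATION Neo4j_Graph_Analytics;
GRANT SELECT ON ALL TABLES IN SCHEMA my_db.marketing TO APPLICATION Neo4j_Graph_Analytics;

-- Write access to the schema that receives algorithm results
GRANT USAGE ON SCHEMA my_db.results TO APPLICATION Neo4j_Graph_Analytics;
GRANT CREATE TABLE ON SCHEMA my_db.results TO APPLICATION Neo4j_Graph_Analytics;
```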

Privileges for future tables and views

The privileges granted to the application in the previous section allow it to read from existing tables in the given schema. It might also be necessary to allow the application to read from future tables in the schema. Unfortunately, SELECT ON FUTURE TABLES cannot be granted directly to the application. However, Snowflake provides database roles, which we can use to solve this problem.

First of all, we need to create a database role that has the required privileges.

CREATE DATABASE ROLE <database_role>;

-- Grants needed for reading existing consumer data stored in tables and views.
GRANT SELECT ON ALL TABLES IN SCHEMA <database_name>.<schema_name> TO DATABASE ROLE <database_role>;
GRANT SELECT ON ALL VIEWS IN SCHEMA <database_name>.<schema_name> TO DATABASE ROLE <database_role>;
-- Grants needed for reading future consumer data stored in tables and views.
GRANT SELECT ON FUTURE TABLES IN SCHEMA <database_name>.<schema_name> TO DATABASE ROLE <database_role>;
GRANT SELECT ON FUTURE VIEWS IN SCHEMA <database_name>.<schema_name> TO DATABASE ROLE <database_role>;
-- Grants needed for writing computation results into tables.
GRANT CREATE TABLE ON SCHEMA <database_name>.<schema_name> TO DATABASE ROLE <database_role>;

After assigning the permissions to the database role, we need to grant the database role to the application. Note that this is a preview feature and might not be available in all Snowflake regions or accounts.

-- Assuming the default name of 'Neo4j_Graph_Analytics' for the application
GRANT DATABASE ROLE <database_role> TO APPLICATION Neo4j_Graph_Analytics;

Any table and view that is created in the given schema will now be accessible to the application.
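To check that the grants are in place, the standard SHOW GRANTS commands can be used. The following is a sketch that reuses the placeholders from above; depending on the current database context, the database role name may need to be qualified with the database name.

```sql
-- Privileges held by the database role on existing objects
SHOW GRANTS TO DATABASE ROLE <database_role>;

-- Future grants defined on the schema; the SELECT grants on future tables
-- and views for the database role should be listed here
SHOW FUTURE GRANTS IN SCHEMA <database_name>.<schema_name>;

-- Privileges and roles granted to the application, including the database role
SHOW GRANTS TO APPLICATION Neo4j_Graph_Analytics;
```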

Consumer roles and privileges

The application comes with two application roles: app_user and app_admin.

The app_user role provides access to all algorithm procedures and utility functions. The app_admin role provides access to manage the query warehouse, and to monitor and operate the compute pools.

In the example below, we create two new consumer roles, one for users and one for administrators. To these consumer roles we grant the corresponding application roles. The consumer roles can then be granted to users according to how they will interact with the application.

If you are not using the default application name Neo4j_Graph_Analytics, replace it with the name you used during installation. The roles within angle brackets are to be replaced with concrete names.

-- Use a role with the required privileges, like 'ACCOUNTADMIN'
USE ROLE <privileged_role>;

-- Create a consumer role for users of the Graph Analytics application
CREATE ROLE IF NOT EXISTS <consumer_user_role>;
GRANT APPLICATION ROLE Neo4j_Graph_Analytics.app_user TO ROLE <consumer_user_role>;
-- Create a consumer role for administrators of the Graph Analytics application
CREATE ROLE IF NOT EXISTS <consumer_admin_role>;
GRANT APPLICATION ROLE Neo4j_Graph_Analytics.app_admin TO ROLE <consumer_admin_role>;
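The last step, granting the consumer roles to individual users, might look like the following sketch. The placeholders <user_name> and <admin_user_name> are not part of the original example and stand for existing Snowflake users.

```sql
-- Use a role with the required privileges, like 'ACCOUNTADMIN'
USE ROLE <privileged_role>;

-- Let a regular user run algorithms
GRANT ROLE <consumer_user_role> TO USER <user_name>;
-- Let an administrator manage the warehouse and the compute pools
GRANT ROLE <consumer_admin_role> TO USER <admin_user_name>;

-- Verify that the application role has been granted to the consumer role
SHOW GRANTS TO ROLE <consumer_user_role>;
```

A user then activates the role with USE ROLE <consumer_user_role>; before calling the application procedures.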

Privileges for future tables for consumers

Many of the algorithms create new tables containing algorithm results. To have immediate access to such tables, without having to make special grants after each algorithm run, it is useful to grant privileges on future tables. The following SQL statement does that.

GRANT ALL PRIVILEGES ON FUTURE TABLES IN SCHEMA <database_name>.<schema_name> TO ROLE <consumer_user_role>;

Any table created by the application is owned by the application. The above statement does not transfer ownership of the tables. We can, however, use the GRANT OWNERSHIP command to transfer ownership of the tables to a consumer role, if so desired.

GRANT OWNERSHIP ON FUTURE TABLES IN SCHEMA <database_name>.<schema_name> TO ROLE <consumer_user_role>;

The user role is now able to operate on the tables, but also to drop them.
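If users only need to read the results, a narrower alternative to the statements above is to grant just SELECT on future tables. This is a sketch of that variant, not a setup prescribed by the application; as with all future grants, it only applies to tables created after the grant is made.

```sql
-- Read-only alternative: users can query result tables but not modify or drop them
GRANT SELECT ON FUTURE TABLES IN SCHEMA <database_name>.<schema_name> TO ROLE <consumer_user_role>;

-- List the future grants currently defined on the schema
SHOW FUTURE GRANTS IN SCHEMA <database_name>.<schema_name>;
```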

Event sharing

During the installation of the application, you are required to enable event sharing. This step is mandatory for the application to install and ensures you receive the best support experience.

For more information about event sharing, see https://other-docs.snowflake.com/en/native-apps/consumer-enable-logging

Please note that we are using the default application name Neo4j_Graph_Analytics below. If you chose a different app name during installation, please replace it with that.

To view telemetry event definitions in the application:

SHOW TELEMETRY EVENT DEFINITIONS IN APPLICATION Neo4j_Graph_Analytics;

To enable event sharing in case it was accidentally disabled:

ALTER APPLICATION Neo4j_Graph_Analytics SET AUTHORIZE_TELEMETRY_EVENT_SHARING = true;

To enable event sharing for mandatory and optional event types, for example Metrics:

ALTER APPLICATION Neo4j_Graph_Analytics SET SHARED TELEMETRY EVENTS('SNOWFLAKE$ALL');

Alternatively, review the event sharing settings in Snowsight under Data Products → Apps → Neo4j Graph Analytics by selecting the Events and logs tab.

Logging

Each job that is run in the application outputs log information. The log for a specific job can be accessed using the jobId that is returned when the job is finished. Since a job is executed in a Snowflake job service, the environment will be cleaned up by Snowflake after the job is finished. This means that the job log is only available for a limited time after the job has finished.

To access the job log, you can use the following SQL command:

CALL Neo4j_Graph_Analytics.graph.job_log('job_119ac4370ae94f1da998fe7c296a6a25');

The default log level for the application is INFO. Using the runtime configuration key, the log level can be changed when running a job.

For example, to set the log level to DEBUG for a single algorithm execution, one can use the following configuration:

CALL Neo4j_Graph_Analytics.graph.wcc('CPU_X64_M', {
    'project':  ...,
    'compute':  ...,
    'write':    ...,
    'runtime': { 'logging': { 'level': 'DEBUG' } }
});

The log level can be set to one of the following values: DEBUG, INFO, WARN, ERROR, or FATAL.