Model Catalog

This section details the model catalog operations available to manage named trained models within the Neo4j Graph Data Science library.

Some graph algorithms use trained models in their computation. A model is generally a mathematical formula representing real-world or fictitious entities. Each algorithm requiring a trained model provides the formulation and means to compute this model (see GraphSAGE train syntax).

The model catalog is a concept within the GDS library that allows storing and managing multiple trained models by name.

This chapter explains the available model catalog operations.

Name                     Description
gds.beta.model.exists    Checks if a named model is available in the catalog.
gds.beta.model.list      Prints information about models that are currently available in the catalog.
gds.beta.model.drop      Drops a named model from the catalog.
gds.alpha.model.store    Stores a named model from the catalog on disk.
gds.alpha.model.load     Loads a named and stored model from disk.
gds.alpha.model.delete   Removes a named and stored model from disk.
gds.alpha.model.publish  Makes a model accessible to all users.

Training a model is the responsibility of the corresponding algorithm and is provided by its train procedure mode. Training, using, listing, and dropping named models are management operations bound to a Neo4j user. Models trained by a different Neo4j user are not accessible unless they have been published (see Publishing models).

1. Check if a model exists in the catalog

We can check if a model is available in the catalog by looking up its name.

Check if a model exists in the catalog:
CALL gds.beta.model.exists('my-model') YIELD exists;
Table 1. Results
exists

true

2. List models available in the catalog

Once we have trained models in the catalog, we can see information about either all of them or a single model by using its name.

Listing detailed information about all models:
CALL gds.beta.model.list()
YIELD
  modelInfo,
  loaded,
  stored,
  shared
Table 2. Results
modelInfo                                           loaded  stored  shared
{modelName=my-model, modelType=example-model-type}  true    false   false

Listing detailed information about a specific model:
CALL gds.beta.model.list('my-model')
YIELD
  modelInfo,
  loaded,
  stored,
  shared
Table 3. Results
modelInfo                                           loaded  stored  shared
{modelName=my-model, modelType=example-model-type}  true    false   false

The full set of fields returned from this procedure is:

  • modelInfo: detailed information for the trained model

    • modelName: String: the saved model name.

    • modelType: String: the type of the model, e.g. GraphSAGE.

    • can also contain algorithm specific model details.

  • trainConfig: the configuration used for training the model.

  • graphSchema: the schema of the graph on which the model was trained.

  • stored: True, if the model is stored on disk.

  • loaded: True, if the model is loaded in the in-memory model catalog.

  • creationTime: the time at which the model was registered in the catalog.

  • shared: a boolean flag indicating if the model is published.

3. Removing models from the catalog

If we no longer need a trained model we can remove it from the catalog.

Remove a model from the catalog:
CALL gds.beta.model.drop('my-model')
YIELD
  modelInfo,
  loaded,
  stored,
  shared
Table 4. Results
modelInfo                                           loaded  stored  shared
{modelName=my-model, modelType=example-model-type}  true    false   false

The full set of fields returned from this procedure is:

  • modelInfo: detailed information for the trained model

    • modelName: String: the saved model name.

    • modelType: String: the type of the model, e.g. GraphSAGE.

    • can also contain algorithm specific model details.

  • trainConfig: the configuration used for training the model.

  • graphSchema: the schema of the graph on which the model was trained.

  • stored: True, if the model is stored on disk.

  • loaded: True, if the model is loaded in the in-memory model catalog.

  • creationTime: the time at which the model was registered in the catalog.

  • shared: a boolean flag indicating if the model is published.

If the model name does not exist, an error will be raised.
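Because dropping an unknown name raises an error, we may want to check for the model first. The following Python sketch models that guard with a toy in-memory catalog. The class `ToyCatalog`, its methods, the helper `drop_if_exists`, and the error message are all hypothetical stand-ins for `gds.beta.model.exists` and `gds.beta.model.drop`; none of them is part of GDS.

```python
class ToyCatalog:
    """Hypothetical in-memory stand-in for the model catalog (not GDS code)."""

    def __init__(self):
        self._models = {}

    def add(self, name, model_type):
        self._models[name] = {"modelName": name, "modelType": model_type}

    def exists(self, name):
        return name in self._models

    def drop(self, name):
        # Mirrors the documented behaviour: an unknown name is an error.
        if name not in self._models:
            raise KeyError(f"Model with name `{name}` does not exist.")
        return self._models.pop(name)


def drop_if_exists(catalog, name):
    """Drop `name` and return its info, or return None if it is absent."""
    if catalog.exists(name):
        return catalog.drop(name)
    return None


catalog = ToyCatalog()
catalog.add("my-model", "example-model-type")

dropped = drop_if_exists(catalog, "my-model")  # info of the dropped model
missing = drop_if_exists(catalog, "my-model")  # None, and no error is raised
```

The same pattern applies when calling the real procedures from a client: check `exists` first, and only call `drop` when it returns true.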

4. Storing models on disk

The model store feature is in the alpha tier.

The model catalog exists as long as the Neo4j instance is running. When Neo4j is restarted, models are no longer available in the catalog and need to be trained again. This can be prevented by storing a model on disk.

The location of the stored models can be configured via the configuration parameter gds.model.store_location in neo4j.conf. The location must be a directory that is writable by the Neo4j process.

The gds.model.store_location parameter must be configured for this feature.

4.1. Storing models from the catalog on disk

Store a model on disk:
CALL gds.alpha.model.store('my-model')
YIELD
  modelName,
  storeMillis
Results
  • modelName: The name of the stored model.

  • storeMillis: The number of milliseconds it took to store the model.

4.2. Loading models from disk

GDS will discover available models from the configured store location upon database startup. During discovery, only model metadata is loaded, not the actual model data. In order to use a stored model, it has to be explicitly loaded.

Load a model from disk:
CALL gds.alpha.model.load('my-model')
YIELD
  modelName,
  loadMillis
Results
  • modelName: The name of the loaded model.

  • loadMillis: The number of milliseconds it took to load the model.

If the model is already loaded, nothing happens. To verify if a model is loaded, we can use the gds.beta.model.list procedure. The procedure returns flags to indicate if the model is stored and if the model is loaded into memory.
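The interplay of discovery, explicit loading, and the two flags can be sketched in Python. This is a toy model of the behaviour described above, not the GDS implementation: `ToyModelStore`, its methods, and the placeholder model data are hypothetical names introduced only for illustration.

```python
class ToyModelStore:
    """Hypothetical model of the stored/loaded flags (not GDS code)."""

    def __init__(self, names_on_disk):
        # Discovery at startup: register metadata only, no model data.
        self._entries = {
            name: {"stored": True, "loaded": False, "data": None}
            for name in names_on_disk
        }

    def load(self, name):
        """Load model data; return False if it was already loaded."""
        entry = self._entries[name]
        if entry["loaded"]:
            return False  # already loaded: nothing happens
        entry["data"] = f"<model data for {name}>"  # placeholder for real data
        entry["loaded"] = True
        return True

    def flags(self, name):
        entry = self._entries[name]
        return {"stored": entry["stored"], "loaded": entry["loaded"]}


store = ToyModelStore(["my-model"])
after_discovery = store.flags("my-model")  # stored, but not yet loaded
first_load = store.load("my-model")        # performs the load
second_load = store.load("my-model")       # no-op
after_load = store.flags("my-model")       # stored and loaded
```

In GDS itself, the corresponding check is to call `gds.beta.model.list` and inspect the `stored` and `loaded` fields.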

4.3. Deleting models from disk

To remove a stored model from disk, it has to be deleted. This is different from dropping a model. Dropping a model will remove it from the in-memory model catalog, but not from disk. Deleting a model will remove it from disk, but keep it in the in-memory model catalog if it was already loaded.

Delete a model from disk:
CALL gds.alpha.model.delete('my-model')
YIELD
  modelName,
  deleteMillis
Results
  • modelName: The name of the deleted model.

  • deleteMillis: The number of milliseconds it took to delete the model.
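The difference between dropping and deleting can be made concrete with a small Python sketch. It tracks only which model names are in memory and which are on disk. `ToyCatalog` and its methods are hypothetical and merely mirror the semantics described in this section; they are not GDS code.

```python
class ToyCatalog:
    """Hypothetical model of the in-memory catalog plus disk store (not GDS code)."""

    def __init__(self):
        self._in_memory = set()  # names loaded in the in-memory catalog
        self._on_disk = set()    # names stored on disk

    def train(self, name):
        self._in_memory.add(name)

    def store(self, name):
        if name not in self._in_memory:
            raise KeyError(name)
        self._on_disk.add(name)

    def drop(self, name):
        # Dropping touches only the in-memory catalog.
        self._in_memory.remove(name)

    def delete(self, name):
        # Deleting touches only the disk store.
        self._on_disk.remove(name)

    def flags(self, name):
        return {"loaded": name in self._in_memory, "stored": name in self._on_disk}


catalog = ToyCatalog()
for name in ("model-a", "model-b"):
    catalog.train(name)
    catalog.store(name)

catalog.drop("model-a")    # gone from memory, still on disk
catalog.delete("model-b")  # gone from disk, still in memory

print(catalog.flags("model-a"))  # {'loaded': False, 'stored': True}
print(catalog.flags("model-b"))  # {'loaded': True, 'stored': False}
```

Under this model, removing a model completely takes both operations: one to clear the in-memory catalog and one to clear the disk.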

5. Publishing models

Publishing models is an alpha tier feature.

By default, a trained model is visible only to the user that created it. Making a model accessible to other users can be achieved by publishing it.
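The visibility rule can be illustrated with a short Python sketch, assuming that publishing simply sets the `shared` flag on a model. The class `ToyCatalog`, its methods, and the user names are hypothetical; the sketch models only who can see a model and says nothing about how GDS implements publishing.

```python
class ToyCatalog:
    """Hypothetical model of per-user visibility and publishing (not GDS code)."""

    def __init__(self):
        # Maps (owner, model name) to the model's shared flag.
        self._shared = {}

    def train(self, owner, name):
        # A freshly trained model is private to its owner.
        self._shared[(owner, name)] = False

    def publish(self, owner, name):
        if (owner, name) not in self._shared:
            raise KeyError(name)
        self._shared[(owner, name)] = True

    def visible_to(self, user):
        # A user sees their own models plus every published model.
        return sorted(
            name
            for (owner, name), shared in self._shared.items()
            if owner == user or shared
        )


catalog = ToyCatalog()
catalog.train("alice", "my-model")

before = catalog.visible_to("bob")  # [] because the model is private to alice
catalog.publish("alice", "my-model")
after = catalog.visible_to("bob")   # ['my-model'] once it is published
```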

5.1. Publishing a model

Publish a trained model:
CALL gds.alpha.model.publish('my-model')
YIELD
  modelInfo,
  trainConfig,
  graphSchema,
  stored,
  loaded,
  creationTime,
  shared

The full set of fields returned from this procedure is:

  • modelInfo: detailed information for the trained model

    • modelName: String: the saved model name.

    • modelType: String: the type of the model, e.g. GraphSAGE.

    • can also contain algorithm specific model details.

  • trainConfig: the configuration used for training the model.

  • graphSchema: the schema of the graph on which the model was trained.

  • stored: True, if the model is stored on disk.

  • loaded: True, if the model is loaded in the in-memory model catalog.

  • creationTime: the time at which the model was registered in the catalog.

  • shared: a boolean flag indicating if the model is published.