Catalog Endpoints

enum graphdatascience.procedure_surface.api.catalog.Aggregation(value)
Member Type:

str

Valid values are as follows:

NONE = <Aggregation.NONE: 'NONE'>
SINGLE = <Aggregation.SINGLE: 'SINGLE'>
SUM = <Aggregation.SUM: 'SUM'>
MIN = <Aggregation.MIN: 'MIN'>
MAX = <Aggregation.MAX: 'MAX'>
COUNT = <Aggregation.COUNT: 'COUNT'>

The Enum and its members also have the following methods:

__new__(value)
class graphdatascience.procedure_surface.api.catalog.CatalogEndpoints
abstract construct(graph_name: str, nodes: DataFrame | list[DataFrame], relationships: DataFrame | list[DataFrame] | None = None, concurrency: int | None = None, undirected_relationship_types: list[str] | None = None) GraphV2

Construct a graph from a list of node and relationship dataframes.

Parameters:
  • graph_name (str) – Name of the graph to construct

  • nodes (DataFrame | list[DataFrame]) –

    Node dataframes. A dataframe should follow the schema:

    • nodeId to uniquely identify the node across all dataframes

    • labels to specify the labels of the node as a list of strings (optional)

    • other columns are treated as node properties

  • relationships (DataFrame | list[DataFrame] | None) –

    Relationship dataframes. A dataframe should follow the schema:

    • sourceNodeId to identify the start node of the relationship

    • targetNodeId to identify the end node of the relationship

    • relationshipType to specify the type of the relationship (optional)

    • other columns are treated as relationship properties

  • concurrency (int | None) – Number of concurrent threads to use.

  • undirected_relationship_types (list[str] | None) – List of relationship types to treat as undirected.

Returns:

Constructed graph object.

Return type:

GraphV2
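
The column layout described above can be checked before any dataframes are built. The following is a minimal stdlib-only sketch that uses plain lists of dicts in place of DataFrames. `validate_rows` is a hypothetical helper, not part of the library, and the rule that relationship endpoints must match an existing nodeId is an assumption.

```python
def validate_rows(node_rows, relationship_rows):
    """Check plain-dict rows against the documented construct() column layout."""
    node_ids = [row["nodeId"] for row in node_rows]  # nodeId is the only required node column
    if len(node_ids) != len(set(node_ids)):
        raise ValueError("nodeId values must be unique across all node dataframes")
    known = set(node_ids)
    for row in relationship_rows:
        # sourceNodeId and targetNodeId are required; relationshipType is optional.
        for column in ("sourceNodeId", "targetNodeId"):
            if row[column] not in known:
                raise ValueError(f"{column}={row[column]!r} does not match any nodeId")
    return len(known), len(relationship_rows)


nodes = [
    {"nodeId": 0, "labels": ["Person"], "age": 34},  # "age" becomes a node property
    {"nodeId": 1, "labels": ["Person"], "age": 27},
]
relationships = [
    # "since" becomes a relationship property
    {"sourceNodeId": 0, "targetNodeId": 1, "relationshipType": "KNOWS", "since": 2019},
]
print(validate_rows(nodes, relationships))  # → (2, 1)
```

Rows shaped like this can be turned into dataframes (for example with `pandas.DataFrame(nodes)`, assuming DataFrame here refers to pandas) and passed as the nodes and relationships arguments.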

property datasets: DatasetEndpoints

Endpoints for loading predefined datasets into the graph catalog.

abstract drop(G: GraphV2 | str, fail_if_missing: bool = True) GraphInfo | None

Drop a graph from the graph catalog.

Parameters:
  • G (GraphV2 | str) – Graph to drop, given by name or as a graph object.

  • fail_if_missing (bool) – Whether to fail if the graph is missing

Returns:

GraphV2 metadata object containing information like node count.

Return type:

GraphInfo | None
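
A minimal sketch of an idempotent drop, assuming `catalog` is some concrete CatalogEndpoints implementation obtained elsewhere and that a missing graph yields None when fail_if_missing is False (the signature allows None, but the source does not spell out when it is returned). `drop_if_present` is a hypothetical helper name.

```python
def drop_if_present(catalog, graph_name):
    """Drop a graph without raising when it is absent; return True if it was dropped."""
    # Assumption: with fail_if_missing=False, a missing graph results in None.
    info = catalog.drop(graph_name, fail_if_missing=False)
    return info is not None
```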

abstract filter(G: GraphV2, graph_name: str, node_filter: str, relationship_filter: str, concurrency: int | None = None, job_id: str | None = None) GraphWithFilterResult

Create a subgraph of a graph based on a filter expression.

Parameters:
  • G (GraphV2) – Graph object to use

  • graph_name (str) – Name of subgraph to create

  • node_filter (str) – Filter expression for nodes

  • relationship_filter (str) – Filter expression for relationships

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Tuple of the filtered graph object and a result object with information such as graph name, node count, and relationship count.

Return type:

GraphWithFilterResult
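
Because the return value is a (graph, result) tuple, it can be unpacked directly. The sketch below assumes `catalog` is a concrete CatalogEndpoints implementation; `filter_subgraph` is a hypothetical wrapper, and it reads the node_count and relationship_count fields documented on GraphFilterResult.

```python
def filter_subgraph(catalog, G, graph_name, node_filter, relationship_filter):
    """Create a subgraph and return it together with its dimensions."""
    # GraphWithFilterResult unpacks into (graph, result).
    graph, result = catalog.filter(G, graph_name, node_filter, relationship_filter)
    return graph, result.node_count, result.relationship_count
```

The exact filter expression syntax is not described here. Predicates such as `"n.age > 18"` for nodes are an assumption based on the Cypher-style predicates used elsewhere in this API.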

abstract generate(graph_name: str, node_count: int, average_degree: float, *, relationship_distribution: str | None = None, relationship_seed: int | None = None, relationship_property: RelationshipPropertySpec | None = None, orientation: str | None = None, allow_self_loops: bool | None = None, read_concurrency: int | None = None, job_id: str | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None) GraphWithGenerationStats

Generates a random graph and stores it in the graph catalog.

Parameters:
  • graph_name (str) – Name of the generated graph.

  • node_count (int) – The number of nodes in the generated graph

  • average_degree (float) – The average out-degree of the generated nodes

  • relationship_distribution (str | None, default=None) – Determines the relationship distribution strategy.

  • relationship_seed (int | None, default=None) – Seed value for generating deterministic relationships.

  • relationship_property (RelationshipPropertySpec | None, default=None) – Configure generated relationship properties.

  • orientation (str | None, default=None) – Specifies the orientation of the generated relationships.

  • allow_self_loops (bool | None, default=None) – Whether nodes in the graph can have relationships where start and end nodes are the same.

  • read_concurrency (int | None, default=None) – Number of concurrent threads/processes to use during graph generation.

  • job_id (str | None) – Identifier for the computation.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

tuple of the generated graph object and the result object containing stats about the generation.

Return type:

GraphWithGenerationStats
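
A minimal sketch of a reproducible generation call, assuming `catalog` is a concrete CatalogEndpoints implementation. `generate_reproducible` is a hypothetical wrapper; it relies only on the documented signature and on the nodes and relationships fields of GraphGenerationStats.

```python
def generate_reproducible(catalog, graph_name, node_count, average_degree, seed):
    """Generate a random graph with a fixed seed and report its size."""
    # Everything after average_degree is keyword-only in the documented signature.
    graph, stats = catalog.generate(
        graph_name,
        node_count,
        average_degree,
        relationship_seed=seed,
        allow_self_loops=False,
    )
    return graph, stats.nodes, stats.relationships
```

Since average_degree is the average out-degree, the relationship count should be roughly node_count × average_degree, although the exact number depends on the distribution strategy.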

abstract get(graph_name: str) GraphV2

Retrieve a handle to a graph from the graph catalog.

Parameters:

graph_name (str) – The name of the graph.

Returns:

A handle to the graph.

Return type:

GraphV2

abstract list(G: GraphV2 | str | None = None) list[GraphInfoWithDegrees]

List graphs in the graph catalog.

Parameters:

G (GraphV2 | str | None) – GraphV2 object or name to filter results. If None, list all graphs.

Returns:

List of graph metadata objects containing information like node count.

Return type:

list[GraphInfoWithDegrees]
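
The returned metadata objects can be post-processed like any Python list. This sketch assumes `catalog` is a concrete CatalogEndpoints implementation and that GraphInfoWithDegrees carries the graph_name and node_count fields documented on GraphInfo; `largest_graphs` is a hypothetical helper.

```python
def largest_graphs(catalog, limit=3):
    """Return (graph_name, node_count) for the graphs with the most nodes."""
    infos = catalog.list()  # no argument: list every graph in the catalog
    ranked = sorted(infos, key=lambda info: info.node_count, reverse=True)
    return [(info.graph_name, info.node_count) for info in ranked[:limit]]
```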

abstract property node_labels: NodeLabelEndpoints

Endpoints for node label operations.

abstract property node_properties: NodePropertiesEndpoints

Endpoints for node property operations.

abstract property relationships: RelationshipsEndpoints

Endpoints for relationship operations.

abstract property sample: GraphSamplingEndpoints

Endpoints for graph sampling.

class graphdatascience.procedure_surface.api.catalog.GraphBackend
pydantic model graphdatascience.procedure_surface.api.catalog.GraphFilterResult
field from_graph_name: str
field graph_name: str
field node_count: int
field node_filter: str
field project_millis: int
field relationship_count: int
field relationship_filter: str
pydantic model graphdatascience.procedure_surface.api.catalog.GraphGenerationStats
Validators:
  • check_empty_property » relationship_property

field average_degree: float
field generate_millis: int
field name: str
field nodes: int
field relationship_distribution: str
field relationship_property: RelationshipPropertySpec | None
Validated by:
  • check_empty_property

field relationship_seed: int | None
field relationships: int
validator check_empty_property  »  relationship_property
Parameters:

value (Any)

Return type:

Any

pydantic model graphdatascience.procedure_surface.api.catalog.GraphInfo
Validators:
  • strip_timezone » creation_time

  • strip_timezone » modification_time

field configuration: dict[str, Any]
field creation_time: datetime
Validated by:
  • strip_timezone

field database: str
field database_location: str
field density: float
field graph_name: str
field graph_schema: dict[str, Any]
field memory_usage: str | None
field modification_time: datetime
Validated by:
  • strip_timezone

field node_count: int
field relationship_count: int
field size_in_bytes: int
validator strip_timezone  »  creation_time, modification_time
Parameters:

value (Any)

Return type:

Any

pydantic model graphdatascience.procedure_surface.api.catalog.GraphInfoWithDegrees
field degree_distribution: dict[str, float | int]
class graphdatascience.procedure_surface.api.catalog.GraphSamplingEndpoints
abstract cnarw(G: GraphV2, graph_name: str, start_nodes: list[int] | None = None, restart_probability: float = 0.1, sampling_ratio: float = 0.15, node_label_stratification: bool = False, relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) GraphWithSamplingResult

Common Neighbour Aware Random Walk (CNARW) samples the graph by taking random walks from a set of start nodes.

CNARW is a graph sampling technique that involves optimizing the selection of the next-hop node. It takes into account the number of common neighbours between the current node and the next-hop candidates. On each step of a random walk, there is a probability that the walk stops, and a new walk from one of the start nodes starts instead (i.e. the walk restarts). Each node visited on these walks will be part of the sampled subgraph. The resulting subgraph is stored as a new graph in the Graph Catalog.

Parameters:
  • G (GraphV2) – Graph object to use

  • graph_name (str) – The name of the new graph that is stored in the graph catalog.

  • start_nodes (list of int, optional) –

    IDs of the initial set of nodes in the original graph from which the sampling random walks will start.

    By default, a single node is chosen uniformly at random.

  • restart_probability (float, optional) – The probability that a sampling random walk restarts from one of the start nodes. Default is 0.1.

  • sampling_ratio (float, optional) – The fraction of nodes in the original graph to be sampled. Default is 0.15.

  • node_label_stratification (bool, optional) – If true, preserves the node label distribution of the original graph. Default is False.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

tuple of the graph object and the result of the Common Neighbour Aware Random Walk (CNARW), including the dimensions of the sampled graph.

Return type:

GraphWithSamplingResult

abstract estimate(G: GraphV2, start_nodes: list[int] | None = None, restart_probability: float = 0.1, sampling_ratio: float = 0.15, node_label_stratification: bool = False, relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of a CNARW algorithm run.

Parameters:
  • G (GraphV2) – Graph object to use

  • start_nodes (list of int, optional) – IDs of the initial set of nodes in the original graph from which the sampling random walks will start. By default, a single node is chosen uniformly at random.

  • restart_probability (float, optional) – The probability that a sampling random walk restarts from one of the start nodes. Default is 0.1.

  • sampling_ratio (float, optional) – The fraction of nodes in the original graph to be sampled. Default is 0.15.

  • node_label_stratification (bool, optional) – If true, preserves the node label distribution of the original graph. Default is False.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract rwr(G: GraphV2, graph_name: str, start_nodes: list[int] | None = None, restart_probability: float = 0.1, sampling_ratio: float = 0.15, node_label_stratification: bool = False, relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) GraphWithSamplingResult

Random walk with restarts (RWR) samples the graph by taking random walks from a set of start nodes.

On each step of a random walk, there is a probability that the walk stops, and a new walk from one of the start nodes starts instead (i.e. the walk restarts). Each node visited on these walks will be part of the sampled subgraph. The resulting subgraph is stored as a new graph in the Graph Catalog.

Parameters:
  • G (GraphV2) – Graph object to use

  • graph_name (str) – The name of the new graph that is stored in the graph catalog.

  • start_nodes (list of int, optional) – IDs of the initial set of nodes in the original graph from which the sampling random walks will start. By default, a single node is chosen uniformly at random.

  • restart_probability (float, optional) – The probability that a sampling random walk restarts from one of the start nodes. Default is 0.1.

  • sampling_ratio (float, optional) – The fraction of nodes in the original graph to be sampled. Default is 0.15.

  • node_label_stratification (bool, optional) – If true, preserves the node label distribution of the original graph. Default is False.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

tuple of the graph object and the result of the Random Walk with Restart (RWR), including the dimensions of the sampled graph.

Return type:

GraphWithSamplingResult
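
To make the sampling description concrete, here is a toy pure-Python version of a random walk with restarts over an adjacency dict. It is an illustration of the idea only, not the library's implementation: the rounding of the target size, the `seed` and `max_steps` parameters, and the function name `rwr_sample` are all assumptions.

```python
import math
import random


def rwr_sample(adjacency, start_nodes, restart_probability=0.1,
               sampling_ratio=0.15, seed=0, max_steps=100_000):
    """Collect visited nodes until sampling_ratio of all nodes has been sampled."""
    rng = random.Random(seed)
    target = max(1, math.ceil(sampling_ratio * len(adjacency)))
    current = rng.choice(start_nodes)
    sampled = {current}
    for _ in range(max_steps):  # guard against graphs where the target is unreachable
        if len(sampled) >= target:
            break
        neighbours = adjacency[current]
        if not neighbours or rng.random() < restart_probability:
            current = rng.choice(start_nodes)  # the walk restarts
        else:
            current = rng.choice(neighbours)   # the walk takes one step
        sampled.add(current)
    return sampled


# A ring of 20 nodes; 25% of them should end up in the sample.
ring = {i: [(i - 1) % 20, (i + 1) % 20] for i in range(20)}
sample = rwr_sample(ring, start_nodes=[0], sampling_ratio=0.25)
print(len(sample))
```

CNARW follows the same loop but, as described for cnarw, weights the choice of the next hop by the number of common neighbours; that refinement is not shown in this sketch.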

pydantic model graphdatascience.procedure_surface.api.catalog.GraphSamplingResult
field from_graph_name: str
field graph_name: str
field node_count: int
field project_millis: int
field relationship_count: int
field start_node_count: int
class graphdatascience.procedure_surface.api.catalog.GraphWithFilterResult

GraphWithFilterResult(graph, result)

static __new__(_cls, graph: GraphV2, result: GraphFilterResult)

Create new instance of GraphWithFilterResult(graph, result)

Parameters:
graph: GraphV2

Alias for field number 0

result: GraphFilterResult

Alias for field number 1

class graphdatascience.procedure_surface.api.catalog.GraphWithGenerationStats

GraphWithGenerationStats(graph, result)

static __new__(_cls, graph: GraphV2, result: GraphGenerationStats)

Create new instance of GraphWithGenerationStats(graph, result)

Parameters:
graph: GraphV2

Alias for field number 0

result: GraphGenerationStats

Alias for field number 1

class graphdatascience.procedure_surface.api.catalog.GraphWithSamplingResult

GraphWithSamplingResult(graph, result)

static __new__(_cls, graph: GraphV2, result: GraphSamplingResult)

Create new instance of GraphWithSamplingResult(graph, result)

Parameters:
graph: GraphV2

Alias for field number 0

result: GraphSamplingResult

Alias for field number 1

class graphdatascience.procedure_surface.api.catalog.NodeLabelEndpoints
abstract mutate(G: GraphV2, node_label: str, *, node_filter: str, sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, write_concurrency: int | None = None, job_id: str | None = None) NodeLabelMutateResult

Attaches the specified node label to the filtered nodes in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_label (str) – The node label to write back.

  • node_filter (str) – A Cypher predicate for filtering nodes in the input graph.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

  • job_id (str | None) – Identifier for the computation.

Returns:

Execution metrics and statistics

Return type:

NodeLabelMutateResult

abstract write(G: GraphV2, node_label: str, *, node_filter: str, sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, write_concurrency: int | None = None, job_id: str | None = None) NodeLabelWriteResult

Writes the specified node label to the filtered nodes in the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_label (str) – The node label to write back.

  • node_filter (str) – A Cypher predicate for filtering nodes in the input graph.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

  • job_id (str | None) – Identifier for the computation.

Returns:

Execution metrics and statistics

Return type:

NodeLabelWriteResult

pydantic model graphdatascience.procedure_surface.api.catalog.NodeLabelMutateResult
field mutate_millis: int
pydantic model graphdatascience.procedure_surface.api.catalog.NodeLabelPersistenceResult
field configuration: dict[str, object]
field graph_name: str
field node_count: int
field node_label: str
field node_labels_written: int
pydantic model graphdatascience.procedure_surface.api.catalog.NodeLabelWriteResult
field write_millis: int
pydantic model graphdatascience.procedure_surface.api.catalog.NodePropertiesDropResult
field graph_name: str
field node_properties: list[str]
field properties_removed: int
class graphdatascience.procedure_surface.api.catalog.NodePropertiesEndpoints
abstract drop(G: GraphV2, node_properties: list[str], *, fail_if_missing: bool | None = None, concurrency: int | None = None, username: str | None = None) NodePropertiesDropResult

Drops the specified node properties from the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_properties (list[str]) – The node properties to drop

  • fail_if_missing (bool | None, default=None) – Whether to fail if any of the node properties are missing

  • concurrency (int | None) – Number of concurrent threads to use.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Execution metrics and statistics

Return type:

NodePropertiesDropResult

abstract stream(G: GraphV2, node_properties: str | list[str], *, list_node_labels: bool | None = None, node_labels: list[str] = ['*'], concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, job_id: str | None = None, db_node_properties: list[str] | None = None) DataFrame

Streams the specified node properties from the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_properties (str | list[str]) – The node properties to stream

  • list_node_labels (bool | None, default=None) – Whether to include node labels in the stream

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • job_id (str | None) – Identifier for the computation.

  • db_node_properties (list[str] | None, default=None) – Retrieves additional node properties from the database and attaches them to the stream.

Returns:

The streamed node properties

Return type:

DataFrame

abstract write(G: GraphV2, node_properties: str | list[str] | dict[str, str], *, node_labels: list[str] = ['*'], concurrency: int | None = None, write_concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, job_id: str | None = None) NodePropertiesWriteResult

Writes the specified node properties from the graph to the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_properties (str | list[str] | dict[str, str]) –

    Node properties to write. Can be:

    • A string representing a single property name.

    • A list of strings representing multiple property names.

    • A dictionary mapping from property names in the GDS graph to property names in the database.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • job_id (str | None) – Identifier for the computation.

Returns:

Execution metrics and statistics

Return type:

NodePropertiesWriteResult
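
The three accepted shapes of node_properties can be viewed as one mapping from graph property names to database property names. The sketch below illustrates that reading; `to_property_mapping` is a hypothetical helper, and the assumption is that the string and list forms keep the same name in the database.

```python
def to_property_mapping(node_properties):
    """Normalise str | list[str] | dict[str, str] into {graph_name: database_name}."""
    if isinstance(node_properties, str):
        return {node_properties: node_properties}
    if isinstance(node_properties, dict):
        return dict(node_properties)
    return {name: name for name in node_properties}


print(to_property_mapping("pagerank"))              # → {'pagerank': 'pagerank'}
print(to_property_mapping({"pagerank": "pr_2024"}))  # → {'pagerank': 'pr_2024'}
```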

pydantic model graphdatascience.procedure_surface.api.catalog.NodePropertiesWriteResult
field configuration: dict[str, Any]
field graph_name: str
field node_properties: list[str]
field properties_written: int
field write_millis: int
class graphdatascience.procedure_surface.api.catalog.NodePropertySpec

NodePropertySpec(node_properties: 'str | list[str] | dict[str, str]') -> 'None'

pydantic model graphdatascience.procedure_surface.api.catalog.RelationshipPropertySpec
field max: float | None
field min: float | None
field name: str
field type: str
field value: float | None
static fixed(name: str, value: float) RelationshipPropertySpec
Parameters:
  • name (str)

  • value (float)

Return type:

RelationshipPropertySpec

static random(name: str, min: float, max: float) RelationshipPropertySpec
Parameters:
  • name (str)

  • min (float)

  • max (float)

Return type:

RelationshipPropertySpec

pydantic model graphdatascience.procedure_surface.api.catalog.RelationshipsDropResult
field deleted_properties: dict[str, int]
field deleted_relationships: int
field graph_name: str
field relationship_type: str
class graphdatascience.procedure_surface.api.catalog.RelationshipsEndpoints
abstract collapse_path(G: GraphV2, path_templates: list[list[str]], mutate_relationship_type: str, *, allow_self_loops: bool = False, concurrency: int | None = None, job_id: str | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None) CollapsePathResult

Collapse each existing path in the graph into a single relationship.

Parameters:
  • G (GraphV2) – Graph object to use

  • path_templates (list[list[str]]) – A path template is an ordered list of relationship types used for the traversal. The same relationship type can appear multiple times so that it is traversed repeatedly, in the order given. Several path templates can be specified to process them in one go.

  • mutate_relationship_type (str) – Name of the relationship type to store the results in.

  • allow_self_loops (bool, default=False) – Whether nodes in the graph can have relationships where start and end nodes are the same.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Metadata about the generated relationships.

Return type:

CollapsePathResult
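
What a path template does can be shown with a toy pure-Python traversal. This is a sketch of the idea, not the library's implementation: `collapse_paths` is a hypothetical function, relationships are given as a dict of (source, target) pairs per type, and duplicate start/end pairs are merged by assumption.

```python
from collections import defaultdict


def collapse_paths(relationships, path_template, allow_self_loops=False):
    """Return sorted (start, end) pairs for every path that follows path_template."""
    # Every source node of the first relationship type is a potential path start.
    pairs = {(source, source) for source, _ in relationships.get(path_template[0], [])}
    for rel_type in path_template:
        adjacency = defaultdict(set)
        for source, target in relationships.get(rel_type, []):
            adjacency[source].add(target)
        # Advance each partial path by one hop of the current relationship type.
        pairs = {(start, nxt) for start, current in pairs for nxt in adjacency.get(current, ())}
    if not allow_self_loops:
        pairs = {(start, end) for start, end in pairs if start != end}
    return sorted(pairs)


rels = {
    "BOUGHT": [("alice", "book"), ("bob", "book")],
    "BOUGHT_BY": [("book", "alice"), ("book", "bob")],
}
print(collapse_paths(rels, ["BOUGHT", "BOUGHT_BY"]))
# → [('alice', 'bob'), ('bob', 'alice')]
```

With allow_self_loops=True the pairs ('alice', 'alice') and ('bob', 'bob') would also be kept, since each buyer reaches themselves through the shared book.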

abstract drop(G: GraphV2, relationship_type: str, *, fail_if_missing: bool = True) RelationshipsDropResult

Drops all relationships of the specified relationship type, including all their properties, from the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_type (str) – The relationship type to drop

  • fail_if_missing (bool, default=True) – If set to true, the procedure will fail if the relationship type does not exist in the graph.

Returns:

Execution metrics and statistics

Return type:

RelationshipsDropResult

abstract index_inverse(G: GraphV2, relationship_types: list[str], *, concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, job_id: str | None = None) RelationshipsInverseIndexResult

Creates an index of the specified relationships indexing the reverse direction of each relationship. This index can be used by some algorithms to speed up the computation.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • job_id (str | None) – Identifier for the computation.

Returns:

Execution metrics and statistics

Return type:

RelationshipsInverseIndexResult
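
Conceptually, an inverse index answers "which nodes point at this node?" without scanning every relationship. The toy sketch below shows that idea with a plain dict; `build_inverse_index` is a hypothetical function and says nothing about how the library stores the index internally.

```python
from collections import defaultdict


def build_inverse_index(relationships):
    """Map each target node to the sorted list of its source nodes."""
    inverse = defaultdict(list)
    for source, target in relationships:
        inverse[target].append(source)
    return {target: sorted(sources) for target, sources in inverse.items()}


print(build_inverse_index([(0, 1), (2, 1), (1, 3)]))  # → {1: [0, 2], 3: [1]}
```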

abstract stream(G: GraphV2, relationship_types: list[str] = ['*'], relationship_properties: list[str] | None = None, *, concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None) DataFrame

Streams all relationships of the specified types with the specified properties.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_properties (list[str] | None, default = None) – The relationship properties to stream. If not specified, no properties will be streamed.

  • concurrency (int | None) – Number of concurrent threads to use.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

The streamed relationships [sourceId, targetId, relationshipType] with a column for each property

Return type:

DataFrame

abstract to_undirected(G: GraphV2, relationship_type: str, mutate_relationship_type: str, *, aggregation: Aggregation | dict[str, Aggregation] | None = None, concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, job_id: str | None = None) RelationshipsToUndirectedResult

Creates a new relationship type in the graph. The new relationships are based on an existing relationship type but are stored undirected.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_type (str) – The input relationship type

  • mutate_relationship_type (str) – Name of the relationship type to store the results in.

  • aggregation (Aggregation | dict[str, Aggregation] | None, default=None) – Specifies how to aggregate parallel relationships in the graph. If a single aggregation is provided, it will be used for properties of the specified relationships. A dictionary can be provided to specify property specific aggregations.

  • concurrency (int | None) – Number of concurrent threads to use.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • job_id (str | None) – Identifier for the computation.

Returns:

Execution metrics and statistics

Return type:

RelationshipsToUndirectedResult
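
When directed relationships become undirected, a → b and b → a end up as parallel relationships between the same pair of nodes, which is where aggregation matters. The toy sketch below illustrates this for one weight property, using the Aggregation member names as plain strings. `to_undirected` here is a hypothetical stand-alone function, and treating SINGLE as "keep the first value seen" is an assumption.

```python
def to_undirected(relationships, aggregation="NONE"):
    """relationships: list of (source, target, weight); returns sorted (a, b, weight)."""
    combine = {
        "SUM": sum,
        "MIN": min,
        "MAX": max,
        "COUNT": len,
        "SINGLE": lambda weights: weights[0],  # assumption: keep the first value seen
    }
    grouped = {}
    for source, target, weight in relationships:
        key = (min(source, target), max(source, target))  # direction no longer matters
        grouped.setdefault(key, []).append(weight)
    if aggregation == "NONE":
        # Parallel relationships are kept as they are.
        return sorted((a, b, w) for (a, b), weights in grouped.items() for w in weights)
    return sorted((a, b, combine[aggregation](weights)) for (a, b), weights in grouped.items())


rels = [("a", "b", 1.0), ("b", "a", 2.0), ("b", "c", 5.0)]
print(to_undirected(rels, "SUM"))  # → [('a', 'b', 3.0), ('b', 'c', 5.0)]
```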

abstract write(G: GraphV2, relationship_type: str, relationship_properties: list[str] | None = None, *, concurrency: int | None = None, write_concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, job_id: str | None = None) RelationshipsWriteResult

Writes all relationships of the specified relationship type with the specified properties from the graph to the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_type (str) – The relationship type to write to the database

  • relationship_properties (list[str] | None, default = None) – The relationship properties to write. If not specified, no properties will be written.

  • concurrency (int | None) – Number of concurrent threads to use.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • job_id (str | None) – Identifier for the computation.

Returns:

Execution metrics and statistics

Return type:

RelationshipsWriteResult

pydantic model graphdatascience.procedure_surface.api.catalog.RelationshipsInverseIndexResult
field compute_millis: int
field configuration: dict[str, Any]
field input_relationships: int
field mutate_millis: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.catalog.RelationshipsToUndirectedResult
field relationships_written: int
pydantic model graphdatascience.procedure_surface.api.catalog.RelationshipsWriteResult
Validators:
  • coerce_relationship_properties » relationship_properties

field configuration: dict[str, Any]
field graph_name: str
field properties_written: int
field relationship_properties: list[str]
Validated by:
  • coerce_relationship_properties

field relationship_type: str
field relationships_written: int
field write_millis: int
validator coerce_relationship_properties  »  relationship_properties
Parameters:

v (Any)

Return type:

list[str]

class graphdatascience.procedure_surface.api.catalog.ScalePropertiesEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], node_properties: list[str], scaler: str | dict[str, str | int | float] | ScalerConfig, node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • node_properties (list[str]) – The node properties to scale. Can be a list of property names or a dictionary mapping property names to configurations.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, node_properties: list[str], scaler: str | dict[str, str | int | float] | ScalerConfig, node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ScalePropertiesMutateResult

Runs the Scale Properties algorithm and stores the results in the graph catalog as a new node property.

Scale Properties scales node properties using a specified scaler (e.g., MinMax, Mean, Max, Log, StdScore, Center).

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • node_properties (list[str]) – The node properties to scale. Can be a list of property names or a dictionary mapping property names to configurations.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the scaler statistics

Return type:

ScalePropertiesMutateResult
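To make the scaler names concrete, the following is a minimal plain-Python sketch of the transformations they conventionally denote. It is not the library's implementation: the function name scale is hypothetical, the population standard deviation is assumed for 'StdScore', and returning 0.0 for constant input is a choice made for this sketch only.

```python
import math
from statistics import fmean, pstdev


def scale(values, scaler, offset=0.0):
    """Illustrative stand-in (hypothetical helper, not part of graphdatascience)."""
    lo, hi = min(values), max(values)
    avg = fmean(values)
    span = hi - lo
    if scaler == "MinMax":
        # Map values linearly onto [0, 1].
        return [(v - lo) / span if span else 0.0 for v in values]
    if scaler == "Mean":
        # Subtract the mean, then divide by the value range.
        return [(v - avg) / span if span else 0.0 for v in values]
    if scaler == "Max":
        # Divide by the largest absolute value.
        abs_max = max(abs(v) for v in values)
        return [v / abs_max if abs_max else 0.0 for v in values]
    if scaler == "Log":
        # Add the offset before taking the natural logarithm.
        return [math.log(v + offset) for v in values]
    if scaler == "StdScore":
        # Assumption: population standard deviation.
        std = pstdev(values)
        return [(v - avg) / std if std else 0.0 for v in values]
    if scaler == "Center":
        # Subtract the mean only.
        return [v - avg for v in values]
    if scaler == "NONE":
        # Leave values unchanged.
        return list(values)
    raise ValueError(f"unknown scaler: {scaler!r}")


ages = [1.0, 2.0, 3.0, 4.0, 5.0]
min_max = scale(ages, "MinMax")
centered = scale(ages, "Center")
```

The offset argument mirrors the offset field of the dictionary form of the scaler and only affects the 'Log' branch.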

abstract stats(G: GraphV2, node_properties: list[str], scaler: str | dict[str, str | int | float] | ScalerConfig, node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ScalePropertiesStatsResult

Runs the Scale Properties algorithm and returns result statistics without storing the results.

Scale Properties scales node properties using a specified scaler (e.g., MinMax, Mean, Max, Log, StdScore, Center).

Parameters:
  • G (GraphV2) – Graph object to use

  • node_properties (list[str]) – The node properties to scale. Can be a list of property names or a dictionary mapping property names to configurations.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the scaler statistics

Return type:

ScalePropertiesStatsResult

abstract stream(G: GraphV2, node_properties: list[str], scaler: str | dict[str, str | int | float] | ScalerConfig, node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the Scale Properties algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_properties (list[str]) – The node properties to scale. Can be a list of property names or a dictionary mapping property names to configurations.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with nodeId and scaledProperty columns. Each row holds one node and its scaled property values.

Return type:

DataFrame
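As a sketch of how streamed rows might be consumed, the helper below splits each scaledProperty value into one entry per input property. It works on plain dictionaries so that it runs without a database. The helper name unpack_scaled is hypothetical, and it assumes that scaledProperty holds a list with one number per scalar input property, in the order given by node_properties.

```python
def unpack_scaled(rows, node_properties):
    """Hypothetical helper: label each scaled value with its property name."""
    unpacked = []
    for row in rows:
        scaled = row["scaledProperty"]
        if len(scaled) != len(node_properties):
            # Assumption violated, e.g. an array-valued input property.
            raise ValueError(f"node {row['nodeId']}: unexpected number of scaled values")
        record = {"nodeId": row["nodeId"]}
        record.update(zip(node_properties, scaled))
        unpacked.append(record)
    return unpacked


# Rows shaped like the documented result: one nodeId and one scaledProperty per node.
rows = [
    {"nodeId": 0, "scaledProperty": [0.0, 1.0]},
    {"nodeId": 1, "scaledProperty": [0.5, 0.25]},
]
records = unpack_scaled(rows, ["age", "income"])
```

With a pandas DataFrame, rows of this shape can be obtained with to_dict("records").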

abstract write(G: GraphV2, write_property: str, node_properties: list[str], scaler: str | dict[str, str | int | float] | ScalerConfig, node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) ScalePropertiesWriteResult

Runs the Scale Properties algorithm and stores the result in the Neo4j database as a new node property.

Scale Properties scales node properties using a specified scaler (e.g., MinMax, Mean, Max, Log, StdScore, Center).

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • node_properties (list[str]) – The node properties to scale. Can be a list of property names or a dictionary mapping property names to configurations.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the scaler statistics and write timing

Return type:

ScalePropertiesWriteResult

pydantic model graphdatascience.procedure_surface.api.catalog.ScalePropertiesMutateResult

Result of running Scale Properties algorithm with mutate mode.

field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field scaler_statistics: dict[str, Any]
pydantic model graphdatascience.procedure_surface.api.catalog.ScalePropertiesStatsResult

Result of running Scale Properties algorithm with stats mode.

field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
field scaler_statistics: dict[str, Any]
pydantic model graphdatascience.procedure_surface.api.catalog.ScalePropertiesWriteResult

Result of running Scale Properties algorithm with write mode.

field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field scaler_statistics: dict[str, Any]
field write_millis: int
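
The timing fields of the result models can be combined into a simple breakdown. The sketch below uses a dataclass stand-in carrying only the four timing fields of ScalePropertiesWriteResult. The names WriteTimings and timing_shares are hypothetical, and treating the four fields as disjoint phases is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class WriteTimings:
    """Hypothetical stand-in with the timing fields of a write result."""
    pre_processing_millis: int
    compute_millis: int
    post_processing_millis: int
    write_millis: int


def timing_shares(timings):
    """Return each phase's share of the summed phase time (0.0 if the sum is zero)."""
    phases = {
        "pre_processing": timings.pre_processing_millis,
        "compute": timings.compute_millis,
        "post_processing": timings.post_processing_millis,
        "write": timings.write_millis,
    }
    total = sum(phases.values())
    return {name: (ms / total if total else 0.0) for name, ms in phases.items()}


shares = timing_shares(WriteTimings(10, 70, 5, 15))
```
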
pydantic model graphdatascience.procedure_surface.api.catalog.ScalerConfig
field offset: int | float | None

The offset to add to the property values before applying the log transformation. Only used when type is ‘Log’.

field type: str

The type of scaler to use. Can be ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’.
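
The scaler parameter accepts a string, a dictionary, or a ScalerConfig. The sketch below shows one way these three forms could be reduced to a single dictionary shape. It uses a dataclass stand-in instead of the pydantic model so that it runs without the library. The names ScalerConfigSketch and normalize_scaler are hypothetical, and dropping an offset for non-'Log' scalers is a choice of this sketch, not documented behaviour.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class ScalerConfigSketch:
    """Hypothetical stand-in mirroring the documented fields of ScalerConfig."""
    type: str
    offset: int | float | None = None


def normalize_scaler(scaler: str | dict[str, Any] | ScalerConfigSketch) -> dict[str, Any]:
    """Reduce the three accepted forms to a plain dictionary with a 'type' key."""
    if isinstance(scaler, str):
        return {"type": scaler}
    if isinstance(scaler, ScalerConfigSketch):
        scaler = {"type": scaler.type, "offset": scaler.offset}
    if "type" not in scaler:
        raise ValueError("scaler configuration needs a 'type' entry")
    # Drop unset entries so the result stays minimal.
    config = {key: value for key, value in scaler.items() if value is not None}
    # The offset is only used when type is 'Log'; this sketch discards it otherwise.
    if config["type"] != "Log":
        config.pop("offset", None)
    return config


from_string = normalize_scaler("MinMax")
from_dict = normalize_scaler({"type": "Log", "offset": 1.0})
from_model = normalize_scaler(ScalerConfigSketch(type="Log", offset=1.0))
```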

class graphdatascience.graph.v2.graph_api.GraphV2

A graph object that represents a graph in the graph catalog. It can be passed into algorithm endpoints to compute over the corresponding graph. It contains summary information about the graph.

configuration() dict[str, Any]
Returns:

the configuration of the graph

Return type:

dict[str, Any]

creation_time() datetime
Returns:

the creation time of the graph

Return type:

datetime

degree_distribution() dict[str, float | int]
Returns:

the degree distribution of the graph

Return type:

dict[str, float | int]

density() float
Returns:

the density of the graph

Return type:

float
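
For orientation, density can be related to node_count() and relationship_count(). The sketch below assumes the common directed-graph definition, relationships divided by the number of ordered node pairs. The value returned by density() may be computed differently, and the function name directed_density is hypothetical.

```python
def directed_density(node_count: int, relationship_count: int) -> float:
    """Assumed definition: relationships per possible ordered node pair."""
    if node_count < 2:
        # No ordered pairs exist, so the sketch reports 0.0.
        return 0.0
    return relationship_count / (node_count * (node_count - 1))


# 4 nodes allow 12 ordered pairs; 6 relationships fill half of them.
example_density = directed_density(4, 6)
```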

drop(failIfMissing: bool = True) GraphInfo | None
Parameters:

failIfMissing (bool) – whether to fail if the graph does not exist

Returns:

the result of the drop operation

Return type:

GraphInfo | None

exists() bool
Returns:

whether the graph exists

Return type:

bool

memory_usage() str | None
Returns:

the memory usage of the graph

Return type:

str | None

modification_time() datetime
Returns:

the modification time of the graph

Return type:

datetime

name() str
Returns:

the name of the graph

Return type:

str

node_count() int
Returns:

the number of nodes in the graph

Return type:

int

node_labels() list[str]
Returns:

the node labels in the graph

Return type:

list[str]

node_properties() dict[str, list[str]]
Returns:

the node properties per node label

Return type:

dict[str, list[str]]

relationship_count() int
Returns:

the number of relationships in the graph

Return type:

int

relationship_properties() dict[str, list[str]]
Returns:

the relationship properties per relationship type

Return type:

dict[str, list[str]]

relationship_types() list[str]
Returns:

the relationship types in the graph

Return type:

list[str]

size_in_bytes() int
Returns:

the size of the graph in bytes

Return type:

int