Catalog Arrow Endpoints¶

class graphdatascience.procedure_surface.arrow.catalog.catalog_arrow_endpoints.CatalogArrowEndpoints¶

__new__(**kwargs)¶

Construct a graph from a list of node and relationship dataframes.

Parameters:

graph_name (str) – Name of the graph to construct
nodes (DataFrame | list[DataFrame]) –
Node dataframes. A dataframe should follow the schema:
- nodeId to uniquely identify the node across all dataframes
- labels to specify the labels of the node as a list of strings (optional)
- other columns are treated as node properties
relationships (DataFrame | list[DataFrame] | None) –
Relationship dataframes. A dataframe should follow the schema:
- sourceNodeId to identify the start node of the relationship
- targetNodeId to identify the end node of the relationship
- relationshipType to specify the type of the relationship (optional)
- other columns are treated as relationship properties
concurrency (int | None) – Number of concurrent threads to use.
undirected_relationship_types (list[str] | None) – List of relationship types to treat as undirected.

Returns:

Constructed graph object.

Return type:

GraphV2
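The column schema above can be checked before any data is sent. The following is a minimal sketch, not part of graphdatascience: `split_columns` and the four constant sets are hypothetical names introduced here, and the helper works on plain lists of column names (for example `list(df.columns)`) rather than on dataframes.

```python
# Hypothetical helper, not part of graphdatascience: classify the columns of a
# node or relationship table according to the schema described above.
NODE_REQUIRED = {"nodeId"}
NODE_OPTIONAL = {"labels"}
REL_REQUIRED = {"sourceNodeId", "targetNodeId"}
REL_OPTIONAL = {"relationshipType"}


def split_columns(columns, required, optional):
    """Return (missing_required, property_columns) for a list of column names."""
    present = set(columns)
    missing = sorted(required - present)
    reserved = required | optional
    # Every column that is neither required nor optional is treated as a property.
    properties = [c for c in columns if c not in reserved]
    return missing, properties


node_columns = ["nodeId", "labels", "age"]
rel_columns = ["sourceNodeId", "targetNodeId", "weight"]

print(split_columns(node_columns, NODE_REQUIRED, NODE_OPTIONAL))
print(split_columns(rel_columns, REL_REQUIRED, REL_OPTIONAL))
```

An empty first element means all required columns are present; the second element lists the columns that would be loaded as properties.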

property datasets: DatasetEndpoints¶: Endpoints for loading predefined datasets into the graph catalog.

drop(G: GraphV2 | str, fail_if_missing: bool = True) → GraphInfo | None¶

Drop a graph from the graph catalog.

Parameters:

G (GraphV2 | str) – Graph to drop, given by name or object.
fail_if_missing (bool) – Whether to fail if the graph is missing.

Returns:

GraphV2 metadata object containing information like node count.

Return type:

GraphInfo | None
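The contract of `fail_if_missing` can be modelled with an in-memory stand-in. This is a sketch under assumptions, not the library implementation: the `catalog` dict, the `ValueError`, and returning None for a missing graph are all assumed here, the last one inferred only from the `| None` in the signature.

```python
# Stand-in for the graph catalog: maps graph names to metadata dicts.
# This is NOT the library implementation, only a model of the documented contract.
catalog = {"people": {"graph_name": "people", "node_count": 3}}


def drop(name, fail_if_missing=True):
    """Remove a graph and return its metadata, mirroring the documented signature."""
    if name not in catalog:
        if fail_if_missing:
            # Assumed error type; the real client may raise something else.
            raise ValueError(f"Graph '{name}' does not exist")
        return None  # assumed behaviour when the graph is missing
    return catalog.pop(name)


info = drop("people")                          # graph exists: metadata is returned
again = drop("people", fail_if_missing=False)  # graph is gone: None, no error
```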

filter(G: GraphV2, graph_name: str, node_filter: str, relationship_filter: str, parameters: dict[str, Any] | None = None, concurrency: int | None = None, job_id: str | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None) → GraphWithFilterResult¶

Create a subgraph of a graph based on a filter expression.

Parameters:

G (GraphV2) – Graph object to use
graph_name (str) – Name of subgraph to create
node_filter (str) – Filter expression for nodes
relationship_filter (str) – Filter expression for relationships
parameters (dict[str, Any] | None) – A map of user-defined query parameters that are passed into the node and relationship filters.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Tuple of the filtered graph object and a result object with information such as graph name, node count, and relationship count.

Return type:

GraphWithFilterResult
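Because `parameters` feeds both filter expressions, it can help to confirm that every placeholder has a value before calling `filter`. The sketch below assumes the GDS filter syntax, where `n` and `r` name the node and relationship and `$name` refers to a parameter; `missing_parameters` and the example values are hypothetical and not part of the library.

```python
import re

# Example filter expressions (assumed GDS syntax: n = node, r = relationship).
node_filter = "n:Person AND n.age > $min_age"
relationship_filter = "r:KNOWS"
parameters = {"min_age": 30}


def missing_parameters(filters, parameters):
    """Return placeholder names used in the filters but absent from parameters."""
    used = set()
    for expression in filters:
        used.update(re.findall(r"\$(\w+)", expression))
    return sorted(used - set(parameters))


# Keyword arguments matching the documented filter() signature.
filter_kwargs = {
    "graph_name": "adults",
    "node_filter": node_filter,
    "relationship_filter": relationship_filter,
    "parameters": parameters,
}

print(missing_parameters([node_filter, relationship_filter], parameters))
```

With a `CatalogArrowEndpoints` instance `catalog` and a `GraphV2` object `G`, these arguments would be passed as `catalog.filter(G, **filter_kwargs)`.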

filter_async(G: GraphV2, graph_name: str, node_filter: str, relationship_filter: str, parameters: dict[str, Any] | None = None, concurrency: int | None = None, job_id: str | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None) → ProjectionJobHandle¶

Kick off a graph filter operation and return a ProjectionJobHandle.

Unlike filter(), this method does not block on completion.

Parameters:

G (GraphV2)
graph_name (str)
node_filter (str)
relationship_filter (str)
parameters (dict[str, Any] | None)
concurrency (int | None)
job_id (str | None)
sudo (bool)
log_progress (bool)
username (str | None)

Return type:

ProjectionJobHandle

generate(graph_name: str, node_count: int, average_degree: float, *, relationship_distribution: str | None = 'UNIFORM', relationship_seed: int | None = None, relationship_property: RelationshipPropertySpec | None = None, orientation: str | None = 'NATURAL', aggregation: str | None = 'NONE', allow_self_loops: bool | None = False, read_concurrency: int | None = None, job_id: str | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None) → GraphWithGenerationStats¶

Generate a random graph and store it in the graph catalog.

Parameters:

graph_name (str) – Name of the generated graph.
node_count (int) – The number of nodes in the generated graph
average_degree (float) – The average out-degree of the generated nodes
relationship_distribution (str | None) – Determines the relationship distribution strategy.
relationship_seed (int | None) – Seed value for generating deterministic relationships.
relationship_property (RelationshipPropertySpec | None) – Configure generated relationship properties.
orientation (str | None) – Specifies the orientation of the generated relationships.
aggregation (str | None) – The relationship aggregation method of Relationship Projection.
allow_self_loops (bool | None) – Whether nodes in the graph can have relationships where start and end nodes are the same.
read_concurrency (int | None) – Number of concurrent threads/processes to use during graph generation.
job_id (str | None) – Identifier for the computation.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Tuple of the generated graph object and the result object containing stats about the generation.

Return type:

GraphWithGenerationStats
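Since `average_degree` is the average out-degree, a rough size estimate follows from multiplying it by `node_count`. The sketch below is only that arithmetic; `expected_relationship_count` is a hypothetical helper, and the count actually generated may differ depending on the distribution, aggregation, and self-loop settings.

```python
def expected_relationship_count(node_count: int, average_degree: float) -> float:
    """Approximate relationship count: each node has average_degree outgoing relationships on average."""
    return node_count * average_degree


# Keyword arguments matching the documented generate() signature; values are examples.
generate_kwargs = {
    "graph_name": "random-graph",
    "node_count": 100,
    "average_degree": 2.5,
    "relationship_distribution": "UNIFORM",
    "relationship_seed": 42,  # a fixed seed for deterministic relationships
}

estimate = expected_relationship_count(
    generate_kwargs["node_count"], generate_kwargs["average_degree"]
)
print(estimate)
```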

generate_async(graph_name: str, node_count: int, average_degree: float, *, relationship_distribution: str | None = 'UNIFORM', relationship_seed: int | None = None, relationship_property: RelationshipPropertySpec | None = None, orientation: str | None = 'NATURAL', aggregation: str | None = 'NONE', allow_self_loops: bool | None = False, read_concurrency: int | None = None, job_id: str | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None) → ProjectionJobHandle¶

Kick off a graph generation and return a ProjectionJobHandle.

Unlike generate(), this method does not block on completion.

Parameters:

graph_name (str)
node_count (int)
average_degree (float)
relationship_distribution (str | None)
relationship_seed (int | None)
relationship_property (RelationshipPropertySpec | None)
orientation (str | None)
aggregation (str | None)
allow_self_loops (bool | None)
read_concurrency (int | None)
job_id (str | None)
sudo (bool)
log_progress (bool)
username (str | None)

Return type:

ProjectionJobHandle

get(graph_name: str) → GraphV2¶

Retrieve a handle to a graph from the graph catalog.

Parameters:

graph_name (str) – The name of the graph.

Returns:

A handle to the graph.

Return type:

GraphV2

list(G: GraphV2 | str | None = None) → list[GraphInfoWithDegrees]¶

List graphs in the graph catalog.

Parameters:

G (GraphV2 | str | None) – GraphV2 object or name to filter results. If None, list all graphs.

Returns:

List of graph metadata objects containing information like node count.

Return type:

list[GraphInfoWithDegrees]

property node_labels: NodeLabelEndpoints¶: Endpoints for node label operations.

property node_properties: NodePropertiesEndpoints¶: Endpoints for node property operations.

project(graph_name: str, query: str, *, query_parameters: dict[str, Any] | None = None, job_id: str | None = None, concurrency: int | None = None, undirected_relationship_types: List[str] | None = None, inverse_indexed_relationship_types: List[str] | None = None, batch_size: int | None = None, logging: bool = True) → GraphWithProjectResult¶

Projects a graph from the Neo4j database into the GDS graph catalog.

Parameters:

graph_name (str) – Name of the graph to be created in the catalog.
query (str) – Cypher query to select nodes and relationships for the graph projection. Must contain gds.graph.project.remote. Example: MATCH (n)-->(m) RETURN gds.graph.project.remote(n, m)
query_parameters (dict[str, Any] | None) – Parameters that will be passed to the Cypher query.
job_id (str | None) – Identifier for the computation.
concurrency (int | None) – Number of concurrent threads to use.
undirected_relationship_types (list[str] | None) – List of relationship types to treat as undirected.
inverse_indexed_relationship_types (list[str] | None) – List of relationship types to index in both directions.
batch_size (int | None, default=None) – Number of rows to process in each batch when projecting the graph.
logging (bool, default=True) – Whether to log progress during graph projection.

Returns:

Tuple of the projected graph object and a result object containing information about the projection.

Return type:

GraphWithProjectResult
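The requirement that the query contain `gds.graph.project.remote` can be checked on the client side before submitting. This is a minimal sketch: `uses_remote_projection` is a hypothetical helper, and a substring test like this only catches the most obvious mistake.

```python
REQUIRED_FUNCTION = "gds.graph.project.remote"


def uses_remote_projection(query: str) -> bool:
    """Return True if the Cypher query mentions the required projection function."""
    return REQUIRED_FUNCTION in query


# Query taken from the parameter description above.
query = "MATCH (n)-->(m) RETURN gds.graph.project.remote(n, m)"

# Keyword arguments matching the documented project() signature; values are examples.
project_kwargs = {
    "graph_name": "my-graph",
    "query": query,
    "concurrency": 4,
}

print(uses_remote_projection(query))
```

With a `CatalogArrowEndpoints` instance `catalog`, the call would be `catalog.project(**project_kwargs)`.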

project_async(graph_name: str, query: str, *, query_parameters: dict[str, Any] | None = None, job_id: str | None = None, concurrency: int | None = None, undirected_relationship_types: List[str] | None = None, inverse_indexed_relationship_types: List[str] | None = None, batch_size: int | None = None) → ProjectionJobHandle¶

Kick off a Cypher graph projection and return a ProjectionJobHandle.

Unlike project(), this method does not block on completion. Use the returned handle to query status or retrieve the projected graph and result.

Parameters:

graph_name (str)
query (str)
query_parameters (dict[str, Any] | None)
job_id (str | None)
concurrency (int | None)
undirected_relationship_types (List[str] | None)
inverse_indexed_relationship_types (List[str] | None)
batch_size (int | None)

Return type:

ProjectionJobHandle

project_native(graph_name: str, node_label_filter: List[str], relationship_type_filter: List[str], *, node_properties: List[str] | None = None, relationship_properties: List[str] | None = None, job_id: str | None = None, concurrency: int | None = None, undirected_relationship_types: List[str] | None = None, inverse_indexed_relationship_types: List[str] | None = None, batch_size: int | None = None, logging: bool = True) → GraphWithProjectResult¶

Projects a graph from the Neo4j database into the GDS graph catalog, selecting nodes by label and relationships by type.

Parameters:

graph_name (str) – Name of the graph to be created in the catalog.
node_label_filter (list[str]) – List of node labels to include in the graph projection.
relationship_type_filter (list[str]) – List of relationship types to include in the graph projection.
node_properties (list[str] | None) – List of node properties to include in the graph projection.
relationship_properties (list[str] | None) – List of relationship properties to include in the graph projection.
job_id (str | None) – Identifier for the computation.
concurrency (int | None) – Number of concurrent threads to use.
undirected_relationship_types (list[str] | None) – List of relationship types to treat as undirected.
inverse_indexed_relationship_types (list[str] | None) – List of relationship types to index in both directions.
batch_size (int | None, default=None) – Number of rows to process in each batch when projecting the graph.
logging (bool, default=True) – Whether to log progress during graph projection.

Returns:

Tuple of the projected graph object and a result object containing information about the projection.

Return type:

GraphWithProjectResult

project_native_async(graph_name: str, node_label_filter: List[str], relationship_type_filter: List[str], *, node_properties: List[str] | None = None, relationship_properties: List[str] | None = None, job_id: str | None = None, concurrency: int | None = None, undirected_relationship_types: List[str] | None = None, inverse_indexed_relationship_types: List[str] | None = None, batch_size: int | None = None) → ProjectionJobHandle¶

Kick off a native graph projection and return a ProjectionJobHandle.

Unlike project_native(), this method does not block on completion. The returned handle can be used to await completion and retrieve the projected graph and result.

Parameters:

graph_name (str)
node_label_filter (List[str])
relationship_type_filter (List[str])
node_properties (List[str] | None)
relationship_properties (List[str] | None)
job_id (str | None)
concurrency (int | None)
undirected_relationship_types (List[str] | None)
inverse_indexed_relationship_types (List[str] | None)
batch_size (int | None)

Return type:

ProjectionJobHandle

property relationships: RelationshipsEndpoints¶: Endpoints for relationship operations.

property sample: GraphSamplingEndpoints¶: Endpoints for graph sampling.

class graphdatascience.procedure_surface.arrow.catalog.CatalogArrowEndpoints¶

Alias of graphdatascience.procedure_surface.arrow.catalog.catalog_arrow_endpoints.CatalogArrowEndpoints.

class graphdatascience.procedure_surface.arrow.catalog.GraphWithProjectResult¶

Result object for graph projection jobs, containing the projected graph and the projection result. Can be used as a context manager to ensure the projected graph is dropped after use.

static __new__(_cls, graph: GraphV2, result: ProjectionResult | StoreProjectionResult)¶

Create new instance of GraphWithProjectResult(graph, result)

Parameters:

graph (GraphV2)
result (ProjectionResult | StoreProjectionResult)

graph: GraphV2¶: Alias for field number 0

result: ProjectionResult | StoreProjectionResult¶: Alias for field number 1
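The two fields and the context-manager behaviour can be illustrated with a stand-in named tuple. This is a sketch of the pattern, not the library class: `GraphWithResultSketch`, the `dropped` list, and the choice to return the tuple itself from `__enter__` are assumptions made here for illustration.

```python
from typing import Any, NamedTuple

dropped = []  # records which graphs the stand-in "dropped" on exit


class GraphWithResultSketch(NamedTuple):
    """Stand-in with the same two fields: graph (field 0) and result (field 1)."""

    graph: Any
    result: Any

    def __enter__(self):
        # Assumed: entering yields the tuple itself, so it can be unpacked.
        return self

    def __exit__(self, exc_type, exc, tb):
        # The real class is documented to drop the projected graph here.
        dropped.append(self.graph)
        return None


pair = GraphWithResultSketch("my-graph", {"node_count": 2})

# Fields are reachable by name or by position.
same = pair.graph == pair[0] and pair.result == pair[1]

# Used as a context manager, the graph is cleaned up when the block ends.
with pair as (graph, result):
    inside = (graph, result["node_count"])
```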

pydantic model graphdatascience.procedure_surface.arrow.catalog.ProjectionResult¶

Result object for graph projection jobs.

field configuration: dict[str, Any]¶

field graph_name: str¶

field node_count: int¶

field project_millis: int¶

field query: str¶

field relationship_count: int¶