Community Algorithms

class graphdatascience.procedure_surface.api.community.CliqueCountingEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], concurrency: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*']) EstimationResult

Estimates the memory requirements for running the clique counting algorithm.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • concurrency (int | None) – Number of concurrent threads to use.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

Returns:

The memory estimation result

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) CliqueCountingMutateResult

Executes the clique counting algorithm and writes the results to the in-memory graph as node properties.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

CliqueCountingMutateResult
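
The bare `*` in the mutate signature above makes every option after mutate_property keyword-only. The sketch below illustrates that calling convention with a stand-in function. `mutate_stand_in`, the graph placeholder "my-graph" and the property name "cliqueCount" are hypothetical and are not part of the library.

```python
from typing import Any, Dict, Optional


def mutate_stand_in(
    G: Any,
    mutate_property: str,
    *,
    concurrency: Optional[int] = None,
    log_progress: bool = True,
) -> Dict[str, Any]:
    """Hypothetical stand-in that only echoes its arguments (not the library)."""
    return {
        "mutate_property": mutate_property,
        "concurrency": concurrency,
        "log_progress": log_progress,
    }


# Options declared after the bare `*` must be passed by name.
ok = mutate_stand_in("my-graph", "cliqueCount", concurrency=4)

# Passing an option positionally is rejected by Python itself.
try:
    mutate_stand_in("my-graph", "cliqueCount", 4)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Naming each option at the call site also keeps calls readable when several endpoints list the same options in a different order.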

abstract stats(G: GraphV2, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) CliqueCountingStatsResult

Executes the clique counting algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

CliqueCountingStatsResult
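
To make concrete what clique counting computes, the following brute-force sketch counts the cliques of each size in a tiny undirected graph. It is an illustration only and not the library's algorithm. The function name, the toy edge list and the dictionary keyed by clique size are assumptions; how the counts map onto the global_count field is not specified in this section.

```python
from itertools import combinations


def count_cliques_by_size(edges, min_size=3):
    """Count cliques of every size >= min_size by brute force (tiny graphs only)."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    nodes = sorted(neighbors)
    counts = {}
    for size in range(min_size, len(nodes) + 1):
        found = 0
        for group in combinations(nodes, size):
            # A group is a clique when every pair in it is connected.
            if all(b in neighbors[a] for a, b in combinations(group, 2)):
                found += 1
        if found == 0:
            break  # no clique of this size implies none of a larger size
        counts[size] = found
    return counts


# Hypothetical toy graph: a 4-clique {0, 1, 2, 3} and a triangle {3, 4, 5} sharing node 3.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
clique_counts = count_cliques_by_size(edges)
```

In this toy graph the 4-clique contributes four triangles and the extra triangle contributes one more, so there are five cliques of size 3 and one of size 4.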

abstract stream(G: GraphV2, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool | None = False, username: str | None = None) DataFrame

Executes the clique counting algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool | None) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

DataFrame with the algorithm results

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) CliqueCountingWriteResult

Executes the clique counting algorithm and writes the results back to the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

CliqueCountingWriteResult

pydantic model graphdatascience.procedure_surface.api.community.CliqueCountingMutateResult
field compute_millis: int
field configuration: dict[str, Any]
field global_count: list[int]
field mutate_millis: int
field node_properties_written: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.CliqueCountingStatsResult
field compute_millis: int
field configuration: dict[str, Any]
field global_count: list[int]
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.CliqueCountingWriteResult
field compute_millis: int
field configuration: dict[str, Any]
field global_count: list[int]
field node_properties_written: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.community.ConductanceEndpoints
abstract stream(G: GraphV2, community_property: str, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, sudo: bool = False, username: str | None = None) DataFrame

Executes the Conductance algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • community_property (str) – Name of the node property containing community assignments.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

DataFrame with the algorithm results containing ‘community’ and ‘conductance’ columns

Return type:

DataFrame
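
As a rough illustration of the quantity reported per community, the sketch below uses one common definition of conductance: the weight of relationships leaving a community divided by the weight of all relationships that start in it. Whether the library uses exactly this definition is an assumption, and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def conductance_by_community(relationships, community_of):
    """Return {community: conductance} for (source, target, weight) triples."""
    leaving = defaultdict(float)  # weight of relationships that leave the community
    total = defaultdict(float)    # weight of all relationships starting in it
    for source, target, weight in relationships:
        community = community_of[source]
        total[community] += weight
        if community_of[target] != community:
            leaving[community] += weight
    return {c: leaving[c] / total[c] for c in total if total[c] > 0}


# Hypothetical data: community 'a' is a triangle, 'b' is a pair, joined by 2-3.
community_of = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
undirected = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
# Store each undirected relationship in both directions with weight 1.0.
relationships = [(u, v, 1.0) for u, v in undirected] + [(v, u, 1.0) for u, v in undirected]

result = conductance_by_community(relationships, community_of)
# Rows shaped like the 'community' and 'conductance' columns described above.
rows = [{"community": c, "conductance": value} for c, value in sorted(result.items())]
```

A lower value indicates a community whose relationships mostly stay inside it.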

class graphdatascience.procedure_surface.api.community.HdbscanEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], node_property: str, *, leaf_size: int = 1, samples: int = 10, min_cluster_size: int = 5, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimates memory requirements and other statistics for the HDBSCAN algorithm.

This method provides memory estimation for the HDBSCAN algorithm without actually executing the clustering. It helps determine the computational requirements before running the actual clustering procedure.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • node_property (str) – The node property to use for clustering (required)

  • leaf_size (int, default=1) – The maximum leaf size of the tree structure used in the algorithm.

  • samples (int, default=10) – The number of samples used for density estimation.

  • min_cluster_size (int, default=5) – The minimum size of clusters.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

The estimation result containing memory requirements and other statistics

Return type:

EstimationResult

abstract mutate(G: GraphV2, node_property: str, mutate_property: str, *, leaf_size: int = 1, samples: int = 10, min_cluster_size: int = 5, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, log_progress: bool = True, sudo: bool = False, job_id: str | None = None, username: str | None = None) HdbscanMutateResult

Runs the HDBSCAN algorithm and writes the cluster ID for each node back to the in-memory graph.

The algorithm performs hierarchical density-based clustering on a node property, identifying clusters based on density reachability.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_property (str) – The node property to use for clustering (required)

  • mutate_property (str) – Name of the node property to store the results in.

  • leaf_size (int, default=1) – The maximum leaf size of the tree structure used in the algorithm.

  • samples (int, default=10) – The number of samples used for density estimation.

  • min_cluster_size (int, default=5) – The minimum size of clusters.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • log_progress (bool) – Display progress logging.

  • sudo (bool) – Disable the memory guard.

  • job_id (str | None) – Identifier for the computation.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

The result containing statistics about the clustering and algorithm execution

Return type:

HdbscanMutateResult

abstract stats(G: GraphV2, node_property: str, *, leaf_size: int = 1, samples: int = 10, min_cluster_size: int = 5, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, log_progress: bool = True, sudo: bool = False, job_id: str | None = None, username: str | None = None) HdbscanStatsResult

Runs the HDBSCAN algorithm and returns only statistics about the clustering.

This mode computes cluster assignments without writing them back to the graph, returning only execution statistics and cluster information.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_property (str) – The node property to use for clustering (required)

  • leaf_size (int, default=1) – The maximum leaf size of the tree structure used in the algorithm.

  • samples (int, default=10) – The number of samples used for density estimation.

  • min_cluster_size (int, default=5) – The minimum size of clusters.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • log_progress (bool) – Display progress logging.

  • sudo (bool) – Disable the memory guard.

  • job_id (str | None) – Identifier for the computation.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

The result containing statistics about the clustering and algorithm execution

Return type:

HdbscanStatsResult

abstract stream(G: GraphV2, node_property: str, *, leaf_size: int = 1, samples: int = 10, min_cluster_size: int = 5, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, log_progress: bool = True, sudo: bool = False, job_id: str | None = None, username: str | None = None) DataFrame

Runs the HDBSCAN algorithm and returns the cluster ID for each node as a DataFrame.

The DataFrame contains the cluster assignment for each node, with noise points typically assigned to cluster -1.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_property (str) – The node property to use for clustering (required)

  • leaf_size (int, default=1) – The maximum leaf size of the tree structure used in the algorithm.

  • samples (int, default=10) – The number of samples used for density estimation.

  • min_cluster_size (int, default=5) – The minimum size of clusters.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • log_progress (bool) – Display progress logging.

  • sudo (bool) – Disable the memory guard.

  • job_id (str | None) – Identifier for the computation.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

A DataFrame with columns ‘nodeId’ and ‘label’

Return type:

DataFrame
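
The streamed rows can be summarized with a few lines of plain Python. The sketch below assumes rows shaped like the ‘nodeId’ and ‘label’ columns described above and treats -1 as the noise label. The rows are made-up data, and `summarize_labels` is a hypothetical helper whose output keys merely mirror the field names of the result models; it is not how the library computes them.

```python
from collections import Counter

NOISE_LABEL = -1  # assumption: noise points carry this label

# Hypothetical rows shaped like the streamed result.
rows = [
    {"nodeId": 0, "label": 0},
    {"nodeId": 1, "label": 0},
    {"nodeId": 2, "label": 0},
    {"nodeId": 3, "label": 1},
    {"nodeId": 4, "label": 1},
    {"nodeId": 5, "label": -1},
    {"nodeId": 6, "label": -1},
]


def summarize_labels(rows):
    """Count clusters and noise points from (nodeId, label) rows."""
    sizes = Counter(row["label"] for row in rows)
    noise = sizes.pop(NOISE_LABEL, 0)  # noise is not counted as a cluster
    return {
        "number_of_clusters": len(sizes),
        "number_of_noise_points": noise,
        "cluster_sizes": dict(sizes),
    }


summary = summarize_labels(rows)
```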

abstract write(G: GraphV2, node_property: str, write_property: str, *, leaf_size: int = 1, samples: int = 10, min_cluster_size: int = 5, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, log_progress: bool = True, sudo: bool = False, job_id: str | None = None, username: str | None = None, write_concurrency: int | None = None) HdbscanWriteResult

Runs the HDBSCAN algorithm and writes the cluster ID for each node back to the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_property (str) – The node property to use for clustering (required)

  • write_property (str) – Name of the node property to store the results in.

  • leaf_size (int, default=1) – The maximum leaf size of the tree structure used in the algorithm.

  • samples (int, default=10) – The number of samples used for density estimation.

  • min_cluster_size (int, default=5) – The minimum size of clusters.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

  • concurrency (int | None) – Number of concurrent threads to use.

  • log_progress (bool) – Display progress logging.

  • sudo (bool) – Disable the memory guard.

  • job_id (str | None) – Identifier for the computation.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

The result containing statistics about the clustering and algorithm execution

Return type:

HdbscanWriteResult

pydantic model graphdatascience.procedure_surface.api.community.HdbscanMutateResult
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_count: int
field node_properties_written: int
field number_of_clusters: int
field number_of_noise_points: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.HdbscanStatsResult
field compute_millis: int
field configuration: dict[str, Any]
field node_count: int
field number_of_clusters: int
field number_of_noise_points: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.HdbscanWriteResult
field compute_millis: int
field configuration: dict[str, Any]
field node_count: int
field node_properties_written: int
field number_of_clusters: int
field number_of_noise_points: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.community.K1ColoringEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, batch_size: int = 10000, concurrency: int | None = None, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*']) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • batch_size (int) – Number of nodes to process in each batch.

  • concurrency (int | None) – Number of concurrent threads to use.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, *, batch_size: int = 10000, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) K1ColoringMutateResult

Runs the K-1 Coloring algorithm and stores the results in the graph catalog as a new node property.

The K-1 Coloring algorithm assigns a color to every node in the graph, trying to optimize for two objectives: to make sure that every neighbor of a given node has a different color than the node itself, and to use as few colors as possible.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • batch_size (int) – Number of nodes to process in each batch.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None, default=None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int, default=10) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

K1ColoringMutateResult
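
The two objectives in the description above (neighbors get different colors, and as few colors as possible are used) can be illustrated with a sequential greedy coloring. This is a minimal sketch and not the library's algorithm; options such as batch_size and max_iterations have no counterpart here. The function name and the toy graph are hypothetical.

```python
def greedy_coloring(edges):
    """Give each node the smallest color not used by an already colored neighbor."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    colors = {}
    for node in sorted(neighbors):
        taken = {colors[n] for n in neighbors[node] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[node] = color
    return colors


# Hypothetical toy graph: a cycle of five nodes, which needs three colors.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
colors = greedy_coloring(edges)
color_count = len(set(colors.values()))
is_proper = all(colors[u] != colors[v] for u, v in edges)
```

Greedy coloring always produces a proper coloring, but the number of colors depends on the order in which nodes are visited and is not guaranteed to be minimal.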

abstract stats(G: GraphV2, *, batch_size: int = 10000, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) K1ColoringStatsResult

Executes the K-1 Coloring algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • batch_size (int) – Number of nodes to process in each batch.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

K1ColoringStatsResult

abstract stream(G: GraphV2, *, batch_size: int = 10000, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) DataFrame

Executes the K-1 Coloring algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • batch_size (int) – Number of nodes to process in each batch.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • min_community_size (int | None) – Minimum size for communities to be included in results.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

DataFrame with the algorithm results

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, *, batch_size: int = 10000, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) K1ColoringWriteResult

Executes the K-1 Coloring algorithm and writes the results to the Neo4j database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • batch_size (int) – Number of nodes to process in each batch.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • min_community_size (int | None) – Minimum size for communities to be included in results.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

K1ColoringWriteResult

pydantic model graphdatascience.procedure_surface.api.community.K1ColoringMutateResult
field color_count: int
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field mutate_millis: int
field node_count: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.community.K1ColoringStatsResult
field color_count: int
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field node_count: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.community.K1ColoringWriteResult
field color_count: int
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field node_count: int
field pre_processing_millis: int
field ran_iterations: int
field write_millis: int
class graphdatascience.procedure_surface.api.community.KCoreEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*']) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • concurrency (int | None) – Number of concurrent threads to use.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) KCoreMutateResult

Runs the K-Core Decomposition algorithm and stores the results in the graph catalog as a new node property.

K-core decomposition separates the nodes of a graph into groups based on the degree sequence and topology of the graph. The term i-core refers to a maximal subgraph of the original graph in which each node has degree at least i. Each node is assigned a core value: the largest i such that the node belongs to the i-core.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None, default=None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

KCoreMutateResult
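
The core-value definition in the description above can be made concrete with the classic peeling procedure: repeatedly remove a node of smallest remaining degree and record the largest such degree seen so far. This is a minimal sequential sketch for small undirected graphs and not the library's implementation. The function name and toy graph are hypothetical, and reading the degeneracy field of the result models as the largest core value is an assumption based on the usual graph-theoretic meaning of the term.

```python
def core_values(edges):
    """Return {node: core value} by peeling nodes of smallest remaining degree."""
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    degree = {node: len(adjacent) for node, adjacent in neighbors.items()}
    remaining = set(neighbors)
    core = {}
    k = 0
    while remaining:
        # Pick the node with the smallest remaining degree (ties broken by id).
        node = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for adjacent in neighbors[node]:
            if adjacent in remaining:
                degree[adjacent] -= 1
    return core


# Hypothetical toy graph: a triangle {0, 1, 2} with a path 2-3-4 attached.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
core = core_values(edges)
degeneracy = max(core.values())
```

Here the triangle forms the 2-core, while nodes 3 and 4 only belong to the 1-core.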

abstract stats(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) KCoreStatsResult

Executes the K-Core algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

KCoreStatsResult

abstract stream(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) DataFrame

Executes the K-Core algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with the algorithm results containing nodeId and coreValue

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, *, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) KCoreWriteResult

Executes the K-Core algorithm and writes the results to the Neo4j database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

KCoreWriteResult

pydantic model graphdatascience.procedure_surface.api.community.KCoreMutateResult
field compute_millis: int
field configuration: dict[str, Any]
field degeneracy: int
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.KCoreStatsResult
field compute_millis: int
field configuration: dict[str, Any]
field degeneracy: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.KCoreWriteResult
field compute_millis: int
field configuration: dict[str, Any]
field degeneracy: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
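
The coreValue column and the degeneracy field can be illustrated with a minimal pure-Python sketch of k-core decomposition. It is not the library's implementation: the function name core_values and the edge-list input are assumptions made for illustration. Each node's core value is the largest k for which the node belongs to a k-core, and the degeneracy is the largest core value in the graph.

```python
from collections import defaultdict


def core_values(edges):
    """Return {node: core value} for an undirected graph given as (u, v) pairs."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)

    degree = {node: len(adj) for node, adj in neighbours.items()}
    remaining = set(degree)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree (ties by node ID).
        node = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for other in neighbours[node]:
            if other in remaining:
                degree[other] -= 1
    return core


# A triangle (1, 2, 3) with a pendant node 4 attached to node 3.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
core = core_values(edges)
degeneracy = max(core.values())
print(core, degeneracy)
```

Here the triangle nodes get core value 2, the pendant node gets 1, and the degeneracy is 2.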
class graphdatascience.procedure_surface.api.community.KMeansEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], node_property: str, *, compute_silhouette: bool = False, concurrency: int | None = None, delta_threshold: float = 0.05, initial_sampler: str = 'UNIFORM', k: int = 10, max_iterations: int = 10, node_labels: list[str] = ['*'], number_of_restarts: int = 1, random_seed: int | None = None, relationship_types: list[str] = ['*'], seed_centroids: list[list[float]] | None = None) EstimationResult

Estimates the memory requirements for running the K-Means algorithm.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • node_property (str) – The node property to use for clustering

  • compute_silhouette (bool, default=False) – Whether to compute the silhouette coefficient.

  • concurrency (int | None) – Number of concurrent threads to use.

  • delta_threshold (float) – Minimum change between iterations.

  • initial_sampler (str) – Sampling method for initial centroids.

  • k (int) – Number of clusters to form.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • number_of_restarts (int, default=1) – Number of times to restart the algorithm with different initial centroids.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • seed_centroids (list[list[float]] | None, default=None) – Initial centroids for the algorithm.

Returns:

The memory estimation result

Return type:

EstimationResult
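
How k, max_iterations, delta_threshold and seed_centroids interact can be sketched in plain Python. This is an illustrative sketch of Lloyd-style K-Means, not the library's implementation. In particular, reading delta_threshold as the fraction of points that must change cluster for iteration to continue is an assumption, and the function name kmeans is hypothetical.

```python
import math
import random


def kmeans(points, k=10, max_iterations=10, delta_threshold=0.05,
           seed_centroids=None, random_seed=None):
    """Cluster `points` (lists of floats); return (assignments, centroids)."""
    rng = random.Random(random_seed)
    if seed_centroids is not None:
        centroids = [list(c) for c in seed_centroids]
    else:
        # Uniform sampling of k distinct points as the initial centroids.
        centroids = [list(p) for p in rng.sample(points, k)]

    assignments = [-1] * len(points)
    for _ in range(max_iterations):
        changed = 0
        for i, point in enumerate(points):
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(point, centroids[c]))
            if nearest != assignments[i]:
                assignments[i] = nearest
                changed += 1
        # Move each centroid to the mean of its current members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
        # ASSUMPTION: stop once fewer than delta_threshold * n points moved.
        if changed < delta_threshold * len(points):
            break
    return assignments, centroids


points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
assignments, centroids = kmeans(
    points, k=2, seed_centroids=[[0.0, 0.0], [10.0, 10.0]]
)
print(assignments, centroids)
```

With explicit seed centroids the run is deterministic. Without them, the result depends on random_seed.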

abstract mutate(G: GraphV2, node_property: str, mutate_property: str, *, compute_silhouette: bool = False, concurrency: int | None = None, delta_threshold: float = 0.05, initial_sampler: str = 'UNIFORM', job_id: str | None = None, k: int = 10, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], number_of_restarts: int = 1, random_seed: int | None = None, relationship_types: list[str] = ['*'], seed_centroids: list[list[float]] | None = None, sudo: bool = False, username: str | None = None) KMeansMutateResult

Executes the K-Means algorithm and writes the results to the in-memory graph as node properties.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_property (str) – The node property to use for clustering

  • mutate_property (str) – Name of the node property to store the results in.

  • compute_silhouette (bool, default=False) – Whether to compute the silhouette coefficient.

  • concurrency (int | None) – Number of concurrent threads to use.

  • delta_threshold (float) – Minimum change between iterations.

  • initial_sampler (str) – Sampling method for initial centroids.

  • job_id (str | None) – Identifier for the computation.

  • k (int) – Number of clusters to form.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • number_of_restarts (int, default=1) – Number of times to restart the algorithm with different initial centroids.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • seed_centroids (list[list[float]] | None, default=None) – Initial centroids for the algorithm.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

KMeansMutateResult

abstract stats(G: GraphV2, node_property: str, *, compute_silhouette: bool = False, concurrency: int | None = None, delta_threshold: float = 0.05, initial_sampler: str = 'UNIFORM', job_id: str | None = None, k: int = 10, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], number_of_restarts: int = 1, random_seed: int | None = None, relationship_types: list[str] = ['*'], seed_centroids: list[list[float]] | None = None, sudo: bool = False, username: str | None = None) KMeansStatsResult

Executes the K-Means algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_property (str) – The node property to use for clustering

  • compute_silhouette (bool, default=False) – Whether to compute the silhouette coefficient.

  • concurrency (int | None) – Number of concurrent threads to use.

  • delta_threshold (float) – Minimum change between iterations.

  • initial_sampler (str) – Sampling method for initial centroids.

  • job_id (str | None) – Identifier for the computation.

  • k (int) – Number of clusters to form.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • number_of_restarts (int, default=1) – Number of times to restart the algorithm with different initial centroids.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • seed_centroids (list[list[float]] | None, default=None) – Initial centroids for the algorithm.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

KMeansStatsResult

abstract stream(G: GraphV2, node_property: str, *, compute_silhouette: bool = False, concurrency: int | None = None, delta_threshold: float = 0.05, initial_sampler: str = 'UNIFORM', job_id: str | None = None, k: int = 10, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], number_of_restarts: int = 1, random_seed: int | None = None, relationship_types: list[str] = ['*'], seed_centroids: list[list[float]] | None = None, sudo: bool = False, username: str | None = None) DataFrame

Executes the K-Means algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_property (str) – The node property to use for clustering

  • compute_silhouette (bool, default=False) – Whether to compute the silhouette coefficient.

  • concurrency (int | None) – Number of concurrent threads to use.

  • delta_threshold (float) – Minimum change between iterations.

  • initial_sampler (str) – Sampling method for initial centroids.

  • job_id (str | None) – Identifier for the computation.

  • k (int) – Number of clusters to form.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • number_of_restarts (int, default=1) – Number of times to restart the algorithm with different initial centroids.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • seed_centroids (list[list[float]] | None, default=None) – Initial centroids for the algorithm.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

DataFrame with the algorithm results containing nodeId, communityId, distanceFromCentroid, and silhouette

Return type:

DataFrame
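
The silhouette column can be read through the standard definition of the silhouette coefficient: for a point with mean distance a to the other members of its own cluster and smallest mean distance b to any other cluster, the score is (b - a) / max(a, b). The sketch below computes it in plain Python. The function name and the list-based inputs are assumptions for illustration, and the library may compute the value differently in detail.

```python
import math


def silhouette(index, points, assignments):
    """Silhouette of points[index]: (b - a) / max(a, b)."""
    own = assignments[index]
    distances = {}
    for j, (point, cluster) in enumerate(zip(points, assignments)):
        if j != index:
            distances.setdefault(cluster, []).append(
                math.dist(points[index], point)
            )
    others = [sum(d) / len(d) for c, d in distances.items() if c != own]
    if own not in distances or not others:
        # Convention: a singleton cluster or a single-cluster result scores 0.
        return 0.0
    a = sum(distances[own]) / len(distances[own])
    b = min(others)
    return (b - a) / max(a, b)


points = [[0.0], [1.0], [4.0], [5.0]]
assignments = [0, 0, 1, 1]
scores = [silhouette(i, points, assignments) for i in range(len(points))]
print(scores)
```

Scores close to 1 indicate a point that sits well inside its cluster. Scores near 0 or below indicate a point on or across a cluster boundary.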

abstract write(G: GraphV2, node_property: str, write_property: str, *, compute_silhouette: bool = False, concurrency: int | None = None, delta_threshold: float = 0.05, initial_sampler: str = 'UNIFORM', job_id: str | None = None, k: int = 10, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], number_of_restarts: int = 1, random_seed: int | None = None, relationship_types: list[str] = ['*'], seed_centroids: list[list[float]] | None = None, sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) KMeansWriteResult

Executes the K-Means algorithm and writes the results back to the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • node_property (str) – The node property to use for clustering

  • write_property (str) – Name of the node property to store the results in.

  • compute_silhouette (bool, default=False) – Whether to compute the silhouette coefficient.

  • concurrency (int | None) – Number of concurrent threads to use.

  • delta_threshold (float) – Minimum change between iterations.

  • initial_sampler (str) – Sampling method for initial centroids.

  • job_id (str | None) – Identifier for the computation.

  • k (int) – Number of clusters to form.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • number_of_restarts (int, default=1) – Number of times to restart the algorithm with different initial centroids.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • seed_centroids (list[list[float]] | None, default=None) – Initial centroids for the algorithm.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

KMeansWriteResult
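
The bare * in the signatures above makes every parameter after it keyword-only. The sketch below shows the effect with a hypothetical stand-in class, FakeKMeans, which mirrors only the shape of the write signature and is not part of the library.

```python
class FakeKMeans:
    """Hypothetical stand-in mirroring only the shape of write()'s signature."""

    def write(self, G, node_property, write_property, *, k=10,
              write_concurrency=None):
        # Echo the arguments back so the call shape can be inspected.
        return {
            "node_property": node_property,
            "write_property": write_property,
            "k": k,
        }


endpoints = FakeKMeans()

# G, node_property and write_property may be positional; k must be named.
ok = endpoints.write("G", "embedding", "cluster", k=3)

try:
    endpoints.write("G", "embedding", "cluster", 3)  # k passed positionally
    rejected = False
except TypeError:
    rejected = True

print(ok, rejected)
```

Passing tuning parameters by name keeps calls readable and avoids relying on the order of the keyword-only parameters.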

pydantic model graphdatascience.procedure_surface.api.community.KMeansMutateResult
field average_distance_to_centroid: float
field average_silhouette: float
field centroids: list[list[float]]
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.KMeansStatsResult
field average_distance_to_centroid: float
field average_silhouette: float
field centroids: list[list[float]]
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.KMeansWriteResult
field average_distance_to_centroid: float
field average_silhouette: float
field centroids: list[list[float]]
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.community.LabelPropagationEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, consecutive_ids: bool = False, max_iterations: int = 10, node_labels: list[str] = ['*'], node_weight_property: str | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None) EstimationResult

Estimates the memory requirements for running the Label Propagation algorithm.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • node_weight_property (str | None, default=None) – The property name for node weights

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

Returns:

The memory estimation result

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], node_weight_property: str | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, username: str | None = None) LabelPropagationMutateResult

Executes the Label Propagation algorithm and writes the results to the in-memory graph as node properties.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • node_weight_property (str | None, default=None) – The property name for node weights

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

LabelPropagationMutateResult

abstract stats(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], node_weight_property: str | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, username: str | None = None) LabelPropagationStatsResult

Executes the Label Propagation algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • node_weight_property (str | None, default=None) – The property name for node weights

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

LabelPropagationStatsResult

abstract stream(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], node_weight_property: str | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, username: str | None = None) DataFrame

Executes the Label Propagation algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • min_community_size (int | None) – Minimum size for communities to be included in results.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • node_weight_property (str | None, default=None) – The property name for node weights

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

DataFrame with the algorithm results containing nodeId and communityId

Return type:

DataFrame
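
The roles of max_iterations, seed_property, and the ran_iterations and did_converge result fields can be sketched with a small unweighted label propagation in plain Python. This is an illustration, not the library's implementation: the in-place update order and the tie-break rule (smallest label wins) are assumptions, and the seed argument is a hypothetical dictionary standing in for a seed property.

```python
from collections import defaultdict


def label_propagation(edges, max_iterations=10, seed=None):
    """Return (labels, ran_iterations, did_converge) for undirected (u, v) pairs."""
    neighbours = defaultdict(list)
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    # Each node starts in its own community unless a seed label is given.
    labels = {n: (seed or {}).get(n, n) for n in neighbours}
    ran_iterations, did_converge = 0, False
    for _ in range(max_iterations):
        ran_iterations += 1
        changed = False
        for node in sorted(neighbours):
            votes = defaultdict(int)
            for other in neighbours[node]:
                votes[labels[other]] += 1
            # ASSUMPTION: the most frequent label wins; ties go to the smallest.
            best = min(votes, key=lambda label: (-votes[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            did_converge = True
            break
    return labels, ran_iterations, did_converge


# Two disconnected triangles.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6)]
labels, ran_iterations, did_converge = label_propagation(edges)
print(labels, ran_iterations, did_converge)
```

If max_iterations is reached while labels are still changing, did_converge stays False and ran_iterations equals max_iterations.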

abstract write(G: GraphV2, write_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], node_weight_property: str | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) LabelPropagationWriteResult

Executes the Label Propagation algorithm and writes the results back to the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • min_community_size (int | None) – Minimum size for communities to be included in results.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • node_weight_property (str | None, default=None) – The property name for node weights

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

LabelPropagationWriteResult

pydantic model graphdatascience.procedure_surface.api.community.LabelPropagationMutateResult
field community_count: int
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.community.LabelPropagationStatsResult
field community_count: int
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.community.LabelPropagationWriteResult
field community_count: int
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
field write_millis: int
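
The community_count and community_distribution fields summarise community sizes. The sketch below derives comparable figures from a node-to-community mapping in plain Python. The keys used here (min, max, mean) are assumptions chosen for illustration, since this section does not list the keys the library returns, and the function name summarise is hypothetical.

```python
from collections import Counter


def summarise(assignments):
    """assignments: {nodeId: communityId} -> (community_count, size summary)."""
    sizes = sorted(Counter(assignments.values()).values())
    summary = {
        "min": sizes[0],
        "max": sizes[-1],
        "mean": sum(sizes) / len(sizes),
    }
    return len(sizes), summary


assignments = {1: 2, 2: 2, 3: 2, 4: 5, 5: 5}
community_count, distribution = summarise(assignments)
print(community_count, distribution)
```

The mixed integer and float values in the summary match the dict[str, int | float] annotation of community_distribution.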
class graphdatascience.procedure_surface.api.community.LeidenEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, consecutive_ids: bool = False, gamma: float = 1.0, include_intermediate_communities: bool = False, max_levels: int = 10, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, theta: float = 0.01, tolerance: float = 0.0001) EstimationResult

Estimates the memory requirements for running the Leiden algorithm.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • gamma (float, default=1.0) – The gamma parameter for the Leiden algorithm

  • include_intermediate_communities (bool, default=False) – Whether to include intermediate communities

  • max_levels (int, default=10) – The maximum number of levels

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • theta (float, default=0.01) – The theta parameter for the Leiden algorithm

  • tolerance (float) – Minimum change in scores between iterations.

Returns:

The memory estimation result

Return type:

EstimationResult
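
Assuming gamma plays the role of the resolution parameter in modularity, as in the standard formulation, its effect can be sketched for an unweighted, undirected graph as Q = Σ_c [e_c / m − gamma · (d_c / 2m)²]. Here e_c is the number of relationships inside community c, d_c is the total degree of its nodes, and m is the number of relationships. The function below is an illustration only, and the library may scale gamma differently internally.

```python
def modularity(edges, communities, gamma=1.0):
    """Modularity of a partition; `communities` maps node -> community ID."""
    m = len(edges)
    internal = {}    # relationships with both endpoints in the community
    degree_sum = {}  # total degree of the community's nodes
    for u, v in edges:
        cu, cv = communities[u], communities[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - gamma * (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Two triangles joined by the single relationship (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
communities = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
q_default = modularity(edges, communities)
q_high_gamma = modularity(edges, communities, gamma=2.0)
print(q_default, q_high_gamma)
```

A larger gamma penalises large communities more strongly, so the same partition scores lower and the optimiser tends towards more, smaller communities.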

abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, gamma: float = 1.0, include_intermediate_communities: bool = False, job_id: str | None = None, log_progress: bool = True, max_levels: int = 10, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, theta: float = 0.01, tolerance: float = 0.0001, username: str | None = None) LeidenMutateResult

Executes the Leiden community detection algorithm and writes the results to the in-memory graph as node properties.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • gamma (float, default=1.0) – The gamma parameter for the Leiden algorithm

  • include_intermediate_communities (bool, default=False) – Whether to include intermediate communities

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_levels (int, default=10) – The maximum number of levels

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • theta (float, default=0.01) – The theta parameter for the Leiden algorithm

  • tolerance (float) – Minimum change in scores between iterations.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

LeidenMutateResult

abstract stats(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, gamma: float = 1.0, include_intermediate_communities: bool = False, job_id: str | None = None, log_progress: bool = True, max_levels: int = 10, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, theta: float = 0.01, tolerance: float = 0.0001, username: str | None = None) LeidenStatsResult

Executes the Leiden community detection algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • gamma (float, default=1.0) – The gamma parameter for the Leiden algorithm

  • include_intermediate_communities (bool, default=False) – Whether to include intermediate communities

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_levels (int, default=10) – The maximum number of levels

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • theta (float, default=0.01) – The theta parameter for the Leiden algorithm

  • tolerance (float) – Minimum change in scores between iterations.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

LeidenStatsResult

abstract stream(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, gamma: float = 1.0, include_intermediate_communities: bool = False, job_id: str | None = None, log_progress: bool = True, max_levels: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, theta: float = 0.01, tolerance: float = 0.0001, username: str | None = None) DataFrame

Executes the Leiden community detection algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • gamma (float, default=1.0) – The gamma parameter for the Leiden algorithm

  • include_intermediate_communities (bool, default=False) – Whether to include intermediate communities

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_levels (int, default=10) – The maximum number of levels

  • min_community_size (int | None) – Minimum size for communities to be included in results.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • theta (float, default=0.01) – The theta parameter for the Leiden algorithm

  • tolerance (float) – Minimum change in scores between iterations.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

A DataFrame with columns: nodeId, communityId, intermediateCommunityIds

Return type:

DataFrame
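
The effect of min_community_size and consecutive_ids on streamed rows can be sketched as a post-processing step over (nodeId, communityId) pairs. The function below is a hypothetical illustration in plain Python. Whether the library renumbers before or after filtering is not stated in this section, so the order used here is an assumption.

```python
from collections import Counter


def postprocess(rows, min_community_size=None, consecutive_ids=False):
    """rows: list of (nodeId, communityId); returns filtered, renumbered rows."""
    sizes = Counter(community for _, community in rows)
    if min_community_size is not None:
        # Keep only nodes whose community is large enough.
        rows = [(n, c) for n, c in rows if sizes[c] >= min_community_size]
    if consecutive_ids:
        # ASSUMPTION: renumber surviving communities 0, 1, 2, ... in first-seen order.
        mapping = {}
        for _, community in rows:
            mapping.setdefault(community, len(mapping))
        rows = [(n, mapping[c]) for n, c in rows]
    return rows


rows = [(0, 17), (1, 17), (2, 42), (3, 99), (4, 99)]
result = postprocess(rows, min_community_size=2, consecutive_ids=True)
print(result)
```

Node 2 is dropped because its community has a single member, and the remaining community IDs 17 and 99 are mapped to 0 and 1.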

abstract write(G: GraphV2, write_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, gamma: float = 1.0, include_intermediate_communities: bool = False, job_id: str | None = None, log_progress: bool = True, max_levels: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, theta: float = 0.01, tolerance: float = 0.0001, username: str | None = None, write_concurrency: int | None = None) LeidenWriteResult

Executes the Leiden community detection algorithm and writes the results back to the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • gamma (float, default=1.0) – The gamma parameter for the Leiden algorithm

  • include_intermediate_communities (bool, default=False) – Whether to include intermediate communities

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_levels (int, default=10) – The maximum number of levels

  • min_community_size (int | None) – Minimum size for communities to be included in results.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • theta (float, default=0.01) – Controls the randomness when breaking a community into smaller ones.

  • tolerance (float) – Minimum change in scores between iterations.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

LeidenWriteResult

pydantic model graphdatascience.procedure_surface.api.community.LeidenMutateResult
field community_count: int
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field modularities: list[float]
field modularity: float
field mutate_millis: int
field node_count: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_levels: int
pydantic model graphdatascience.procedure_surface.api.community.LeidenStatsResult
field community_count: int
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field modularities: list[float]
field modularity: float
field node_count: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_levels: int
pydantic model graphdatascience.procedure_surface.api.community.LeidenWriteResult
field community_count: int
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field modularities: list[float]
field modularity: float
field node_count: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_levels: int
field write_millis: int
class graphdatascience.procedure_surface.api.community.LocalClusteringCoefficientEndpoints

Interface for LocalClusteringCoefficient algorithm endpoints.

abstract estimate(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool | None = False, triangle_count_property: str | None = None, username: str | None = None) EstimationResult

Estimates the LocalClusteringCoefficient algorithm memory requirements.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool | None) – Disable the memory guard.

  • triangle_count_property (str | None, default=None) – Property name for pre-computed triangle counts

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, *, mutate_property: str, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, triangle_count_property: str | None = None, username: str | None = None) LocalClusteringCoefficientMutateResult

Executes the LocalClusteringCoefficient algorithm and writes results back to the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • triangle_count_property (str | None, default=None) – Property name for pre-computed triangle counts

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Result containing clustering coefficient statistics and timing information

Return type:

LocalClusteringCoefficientMutateResult

abstract stats(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, triangle_count_property: str | None = None, username: str | None = None) LocalClusteringCoefficientStatsResult

Executes the LocalClusteringCoefficient algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • triangle_count_property (str | None, default=None) – Property name for pre-computed triangle counts

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Result containing clustering coefficient statistics and timing information

Return type:

LocalClusteringCoefficientStatsResult

abstract stream(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, triangle_count_property: str | None = None, username: str | None = None) DataFrame

Executes the LocalClusteringCoefficient algorithm and streams results.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • triangle_count_property (str | None, default=None) – Property name for pre-computed triangle counts

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

DataFrame containing nodeId and localClusteringCoefficient columns

Return type:

pandas.DataFrame
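
As a rough illustration of the quantity reported in the localClusteringCoefficient column, the following pure-Python sketch computes the standard local clustering coefficient, 2 * triangles / (degree * (degree - 1)), for an undirected graph. It does not call the library; the function name, the toy graph, and the choice of 0.0 for nodes with fewer than two neighbors are assumptions made for this example only.

```python
from itertools import combinations


def local_clustering_coefficients(adjacency):
    """Map each node to its local clustering coefficient.

    `adjacency` is an undirected graph given as {node: set_of_neighbors}.
    """
    coefficients = {}
    for node, neighbors in adjacency.items():
        degree = len(neighbors)
        if degree < 2:
            # Assumption of this sketch: nodes that cannot form a triangle get 0.0.
            coefficients[node] = 0.0
            continue
        # A triangle through `node` is a pair of its neighbors that are adjacent.
        triangles = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
        coefficients[node] = 2.0 * triangles / (degree * (degree - 1))
    return coefficients


# Hypothetical toy graph: triangle 0-1-2 plus node 3 attached only to node 2.
toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
coefficients = local_clustering_coefficients(toy_graph)
average_coefficient = sum(coefficients.values()) / len(coefficients)
print(coefficients, average_coefficient)
```

In the toy graph, node 2 has three neighbors but only one connected neighbor pair, so its coefficient is 1/3. When triangle counts are already available (the situation triangle_count_property is meant for), the counting step can be skipped and only the final division is needed.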

abstract write(G: GraphV2, *, write_property: str, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, triangle_count_property: str | None = None, username: str | None = None, write_concurrency: int | None = None) LocalClusteringCoefficientWriteResult

Executes the LocalClusteringCoefficient algorithm and writes results to the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • triangle_count_property (str | None, default=None) – Property name for pre-computed triangle counts

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Result containing clustering coefficient statistics and timing information

Return type:

LocalClusteringCoefficientWriteResult

pydantic model graphdatascience.procedure_surface.api.community.LocalClusteringCoefficientMutateResult
field average_clustering_coefficient: float
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_count: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.LocalClusteringCoefficientStatsResult
field average_clustering_coefficient: float
field compute_millis: int
field configuration: dict[str, Any]
field node_count: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.LocalClusteringCoefficientWriteResult
field average_clustering_coefficient: float
field compute_millis: int
field configuration: dict[str, Any]
field node_count: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.community.LouvainEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], tolerance: float = 0.0001, max_levels: int = 10, include_intermediate_communities: bool = False, max_iterations: int = 10, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_levels (int, default=10) – The maximum number of levels in the hierarchy

  • include_intermediate_communities (bool, default=False) – Whether to include intermediate community assignments

  • max_iterations (int) – Maximum number of iterations to run per level.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

An object containing the result of the estimation

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, tolerance: float = 0.0001, max_levels: int = 10, include_intermediate_communities: bool = False, max_iterations: int = 10, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) LouvainMutateResult

Runs the Louvain algorithm and stores the results in the graph catalog as a new node property.

The Louvain method is an algorithm to detect communities in large networks. It maximizes a modularity score for each community, where the modularity quantifies the quality of an assignment of nodes to communities by evaluating how much more densely connected the nodes within a community are, compared to how connected they would be in a random network. The Louvain algorithm is a hierarchical clustering algorithm that recursively merges communities into a single node and runs the modularity clustering on the condensed graphs.
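
The merge step mentioned above, where each community is collapsed into a single node before the next level runs, can be sketched in plain Python. This is only an illustration of the idea under stated assumptions, not the library's implementation: the function name condense, the edge-list format, and the toy graph are all hypothetical.

```python
from collections import defaultdict


def condense(edges, community_of):
    """Collapse each community into one super-node and sum the edge weights.

    `edges` is a list of undirected (source, target, weight) tuples and
    `community_of` maps every node to its community ID.
    """
    condensed = defaultdict(float)
    for source, target, weight in edges:
        a, b = community_of[source], community_of[target]
        # Use a canonical (smaller, larger) key for undirected super-edges;
        # edges inside one community become self-loops where a == b.
        key = (a, b) if a <= b else (b, a)
        condensed[key] += weight
    return dict(condensed)


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (2, 3, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
]
community_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
condensed = condense(edges, community_of)
print(condensed)
```

Each triangle becomes a super-node with a self-loop of weight 3.0, and the bridge becomes a single super-edge of weight 1.0. Keeping the self-loops preserves the total edge weight, which the modularity computation on the condensed graph relies on.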

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_levels (int, default=10) – The maximum number of levels in the hierarchy

  • include_intermediate_communities (bool, default=False) – Whether to include intermediate communities

  • max_iterations (int) – Maximum number of iterations to run per level.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None, default=None) – Identifier for the computation.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm metrics and statistics

Return type:

LouvainMutateResult

abstract stats(G: GraphV2, tolerance: float = 0.0001, max_levels: int = 10, include_intermediate_communities: bool = False, max_iterations: int = 10, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) LouvainStatsResult

Executes the Louvain algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • tolerance (float) – Minimum change in scores between iterations.

  • max_levels (int, default=10) – The maximum number of levels in the hierarchy

  • include_intermediate_communities (bool, default=False) – Whether to include intermediate community assignments

  • max_iterations (int) – Maximum number of iterations to run per level.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm metrics and statistics

Return type:

LouvainStatsResult

abstract stream(G: GraphV2, tolerance: float = 0.0001, max_levels: int = 10, include_intermediate_communities: bool = False, max_iterations: int = 10, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None, min_community_size: int | None = None) DataFrame

Executes the Louvain algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • tolerance (float) – Minimum change in scores between iterations.

  • max_levels (int, default=10) – The maximum number of levels in the hierarchy

  • include_intermediate_communities (bool, default=False) – Whether to include intermediate community assignments

  • max_iterations (int) – Maximum number of iterations to run per level.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • min_community_size (int | None) – Minimum size for communities to be included in results.

Returns:

DataFrame with the algorithm results

Return type:

DataFrame
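
To make the intent of min_community_size and consecutive_ids more concrete, here is a small pure-Python sketch of the kind of post-processing these two options describe. It is not the library's code: the function name, the plain-dict input, and in particular the order of operations (filter first, then renumber) are assumptions of this example.

```python
from collections import Counter


def postprocess(assignments, min_community_size=None, consecutive_ids=False):
    """Filter and renumber a {node_id: community_id} mapping.

    Assumption of this sketch: small communities are dropped first and the
    remaining community IDs are then renumbered 0, 1, 2, ... in order of
    first appearance.
    """
    sizes = Counter(assignments.values())
    kept = {
        node: community
        for node, community in assignments.items()
        if min_community_size is None or sizes[community] >= min_community_size
    }
    if consecutive_ids:
        remap = {}
        for community in kept.values():
            # Assign the next free ID the first time a community is seen.
            remap.setdefault(community, len(remap))
        kept = {node: remap[community] for node, community in kept.items()}
    return kept


# Hypothetical raw result: community 42 contains a single node.
raw = {0: 7, 1: 7, 2: 7, 3: 42, 4: 13, 5: 13}
filtered = postprocess(raw, min_community_size=2, consecutive_ids=True)
print(filtered)
```

With a minimum size of 2, node 3 is dropped because its community has only one member, and the surviving community IDs 7 and 13 are renumbered to 0 and 1.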

abstract write(G: GraphV2, write_property: str, tolerance: float = 0.0001, max_levels: int = 10, include_intermediate_communities: bool = False, max_iterations: int = 10, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None, write_concurrency: int | None = None, min_community_size: int | None = None) LouvainWriteResult

Executes the Louvain algorithm and writes the results to the Neo4j database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_levels (int, default=10) – The maximum number of levels in the hierarchy

  • include_intermediate_communities (bool, default=False) – Whether to include intermediate community assignments

  • max_iterations (int) – Maximum number of iterations to run per level.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • consecutive_ids (bool) – Use consecutive IDs for the communities.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

  • min_community_size (int | None) – Minimum size for communities to be included in results.

Returns:

Algorithm metrics and statistics

Return type:

LouvainWriteResult

pydantic model graphdatascience.procedure_surface.api.community.LouvainMutateResult
field community_count: int
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field modularities: list[float]
field modularity: float
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_levels: int
pydantic model graphdatascience.procedure_surface.api.community.LouvainStatsResult
field community_count: int
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field modularities: list[float]
field modularity: float
field post_processing_millis: int
field pre_processing_millis: int
field ran_levels: int
pydantic model graphdatascience.procedure_surface.api.community.LouvainWriteResult
field community_count: int
field community_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field modularities: list[float]
field modularity: float
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_levels: int
field write_millis: int
class graphdatascience.procedure_surface.api.community.MaxKCutEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, iterations: int = 8, k: int = 2, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, random_seed: int | None = None, vns_max_neighborhood_order: int = 0) EstimationResult

Estimate the memory requirements for running the Approximate Maximum k-cut algorithm.

This method provides memory estimates without actually running the algorithm, helping you determine if you have sufficient memory available.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • concurrency (int | None) – Number of concurrent threads to use.

  • iterations (int) – Number of iterations to run.

  • k (int) – Number of communities to detect.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • vns_max_neighborhood_order (int | None, default=0) – The maximum neighborhood order for the Variable Neighborhood Search

Returns:

The memory estimation result including required memory in bytes and as heap percentage

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, iterations: int = 8, job_id: str | None = None, k: int = 2, log_progress: bool = True, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, sudo: bool = False, username: str | None = None, vns_max_neighborhood_order: int = 0) MaxKCutMutateResult

Executes the Approximate Maximum k-cut algorithm and writes the results to the in-memory graph as node properties.

The Approximate Maximum k-cut algorithm is a community detection algorithm that partitions a graph into k communities such that the sum of weights of edges between different communities is maximized. It uses a variable neighborhood search (VNS) approach to find high-quality cuts.
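
The objective described above, the total weight of edges whose endpoints fall into different communities, is simple to evaluate for a given assignment. The sketch below does only that evaluation; it does not perform the variable neighborhood search, and the function name, edge-list format, and toy graph are hypothetical.

```python
def cut_cost(edges, community_of):
    """Sum the weights of edges whose endpoints lie in different communities."""
    return sum(
        weight
        for source, target, weight in edges
        if community_of[source] != community_of[target]
    )


# Hypothetical weighted 4-cycle: 0-1, 1-2, 2-3, 3-0.
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (3, 0, 2.0)]

# Two candidate assignments for k = 2.
alternating = {0: 0, 1: 1, 2: 0, 3: 1}  # every edge crosses the cut
halves = {0: 0, 1: 0, 2: 1, 3: 1}       # only edges 1-2 and 3-0 cross

print(cut_cost(edges, alternating), cut_cost(edges, halves))
```

The alternating assignment cuts every edge and therefore has the higher cost, which is the direction the algorithm optimizes in: unlike modularity-based methods, a better result here means more weight between communities, not within them.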

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • iterations (int) – Number of iterations to run.

  • job_id (str | None) – Identifier for the computation.

  • k (int) – Number of communities to detect.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • vns_max_neighborhood_order (int | None, default=0) – The maximum neighborhood order for the Variable Neighborhood Search. Higher values may lead to better results but increase computation time.

Returns:

Algorithm metrics and statistics including the cut cost and processing times

Return type:

MaxKCutMutateResult

abstract stream(G: GraphV2, *, concurrency: int | None = None, iterations: int = 8, job_id: str | None = None, k: int = 2, log_progress: bool = True, min_community_size: int | None = None, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, sudo: bool = False, username: str | None = None, vns_max_neighborhood_order: int = 0) DataFrame

Executes the Approximate Maximum k-cut algorithm and returns a stream of results.

The Approximate Maximum k-cut algorithm partitions a graph into k communities to maximize the cut cost. This method returns the community assignment for each node as a stream.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • iterations (int) – Number of iterations to run.

  • job_id (str | None) – Identifier for the computation.

  • k (int) – Number of communities to detect.

  • log_progress (bool) – Display progress logging.

  • min_community_size (int | None) – Minimum size for communities to be included in results.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • vns_max_neighborhood_order (int | None, default=0) – The maximum neighborhood order for the Variable Neighborhood Search. Higher values may lead to better results but increase computation time.

Returns:

A DataFrame with columns: nodeId (the node identifier) and communityId (the community assignment for the node)

Return type:

DataFrame

pydantic model graphdatascience.procedure_surface.api.community.MaxKCutMutateResult
field compute_millis: int
field configuration: dict[str, Any]
field cut_cost: float
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
class graphdatascience.procedure_surface.api.community.ModularityEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], community_property: str, *, concurrency: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None) EstimationResult

Estimate the memory consumption of the modularity algorithm.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • community_property (str) – Name of the node property containing community assignments.

  • concurrency (int | None) – Number of concurrent threads to use.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

An object containing the result of the estimation

Return type:

EstimationResult

abstract stats(G: GraphV2, community_property: str, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, sudo: bool = False, username: str | None = None) ModularityStatsResult

Executes the Modularity algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • community_property (str) – Name of the node property containing community assignments.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm statistics including communityCount, modularity, nodeCount, and relationshipCount

Return type:

ModularityStatsResult

abstract stream(G: GraphV2, community_property: str, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, sudo: bool = False, username: str | None = None) DataFrame

Executes the Modularity algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • community_property (str) – Name of the node property containing community assignments.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

DataFrame with the algorithm results containing ‘communityId’ and ‘modularity’ columns

Return type:

DataFrame
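
As a rough illustration of what a per-community modularity score measures, the following sketch computes it in plain Python for an undirected, unweighted edge list. It is not the library's implementation: the function name `community_modularity`, the edge-list input, and the use of the standard formula Q_c = L_c / m - (d_c / 2m)^2 (internal edges L_c, total degree d_c, total edges m) are assumptions made for this example, and relationship weights are ignored.

```python
from collections import defaultdict


def community_modularity(edges, community_of):
    """Return {community_id: modularity contribution} for an undirected, unweighted graph.

    Hypothetical helper: `edges` is a list of (u, v) pairs and `community_of`
    maps each node to its community id (the role played by `community_property`).
    """
    m = len(edges)
    if m == 0:
        return {}
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return {
        c: internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    }


# Two triangles joined by a single bridging edge.
example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
example_communities = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
scores = community_modularity(example_edges, example_communities)
```

Summing the per-community values gives the overall modularity of the partition under the same assumptions.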

class graphdatascience.procedure_surface.api.community.ModularityOptimizationEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, batch_size: int = 10000, concurrency: int | None = None, consecutive_ids: bool = False, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, tolerance: float = 0.0001) EstimationResult

Estimates the memory consumption for running the Modularity Optimization algorithm.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • batch_size (int) – Number of nodes to process in each batch.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • tolerance (float) – Minimum change in scores between iterations.

Returns:

Estimated memory consumption and other metrics

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, *, batch_size: int = 10000, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, tolerance: float = 0.0001, username: str | None = None) ModularityOptimizationMutateResult

Executes the Modularity Optimization algorithm and writes the results to the in-memory graph as node properties.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • batch_size (int) – Number of nodes to process in each batch.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • tolerance (float) – Minimum change in scores between iterations.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Result containing community statistics and timing information

Return type:

ModularityOptimizationMutateResult

abstract stats(G: GraphV2, *, batch_size: int = 10000, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, tolerance: float = 0.0001, username: str | None = None) ModularityOptimizationStatsResult

Executes the Modularity Optimization algorithm and returns statistics about the communities.

Parameters:
  • G (GraphV2) – Graph object to use

  • batch_size (int) – Number of nodes to process in each batch.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • tolerance (float) – Minimum change in scores between iterations.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Result containing community statistics and timing information

Return type:

ModularityOptimizationStatsResult

abstract stream(G: GraphV2, *, batch_size: int = 10000, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, tolerance: float = 0.0001, username: str | None = None) DataFrame

Executes the Modularity Optimization algorithm and returns the results as a DataFrame.

Parameters:
  • G (GraphV2) – Graph object to use

  • batch_size (int) – Number of nodes to process in each batch.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • min_community_size (int | None) – Minimum size for communities to be included in results.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • tolerance (float) – Minimum change in scores between iterations.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

A DataFrame with columns ‘nodeId’ and ‘communityId’

Return type:

DataFrame
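
The effect of `min_community_size` on streamed results can be pictured with a small post-processing sketch. It assumes the rows have already been converted to plain dictionaries with the `nodeId` and `communityId` keys named above (for instance via pandas' `DataFrame.to_dict("records")`); the helper name `group_communities` is hypothetical and the filtering shown is an approximation of the documented behaviour, not the library's code.

```python
from collections import defaultdict


def group_communities(rows, min_community_size=None):
    """Group node ids by community and drop communities below the size threshold.

    `rows` is an iterable of dicts with 'nodeId' and 'communityId' keys.
    A threshold of None keeps every community.
    """
    members = defaultdict(list)
    for row in rows:
        members[row["communityId"]].append(row["nodeId"])
    return {
        community: sorted(nodes)
        for community, nodes in members.items()
        if min_community_size is None or len(nodes) >= min_community_size
    }


example_rows = [
    {"nodeId": 0, "communityId": 7},
    {"nodeId": 1, "communityId": 7},
    {"nodeId": 2, "communityId": 9},
]
large_only = group_communities(example_rows, min_community_size=2)
```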

abstract write(G: GraphV2, write_property: str, *, batch_size: int = 10000, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, tolerance: float = 0.0001, username: str | None = None, write_concurrency: int | None = None) ModularityOptimizationWriteResult

Executes the Modularity Optimization algorithm and writes the results back to the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • batch_size (int) – Number of nodes to process in each batch.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • max_iterations (int) – Maximum number of iterations to run.

  • min_community_size (int | None) – Minimum size for communities to be included in results.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • seed_property (str | None) – Name of the property to be used for the initial value of a node.

  • sudo (bool) – Disable the memory guard.

  • tolerance (float) – Minimum change in scores between iterations.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Result containing community statistics and timing information

Return type:

ModularityOptimizationWriteResult

pydantic model graphdatascience.procedure_surface.api.community.ModularityOptimizationMutateResult
field community_count: int
field community_distribution: dict[str, float]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field modularity: float
field mutate_millis: int
field nodes: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.community.ModularityOptimizationStatsResult
field community_count: int
field community_distribution: dict[str, float]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field modularity: float
field nodes: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.community.ModularityOptimizationWriteResult
field community_count: int
field community_distribution: dict[str, float]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field modularity: float
field nodes: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
field write_millis: int
pydantic model graphdatascience.procedure_surface.api.community.ModularityStatsResult

Result object for the modularity stats algorithm.

field community_count: int
field compute_millis: int
field configuration: dict[str, Any]
field modularity: float
field node_count: int
field post_processing_millis: int
field pre_processing_millis: int
field relationship_count: int
class graphdatascience.procedure_surface.api.community.SccEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, consecutive_ids: bool = False, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*']) EstimationResult

Estimates the memory consumption for running the Strongly Connected Components (SCC) algorithm.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) SccMutateResult

Runs the Strongly Connected Components algorithm and stores the results in the graph catalog as a new node property.

The Strongly Connected Components (SCC) algorithm finds maximal sets of connected nodes in a directed graph. A set is considered a strongly connected component if there is a directed path between each pair of nodes within the set.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • job_id (str | None, default=None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

SccMutateResult
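
To make the definition of a strongly connected component concrete, here is a minimal pure-Python sketch using Kosaraju's two-pass algorithm. It is an illustration only and says nothing about how the library computes components; the function name `strongly_connected_components` and the node/edge-list inputs are assumptions. Component ids are handed out consecutively from 0, loosely mirroring what `consecutive_ids` describes.

```python
def strongly_connected_components(nodes, edges):
    """Map each node to a component id using Kosaraju's two-pass algorithm."""
    forward = {n: [] for n in nodes}
    backward = {n: [] for n in nodes}
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    visited, finish_order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                # All neighbours explored: the node is finished.
                finish_order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing finish time.
    component_of, next_id = {}, 0
    for root in reversed(finish_order):
        if root in component_of:
            continue
        component_of[root] = next_id
        pending = [root]
        while pending:
            node = pending.pop()
            for prev in backward[node]:
                if prev not in component_of:
                    component_of[prev] = next_id
                    pending.append(prev)
        next_id += 1
    return component_of


# A directed cycle 0 -> 1 -> 2 -> 0 with an extra edge leaving it to node 3.
components = strongly_connected_components(
    [0, 1, 2, 3], [(0, 1), (1, 2), (2, 0), (2, 3)]
)
```

Nodes 0, 1 and 2 can all reach one another and share a component, while node 3 has no path back into the cycle and forms its own.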

abstract stats(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) SccStatsResult

Executes the SCC algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics

Return type:

SccStatsResult

abstract stream(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) DataFrame

Executes the SCC algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

DataFrame with the algorithm results

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) SccWriteResult

Executes the SCC algorithm and writes the results to the Neo4j database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

SccWriteResult

pydantic model graphdatascience.procedure_surface.api.community.SccMutateResult
field component_count: int
field component_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.SccStatsResult
field component_count: int
field component_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.SccWriteResult
field component_count: int
field component_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.community.SllpaEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, max_iterations: int, concurrency: int | None = None, min_association_strength: float = 0.2, node_labels: list[str] = ['*'], partitioning: str = 'RANGE', relationship_types: list[str] = ['*']) EstimationResult

Estimates the memory consumption for running the Speaker-Listener Label Propagation algorithm (SLLPA).

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • concurrency (int | None) – Number of concurrent threads to use.

  • min_association_strength (float) – Minimum association strength for community assignment.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • partitioning (str) – Partitioning configuration for the algorithm.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • max_iterations (int) – Maximum number of iterations to run.

Returns:

An object containing the memory estimation

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, *, max_iterations: int, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, min_association_strength: float = 0.2, node_labels: list[str] = ['*'], partitioning: str = 'RANGE', relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) SllpaMutateResult

Executes the Speaker-Listener Label Propagation algorithm (SLLPA) and writes the results to the in-memory graph as node properties.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • max_iterations (int) – Maximum number of iterations to run.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • min_association_strength (float) – Minimum association strength for community assignment.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • partitioning (str) – Partitioning configuration for the algorithm.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

An object containing metadata about the algorithm execution and the mutation

Return type:

SllpaMutateResult

abstract stats(G: GraphV2, *, max_iterations: int, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, min_association_strength: float = 0.2, node_labels: list[str] = ['*'], partitioning: str = 'RANGE', relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) SllpaStatsResult

Executes the Speaker-Listener Label Propagation algorithm (SLLPA) and returns statistics about the communities.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • min_association_strength (float) – Minimum association strength for community assignment.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • partitioning (str) – Partitioning configuration for the algorithm.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • max_iterations (int) – Maximum number of iterations to run.

Returns:

An object containing statistics about the algorithm execution

Return type:

SllpaStatsResult
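
In the SLLPA signatures, `max_iterations` follows the bare `*` and has no default, so it must be passed by keyword on every call. The sketch below shows that calling convention with a stand-in class; `RecordingSllpa` is hypothetical, only mimics the keyword-only shape of `stats`, and returns the arguments it received instead of running anything.

```python
from typing import Any


class RecordingSllpa:
    """Stand-in that mimics the keyword-only shape of SllpaEndpoints.stats (not the real class)."""

    def stats(
        self,
        G: Any,
        *,
        max_iterations: int,
        min_association_strength: float = 0.2,
        partitioning: str = "RANGE",
        **other: Any,
    ) -> dict:
        # Echo the effective configuration instead of running an algorithm.
        return {
            "max_iterations": max_iterations,
            "min_association_strength": min_association_strength,
            "partitioning": partitioning,
            **other,
        }


sllpa = RecordingSllpa()
graph = object()  # placeholder for a GraphV2 handle

# Required keyword supplied; the remaining options fall back to their defaults.
config = sllpa.stats(graph, max_iterations=20, concurrency=4)

# Omitting max_iterations fails before any work is done.
try:
    sllpa.stats(graph)
except TypeError as err:
    missing_error = str(err)
```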

abstract stream(G: GraphV2, *, max_iterations: int, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, min_association_strength: float = 0.2, node_labels: list[str] = ['*'], partitioning: str = 'RANGE', relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) DataFrame

Executes the Speaker-Listener Label Propagation algorithm (SLLPA) and returns the results as a DataFrame.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • min_association_strength (float) – Minimum association strength for community assignment.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • partitioning (str) – Partitioning configuration for the algorithm.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • max_iterations (int) – Maximum number of iterations to run.

Returns:

DataFrame containing node IDs and their community values

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, *, max_iterations: int, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, min_association_strength: float = 0.2, node_labels: list[str] = ['*'], partitioning: str = 'RANGE', relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) SllpaWriteResult

Executes the Speaker-Listener Label Propagation algorithm (SLLPA) and writes the results back to the database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • log_progress (bool) – Display progress logging.

  • min_association_strength (float) – Minimum association strength for community assignment.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • partitioning (str) – Partitioning configuration for the algorithm.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

  • max_iterations (int) – Maximum number of iterations to run.

Returns:

An object containing metadata about the algorithm execution and the write operation

Return type:

SllpaWriteResult

pydantic model graphdatascience.procedure_surface.api.community.SllpaMutateResult
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field mutate_millis: int
field node_properties_written: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.community.SllpaStatsResult
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.community.SllpaWriteResult
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field node_properties_written: int
field pre_processing_millis: int
field ran_iterations: int
field write_millis: int
class graphdatascience.procedure_surface.api.community.TriangleCountEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, label_filter: list[str] | None = None, max_degree: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*']) EstimationResult

Estimate the memory requirements for running the Triangle Count algorithm.

This method provides memory estimates without actually running the algorithm, helping you determine if you have sufficient memory available.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • concurrency (int | None) – Number of concurrent threads to use.

  • label_filter (list[str] | None, default=None) – Filter triangles by node labels. Only triangles where all nodes have one of the specified labels will be counted.

  • max_degree (int | None, default=None) – Maximum degree of nodes to consider. Nodes with higher degrees will be excluded from triangle counting to improve performance.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

Returns:

The memory estimation result including required memory in bytes and as heap percentage

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, job_id: str | None = None, label_filter: list[str] | None = None, log_progress: bool = True, max_degree: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) TriangleCountMutateResult

Executes the Triangle Count algorithm and writes the results to the in-memory graph as node properties.

The Triangle Count algorithm computes the number of triangles each node participates in.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • label_filter (list[str] | None, default=None) – Filter triangles by node labels. Only triangles where all nodes have one of the specified labels will be counted.

  • log_progress (bool) – Display progress logging.

  • max_degree (int | None, default=None) – Maximum degree of nodes to consider. Nodes with higher degrees will be excluded from triangle counting to improve performance.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm metrics and statistics including the global triangle count and processing times

Return type:

TriangleCountMutateResult

abstract stats(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, label_filter: list[str] | None = None, log_progress: bool = True, max_degree: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) TriangleCountStatsResult

Executes the Triangle Count algorithm and returns statistics about the computation.

This method computes triangle counts without storing results in the graph, providing aggregate statistics about the triangle structure of the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • label_filter (list[str] | None, default=None) – Filter triangles by node labels. Only triangles where all nodes have one of the specified labels will be counted.

  • log_progress (bool) – Display progress logging.

  • max_degree (int | None, default=None) – Maximum degree of nodes to consider. Nodes with higher degrees will be excluded from triangle counting to improve performance.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

Algorithm statistics including the global triangle count and processing times

Return type:

TriangleCountStatsResult

abstract stream(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, label_filter: list[str] | None = None, log_progress: bool = True, max_degree: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) DataFrame

Executes the Triangle Count algorithm and returns a stream of results.

The Triangle Count algorithm computes the number of triangles each node participates in. This method returns the triangle count for each node as a stream.

Parameters:
  • G (GraphV2) – Graph object to use

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • label_filter (list[str] | None, default=None) – Filter triangles by node labels. Only triangles where all nodes have one of the specified labels will be counted.

  • log_progress (bool) – Display progress logging.

  • max_degree (int | None, default=None) – Maximum degree of nodes to consider. Nodes with higher degrees will be excluded from triangle counting to improve performance.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

Returns:

A DataFrame with columns:

  • nodeId – The node identifier

  • triangleCount – The number of triangles the node participates in

Return type:

DataFrame
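The per-node result described above can be illustrated without a running database. The following sketch computes one row per node with the nodeId and triangleCount columns named in the return description. The function triangle_count_rows and the list-of-dicts row format are hypothetical stand-ins for the returned DataFrame.

```python
def triangle_count_rows(edges):
    """Return one {"nodeId", "triangleCount"} row per node of an undirected graph."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    rows = []
    for node in sorted(adjacency):
        nbrs = adjacency[node]
        # A triangle through `node` is a pair of its neighbours that are
        # themselves adjacent; the sum sees every such pair twice.
        closed_pairs = sum(len(nbrs & adjacency[n]) for n in nbrs)
        rows.append({"nodeId": node, "triangleCount": closed_pairs // 2})
    return rows


rows = triangle_count_rows([(0, 1), (1, 2), (0, 2), (2, 3)])
for row in rows:
    print(row)

# Every triangle is counted once at each of its three corners.
print(sum(row["triangleCount"] for row in rows) // 3)
```

Summing the per-node counts and dividing by three recovers the global triangle count, which is the aggregate that the stats, mutate, and write results expose as global_triangle_count.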

abstract write(G: GraphV2, write_property: str, *, concurrency: int | None = None, job_id: str | None = None, label_filter: list[str] | None = None, log_progress: bool = True, max_degree: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) TriangleCountWriteResult

Executes the Triangle Count algorithm and writes the results back to the database.

This method computes triangle counts and writes the results directly to the Neo4j database as node properties, making them available for subsequent Cypher queries.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • label_filter (list[str] | None, default=None) – Filter triangles by node labels. Only triangles where all nodes have one of the specified labels will be counted.

  • log_progress (bool) – Display progress logging.

  • max_degree (int | None, default=None) – Maximum degree of nodes to consider. Nodes with higher degrees will be excluded from triangle counting to improve performance.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • sudo (bool) – Disable the memory guard.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the global triangle count and processing times

Return type:

TriangleCountWriteResult
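The label_filter parameter is easiest to understand on a small example. The sketch below is a brute-force illustration, not the library's implementation. It reads "all nodes have one of the specified labels" as "each of the three nodes carries at least one of the listed labels". The function name and the labels_of mapping are hypothetical.

```python
from itertools import combinations


def count_label_filtered_triangles(edges, labels_of, label_filter=None):
    """Count triangles whose three nodes each carry a label from label_filter."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    allowed = None if label_filter is None else set(label_filter)

    def passes(node):
        # With no filter every node qualifies; otherwise one matching label suffices.
        return allowed is None or bool(allowed & set(labels_of.get(node, ())))

    count = 0
    # Brute force over node triples: fine for a demonstration, too slow for real graphs.
    for a, b, c in combinations(sorted(adjacency), 3):
        is_triangle = b in adjacency[a] and c in adjacency[a] and c in adjacency[b]
        if is_triangle and passes(a) and passes(b) and passes(c):
            count += 1
    return count


edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4)]
labels_of = {0: ["Person"], 1: ["Person"], 2: ["Person"], 3: ["Person"], 4: ["Company"]}

print(count_label_filtered_triangles(edges, labels_of))
print(count_label_filtered_triangles(edges, labels_of, label_filter=["Person"]))
```

Under this reading, the triangle 2-3-4 is dropped when filtering on Person because node 4 only carries the Company label, even though the other two corners qualify.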

pydantic model graphdatascience.procedure_surface.api.community.TriangleCountMutateResult
field compute_millis: int
field configuration: dict[str, Any]
field global_triangle_count: int
field mutate_millis: int
field node_count: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.TriangleCountStatsResult
field compute_millis: int
field configuration: dict[str, Any]
field global_triangle_count: int
field node_count: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.TriangleCountWriteResult
field compute_millis: int
field configuration: dict[str, Any]
field global_triangle_count: int
field node_count: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.community.WccEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], threshold: float = 0.0, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • threshold (float, default=0.0) – The minimum required weight to consider a relationship during traversal

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • seed_property (str | None) – Name of the node property to use as the initial component value of a node.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, threshold: float = 0.0, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) WccMutateResult

Runs the Weakly Connected Components algorithm and stores the results in the graph catalog as a new node property.

The Weakly Connected Components (WCC) algorithm finds sets of connected nodes in directed and undirected graphs where two nodes are connected if there exists a path between them. In contrast to Strongly Connected Components (SCC), the direction of relationships on the path between two nodes is not considered.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • threshold (float, default=0.0) – The minimum required weight to consider a relationship during traversal

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None, default=None) – Identifier for the computation.

  • seed_property (str | None) – Name of the node property to use as the initial component value of a node.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm metrics and statistics

Return type:

WccMutateResult
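The definition of weak connectivity above, together with the threshold parameter, can be sketched with a small union-find. This is an illustration only, not the library's implementation. The function name and the (source, target, weight) tuple format are hypothetical. The sketch also assumes that a relationship is kept when its weight is strictly greater than threshold; whether the boundary is inclusive is not stated in this reference.

```python
def weakly_connected_components(nodes, relationships, threshold=0.0):
    """Map each node to a component id, ignoring relationship direction."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for source, target, weight in relationships:
        # Assumption: only relationships with weight above the threshold count.
        if weight > threshold:
            root_s, root_t = find(source), find(target)
            if root_s != root_t:
                # Keep the smaller id as root so the component id is stable.
                parent[max(root_s, root_t)] = min(root_s, root_t)

    return {n: find(n) for n in nodes}


nodes = [0, 1, 2, 3, 4]
relationships = [(0, 1, 1.0), (2, 1, 0.5), (3, 4, 0.2)]

print(weakly_connected_components(nodes, relationships))
print(weakly_connected_components(nodes, relationships, threshold=0.3))
```

Nodes 0 and 2 both point at node 1, so no directed path connects them, yet they land in the same component because direction is ignored. Raising the threshold to 0.3 discards the 3-to-4 relationship and splits that pair into two singleton components.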

abstract stats(G: GraphV2, threshold: float = 0.0, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) WccStatsResult

Executes the WCC algorithm and returns statistics.

Parameters:
  • G (GraphV2) – Graph object to use

  • threshold (float, default=0.0) – The minimum required weight to consider a relationship during traversal

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • seed_property (str | None) – Name of the node property to use as the initial component value of a node.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm metrics and statistics

Return type:

WccStatsResult

abstract stream(G: GraphV2, min_component_size: int | None = None, threshold: float = 0.0, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) DataFrame

Executes the WCC algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • min_component_size (int | None, default=None) – Don’t stream components with fewer nodes than this

  • threshold (float, default=0.0) – The minimum required weight to consider a relationship during traversal

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • seed_property (str | None) – Name of the node property to use as the initial component value of a node.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

DataFrame with the algorithm results

Return type:

DataFrame
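As a rough illustration of min_component_size and consecutive_ids, the sketch below post-processes a list of result rows. Several things here are assumptions rather than documented behaviour: the column names nodeId and componentId, the choice to renumber after filtering, and numbering that starts at 0. The helper filter_and_renumber is hypothetical.

```python
from collections import Counter


def filter_and_renumber(rows, min_component_size=None, consecutive_ids=False):
    """Drop small components and optionally remap component ids to 0, 1, 2, ..."""
    sizes = Counter(row["componentId"] for row in rows)

    if min_component_size is not None:
        # Keep only nodes whose component has at least min_component_size members.
        rows = [r for r in rows if sizes[r["componentId"]] >= min_component_size]

    if consecutive_ids:
        mapping = {}
        for r in rows:
            # Assign the next free id the first time a component is seen.
            mapping.setdefault(r["componentId"], len(mapping))
        rows = [
            {"nodeId": r["nodeId"], "componentId": mapping[r["componentId"]]}
            for r in rows
        ]
    return rows


rows = [
    {"nodeId": 0, "componentId": 0},
    {"nodeId": 1, "componentId": 0},
    {"nodeId": 2, "componentId": 3},
    {"nodeId": 3, "componentId": 7},
    {"nodeId": 4, "componentId": 7},
]

for row in filter_and_renumber(rows, min_component_size=2, consecutive_ids=True):
    print(row)
```

The singleton component 3 is removed by the size filter, and the surviving components 0 and 7 are renumbered to 0 and 1.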

abstract write(G: GraphV2, write_property: str, min_component_size: int | None = None, threshold: float = 0.0, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None, write_concurrency: int | None = None) WccWriteResult

Executes the WCC algorithm and writes the results to the Neo4j database.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • min_component_size (int | None, default=None) – Don’t write components with fewer nodes than this

  • threshold (float, default=0.0) – The minimum required weight to consider a relationship during traversal

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • seed_property (str | None) – Name of the node property to use as the initial component value of a node.

  • consecutive_ids (bool) – Use consecutive IDs for the components.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

WccWriteResult

pydantic model graphdatascience.procedure_surface.api.community.WccMutateResult
field component_count: int
field component_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.WccStatsResult
field component_count: int
field component_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.community.WccWriteResult
field component_count: int
field component_distribution: dict[str, int | float]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
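The component_count and component_distribution fields summarise component sizes. The sketch below shows how such a summary could be derived from a node-to-component mapping. The keys min, max, and mean are assumptions chosen for illustration; this reference only states that the field is a dict[str, int | float]. The function summarize_components is hypothetical.

```python
from collections import Counter
from statistics import mean


def summarize_components(component_of):
    """Summarise component sizes from a {node_id: component_id} mapping."""
    sizes = list(Counter(component_of.values()).values())
    return {
        "component_count": len(sizes),
        # Assumed keys; the real distribution may contain more entries.
        "component_distribution": {
            "min": min(sizes),
            "max": max(sizes),
            "mean": mean(sizes),
        },
    }


component_of = {0: 0, 1: 0, 2: 0, 3: 3, 4: 3}
summary = summarize_components(component_of)
print(summary["component_count"])
print(summary["component_distribution"])
```

Here the five nodes fall into two components of sizes 3 and 2, so the count is 2 and the sizes range from 2 to 3 with a mean of 2.5.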