Community Algorithms¶
- class graphdatascience.procedure_surface.api.community.CliqueCountingEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], concurrency: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*']) EstimationResult¶
Estimates the memory requirements for running the clique counting algorithm.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
concurrency (int | None) – Number of concurrent threads to use.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
- Returns:
The memory estimation result
- Return type:
EstimationResult
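The node_labels and relationship_types filters described above keep an element when it matches any of the given values. The sketch below illustrates that rule for node labels in plain Python. It assumes that the default '*' selects everything; `include_node` and the toy `nodes` mapping are hypothetical names and are not part of the library.

```python
def include_node(labels_on_node: set[str], node_labels: list[str]) -> bool:
    """Return True if a node passes the label filter.

    Assumption: '*' selects every node; otherwise a node is kept when it
    carries at least one of the requested labels.
    """
    if "*" in node_labels:
        return True
    return any(label in labels_on_node for label in node_labels)


# Hypothetical toy data: node id -> labels attached to that node.
nodes = {
    0: {"Person"},
    1: {"Person", "Admin"},
    2: {"Company"},
}

kept = [
    node_id
    for node_id, labels in nodes.items()
    if include_node(labels, ["Admin", "Company"])
]
print(kept)  # [1, 2]
```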
- abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) CliqueCountingMutateResult¶
Executes the clique counting algorithm and writes the results to the in-memory graph as node properties.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
CliqueCountingMutateResult
- abstract stats(G: GraphV2, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) CliqueCountingStatsResult¶
Executes the clique counting algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
CliqueCountingStatsResult
- abstract stream(G: GraphV2, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool | None = False, username: str | None = None) DataFrame¶
Executes the clique counting algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool | None) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
DataFrame with the algorithm results
- Return type:
DataFrame
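To make the idea of clique counting concrete, here is a brute-force sketch that counts, for each node, how many cliques of a given size it belongs to. It is an illustration only and not the library's implementation; the function name, the undirected edge list and the per-node output shape are assumptions of this example.

```python
from itertools import combinations


def count_cliques_per_node(edges, k):
    """Count, for each node, the k-cliques it belongs to (brute force)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    counts = {node: 0 for node in adjacency}
    for group in combinations(sorted(adjacency), k):
        # A group is a clique when every pair inside it is connected.
        if all(v in adjacency[u] for u, v in combinations(group, 2)):
            for node in group:
                counts[node] += 1
    return counts


# Hypothetical toy graph: a 4-clique {0, 1, 2, 3} plus a pendant node 4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
triangles = count_cliques_per_node(edges, 3)
print(triangles)  # {0: 3, 1: 3, 2: 3, 3: 3, 4: 0}
```

The brute-force enumeration grows combinatorially with the graph size, so it is only suitable for checking small examples by hand.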
- abstract write(G: GraphV2, write_property: str, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) CliqueCountingWriteResult¶
Executes the clique counting algorithm and writes the results back to the database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Algorithm metrics and statistics
- Return type:
CliqueCountingWriteResult
- pydantic model graphdatascience.procedure_surface.api.community.CliqueCountingMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.CliqueCountingStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.CliqueCountingWriteResult¶
- class graphdatascience.procedure_surface.api.community.ConductanceEndpoints¶
- abstract stream(G: GraphV2, community_property: str, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, sudo: bool = False, username: str | None = None) DataFrame¶
Executes the Conductance algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
community_property (str) – Name of the node property containing community assignments.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
DataFrame with the algorithm results containing ‘community’ and ‘conductance’ columns
- Return type:
DataFrame
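As a rough illustration of what a per-community conductance value means, the sketch below computes the share of relationship weight that leaves each community. It assumes one common definition (external weight divided by internal plus external weight, with each relationship attributed to the community of its source node); the exact formula used by the library may differ. All names are hypothetical, and the rows only mirror the ‘community’ and ‘conductance’ columns mentioned above.

```python
def conductance_by_community(edges, community_of):
    """Compute external / (internal + external) weight per community.

    `edges` holds (source, target, weight) triples. Each relationship is
    attributed to the community of its source node (an assumption of
    this sketch).
    """
    internal = {}
    external = {}
    for source, target, weight in edges:
        community = community_of[source]
        if community_of[target] == community:
            internal[community] = internal.get(community, 0.0) + weight
        else:
            external[community] = external.get(community, 0.0) + weight

    rows = []
    for community in sorted(set(internal) | set(external)):
        inside = internal.get(community, 0.0)
        outside = external.get(community, 0.0)
        total = inside + outside
        if total > 0:  # skip communities whose relationships all weigh zero
            rows.append({"community": community, "conductance": outside / total})
    return rows


# Hypothetical toy input: two communities joined by one pair of relationships.
community_of = {0: 0, 1: 0, 2: 1, 3: 1}
edges = [
    (0, 1, 1.0), (1, 0, 1.0),  # inside community 0
    (2, 3, 3.0), (3, 2, 3.0),  # inside community 1
    (1, 2, 2.0), (2, 1, 2.0),  # crossing between the communities
]
rows = conductance_by_community(edges, community_of)
print(rows)
```

A lower value indicates that less of the community's relationship weight points outside it.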
- class graphdatascience.procedure_surface.api.community.HdbscanEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], node_property: str, *, leaf_size: int = 1, samples: int = 10, min_cluster_size: int = 5, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult¶
Estimates memory requirements and other statistics for the HDBSCAN algorithm.
This method provides memory estimation for the HDBSCAN algorithm without actually executing the clustering. It helps determine the computational requirements before running the actual clustering procedure.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
node_property (str) – The node property to use for clustering (required)
leaf_size (int, default=1) – The maximum leaf size of the tree structure used in the algorithm
samples (int, default=10) – The number of samples used for density estimation
min_cluster_size (int, default=5) – The minimum size of clusters
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
concurrency (int | None) – Number of concurrent threads to use.
- Returns:
The estimation result containing memory requirements and other statistics
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, node_property: str, mutate_property: str, *, leaf_size: int = 1, samples: int = 10, min_cluster_size: int = 5, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, log_progress: bool = True, sudo: bool = False, job_id: str | None = None, username: str | None = None) HdbscanMutateResult¶
Runs the HDBSCAN algorithm and writes the cluster ID for each node back to the in-memory graph.
The algorithm performs hierarchical density-based clustering on a node property, identifying clusters based on density reachability.
- Parameters:
G (GraphV2) – Graph object to use
node_property (str) – The node property to use for clustering (required)
mutate_property (str) – Name of the node property to store the results in.
leaf_size (int, default=1) – The maximum leaf size of the tree structure used in the algorithm
samples (int, default=10) – The number of samples used for density estimation
min_cluster_size (int, default=5) – The minimum size of clusters
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
concurrency (int | None) – Number of concurrent threads to use.
log_progress (bool) – Display progress logging.
sudo (bool) – Disable the memory guard.
job_id (str | None) – Identifier for the computation.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
The result containing statistics about the clustering and algorithm execution
- Return type:
HdbscanMutateResult
- abstract stats(G: GraphV2, node_property: str, *, leaf_size: int = 1, samples: int = 10, min_cluster_size: int = 5, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, log_progress: bool = True, sudo: bool = False, job_id: str | None = None, username: str | None = None) HdbscanStatsResult¶
Runs the HDBSCAN algorithm and returns only statistics about the clustering.
This mode computes cluster assignments without writing them back to the graph, returning only execution statistics and cluster information.
- Parameters:
G (GraphV2) – Graph object to use
node_property (str) – The node property to use for clustering (required)
leaf_size (int, default=1) – The maximum leaf size of the tree structure used in the algorithm
samples (int, default=10) – The number of samples used for density estimation
min_cluster_size (int, default=5) – The minimum size of clusters
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
concurrency (int | None) – Number of concurrent threads to use.
log_progress (bool) – Display progress logging.
sudo (bool) – Disable the memory guard.
job_id (str | None) – Identifier for the computation.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
The result containing statistics about the clustering and algorithm execution
- Return type:
HdbscanStatsResult
- abstract stream(G: GraphV2, node_property: str, *, leaf_size: int = 1, samples: int = 10, min_cluster_size: int = 5, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, log_progress: bool = True, sudo: bool = False, job_id: str | None = None, username: str | None = None) DataFrame¶
Runs the HDBSCAN algorithm and returns the cluster ID for each node as a DataFrame.
The DataFrame contains the cluster assignment for each node, with noise points typically assigned to cluster -1.
- Parameters:
G (GraphV2) – Graph object to use
node_property (str) – The node property to use for clustering (required)
leaf_size (int, default=1) – The maximum leaf size of the tree structure used in the algorithm
samples (int, default=10) – The number of samples used for density estimation
min_cluster_size (int, default=5) – The minimum size of clusters
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
concurrency (int | None) – Number of concurrent threads to use.
log_progress (bool) – Display progress logging.
sudo (bool) – Disable the memory guard.
job_id (str | None) – Identifier for the computation.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
A DataFrame with columns ‘nodeId’ and ‘label’
- Return type:
DataFrame
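A typical follow-up step is to separate noise points from real clusters. The sketch below works on plain rows with the ‘nodeId’ and ‘label’ keys described above, such as those produced by calling `to_dict("records")` on a pandas DataFrame. The helper name and the sample rows are hypothetical, and the noise label -1 is taken from the description above, which says "typically", so treat it as an assumption.

```python
from collections import defaultdict

NOISE_LABEL = -1  # assumption: noise points carry label -1


def group_clusters(rows):
    """Split rows with 'nodeId' and 'label' keys into clusters and noise."""
    clusters = defaultdict(list)
    noise = []
    for row in rows:
        if row["label"] == NOISE_LABEL:
            noise.append(row["nodeId"])
        else:
            clusters[row["label"]].append(row["nodeId"])
    return dict(clusters), noise


# Hypothetical rows standing in for the streamed result.
rows = [
    {"nodeId": 0, "label": 2},
    {"nodeId": 1, "label": 2},
    {"nodeId": 2, "label": -1},
    {"nodeId": 3, "label": 5},
    {"nodeId": 4, "label": 2},
]

clusters, noise = group_clusters(rows)
print(clusters)  # {2: [0, 1, 4], 5: [3]}
print(noise)     # [2]
```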
- abstract write(G: GraphV2, node_property: str, write_property: str, *, leaf_size: int = 1, samples: int = 10, min_cluster_size: int = 5, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, log_progress: bool = True, sudo: bool = False, job_id: str | None = None, username: str | None = None, write_concurrency: int | None = None) HdbscanWriteResult¶
Runs the HDBSCAN algorithm and writes the cluster ID for each node back to the database.
- Parameters:
G (GraphV2) – Graph object to use
node_property (str) – The node property to use for clustering (required)
write_property (str) – Name of the node property to store the results in.
leaf_size (int, default=1) – The maximum leaf size of the tree structure used in the algorithm
samples (int, default=10) – The number of samples used for density estimation
min_cluster_size (int, default=5) – The minimum size of clusters
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
concurrency (int | None) – Number of concurrent threads to use.
log_progress (bool) – Display progress logging.
sudo (bool) – Disable the memory guard.
job_id (str | None) – Identifier for the computation.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
The result containing statistics about the clustering and algorithm execution
- Return type:
HdbscanWriteResult
- pydantic model graphdatascience.procedure_surface.api.community.HdbscanMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.HdbscanStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.HdbscanWriteResult¶
- class graphdatascience.procedure_surface.api.community.K1ColoringEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], *, batch_size: int = 10000, concurrency: int | None = None, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*']) EstimationResult¶
Estimate the memory consumption of an algorithm run.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
batch_size (int) – Number of nodes to process in each batch.
concurrency (int | None) – Number of concurrent threads to use.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
- Returns:
Memory estimation details
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, mutate_property: str, *, batch_size: int = 10000, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) K1ColoringMutateResult¶
Runs the K-1 Coloring algorithm and stores the results in the graph catalog as a new node property.
The K-1 Coloring algorithm assigns a color to every node in the graph, trying to optimize for two objectives: to make sure that every neighbor of a given node has a different color than the node itself, and to use as few colors as possible.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
batch_size (int) – Number of nodes to process in each batch.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None, default=None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int, default=10) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
K1ColoringMutateResult
- abstract stats(G: GraphV2, *, batch_size: int = 10000, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) K1ColoringStatsResult¶
Executes the K-1 Coloring algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
batch_size (int) – Number of nodes to process in each batch.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
K1ColoringStatsResult
- abstract stream(G: GraphV2, *, batch_size: int = 10000, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) DataFrame¶
Executes the K-1 Coloring algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
batch_size (int) – Number of nodes to process in each batch.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
min_community_size (int | None) – Minimum size for communities to be included in results.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
DataFrame with the algorithm results
- Return type:
DataFrame
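The two objectives of K-1 Coloring (neighboring nodes get different colors, and as few colors as possible are used) can be illustrated with a simple sequential greedy coloring. This is a swapped-in textbook technique shown for intuition, not the algorithm the library runs. The function names are hypothetical, and the sketch assumes an undirected graph without self-loops.

```python
def greedy_coloring(edges):
    """Assign each node the smallest color not used by a colored neighbor."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    colors = {}
    for node in sorted(adjacency):
        taken = {colors[n] for n in adjacency[node] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[node] = color
    return colors


def is_proper(edges, colors):
    """Check the first objective: no relationship joins two equal colors."""
    return all(colors[a] != colors[b] for a, b in edges)


# Hypothetical toy graph: a triangle {0, 1, 2} with node 3 attached to node 2.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
colors = greedy_coloring(edges)
print(colors)                    # {0: 0, 1: 1, 2: 2, 3: 0}
print(is_proper(edges, colors))  # True
```

The triangle forces three colors here, so the greedy result is also minimal for this graph; in general a greedy pass gives a valid coloring but not necessarily the smallest one.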
- abstract write(G: GraphV2, write_property: str, *, batch_size: int = 10000, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) K1ColoringWriteResult¶
Executes the K-1 Coloring algorithm and writes the results to the Neo4j database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
batch_size (int) – Number of nodes to process in each batch.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
min_community_size (int | None) – Minimum size for communities to be included in results.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Algorithm metrics and statistics
- Return type:
K1ColoringWriteResult
- pydantic model graphdatascience.procedure_surface.api.community.K1ColoringMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.K1ColoringStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.K1ColoringWriteResult¶
- class graphdatascience.procedure_surface.api.community.KCoreEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*']) EstimationResult¶
Estimate the memory consumption of an algorithm run.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
concurrency (int | None) – Number of concurrent threads to use.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
- Returns:
Memory estimation details
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) KCoreMutateResult¶
Runs the K-Core Decomposition algorithm and stores the results in the graph catalog as a new node property.
The K-core decomposition constitutes a process that separates the nodes in a graph into groups based on the degree sequence and topology of the graph. The term i-core refers to a maximal subgraph of the original graph such that each node in this subgraph has degree at least i. Each node is associated with a core value which denotes the largest value i such that the node belongs to the i-core.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None, default=None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
KCoreMutateResult
- abstract stats(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) KCoreStatsResult¶
Executes the K-Core algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
KCoreStatsResult
- abstract stream(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) DataFrame¶
Executes the K-Core algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
- Returns:
DataFrame with the algorithm results containing nodeId and coreValue
- Return type:
DataFrame
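The core value of a node, as defined in the K-Core Decomposition description, can be computed by repeatedly peeling off the node of smallest remaining degree. The sketch below is a minimal, unoptimized version of that standard peeling procedure for an undirected graph. It is not the library's implementation; the function name is hypothetical, and the rows only mirror the nodeId and coreValue columns mentioned above.

```python
def core_values(edges):
    """Return one row per node with its core value (peeling procedure)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree (ties by node id).
        node = min(remaining, key=lambda n: (degree[n], n))
        # The core value never decreases while peeling.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbor in adjacency[node]:
            if neighbor in remaining:
                degree[neighbor] -= 1
    return [{"nodeId": n, "coreValue": core[n]} for n in sorted(core)]


# Hypothetical toy graph: a triangle {0, 1, 2} with a tail 2 - 3 - 4.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]
for row in core_values(edges):
    print(row)
```

In this example the triangle nodes end up in the 2-core, while the tail nodes 3 and 4 only belong to the 1-core.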
- abstract write(G: GraphV2, write_property: str, *, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) KCoreWriteResult¶
Executes the K-Core algorithm and writes the results to the Neo4j database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Algorithm metrics and statistics
- Return type:
KCoreWriteResult
- pydantic model graphdatascience.procedure_surface.api.community.KCoreMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.KCoreStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.KCoreWriteResult¶
- class graphdatascience.procedure_surface.api.community.KMeansEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], node_property: str, *, compute_silhouette: bool = False, concurrency: int | None = None, delta_threshold: float = 0.05, initial_sampler: str = 'UNIFORM', k: int = 10, max_iterations: int = 10, node_labels: list[str] = ['*'], number_of_restarts: int = 1, random_seed: int | None = None, relationship_types: list[str] = ['*'], seed_centroids: list[list[float]] | None = None) EstimationResult¶
Estimates the memory requirements for running the K-Means algorithm.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
node_property (str) – The node property to use for clustering
compute_silhouette (bool, default=False) – Whether to compute silhouette coefficient
concurrency (int | None) – Number of concurrent threads to use.
delta_threshold (float) – Minimum change between iterations.
initial_sampler (str) – Sampling method for initial centroids.
k (int) – Number of clusters to form.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
number_of_restarts (int, default=1) – The number of times the algorithm should be restarted with different initial centers
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
seed_centroids (list[list[float]] | None, default=None) – Initial centroids for the algorithm
- Returns:
The memory estimation result
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, node_property: str, mutate_property: str, *, compute_silhouette: bool = False, concurrency: int | None = None, delta_threshold: float = 0.05, initial_sampler: str = 'UNIFORM', job_id: str | None = None, k: int = 10, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], number_of_restarts: int = 1, random_seed: int | None = None, relationship_types: list[str] = ['*'], seed_centroids: list[list[float]] | None = None, sudo: bool = False, username: str | None = None) KMeansMutateResult¶
Executes the K-Means algorithm and writes the results to the in-memory graph as node properties.
- Parameters:
G (GraphV2) – Graph object to use
node_property (str) – The node property to use for clustering
mutate_property (str) – Name of the node property to store the results in.
compute_silhouette (bool, default=False) – Whether to compute silhouette coefficient
concurrency (int | None) – Number of concurrent threads to use.
delta_threshold (float) – Minimum change between iterations.
initial_sampler (str) – Sampling method for initial centroids.
job_id (str | None) – Identifier for the computation.
k (int) – Number of clusters to form.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
number_of_restarts (int | None, default=1) – The number of times the algorithm should be restarted with different initial centers
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
seed_centroids (list[list[float]] | None, default=None) – Initial centroids for the algorithm.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
KMeansMutateResult
- abstract stats(G: GraphV2, node_property: str, *, compute_silhouette: bool = False, concurrency: int | None = None, delta_threshold: float = 0.05, initial_sampler: str = 'UNIFORM', job_id: str | None = None, k: int = 10, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], number_of_restarts: int = 1, random_seed: int | None = None, relationship_types: list[str] = ['*'], seed_centroids: list[list[float]] | None = None, sudo: bool = False, username: str | None = None) KMeansStatsResult¶
Executes the K-Means algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
node_property (str) – The node property to use for clustering
compute_silhouette (bool, default=False) – Whether to compute the silhouette coefficient.
concurrency (int | None) – Number of concurrent threads to use.
delta_threshold (float) – Minimum change between iterations.
initial_sampler (str) – Sampling method for initial centroids.
job_id (str | None) – Identifier for the computation.
k (int) – Number of clusters to form.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
number_of_restarts (int, default=1) – The number of times the algorithm should be restarted with different initial centers.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
seed_centroids (list[list[float]] | None, default=None) – Initial centroids for the algorithm.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
KMeansStatsResult
- abstract stream(G: GraphV2, node_property: str, *, compute_silhouette: bool = False, concurrency: int | None = None, delta_threshold: float = 0.05, initial_sampler: str = 'UNIFORM', job_id: str | None = None, k: int = 10, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], number_of_restarts: int = 1, random_seed: int | None = None, relationship_types: list[str] = ['*'], seed_centroids: list[list[float]] | None = None, sudo: bool = False, username: str | None = None) DataFrame¶
Executes the K-Means algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
node_property (str) – The node property to use for clustering
compute_silhouette (bool, default=False) – Whether to compute the silhouette coefficient.
concurrency (int | None) – Number of concurrent threads to use.
delta_threshold (float) – Minimum change between iterations.
initial_sampler (str) – Sampling method for initial centroids.
job_id (str | None) – Identifier for the computation.
k (int) – Number of clusters to form.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
number_of_restarts (int, default=1) – The number of times the algorithm should be restarted with different initial centers.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
seed_centroids (list[list[float]] | None, default=None) – Initial centroids for the algorithm.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
DataFrame with the algorithm results containing nodeId, communityId, distanceFromCentroid, and silhouette
- Return type:
DataFrame
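The K-Means parameters documented above map onto a Lloyd-style iteration. The following is a minimal sketch in plain Python, not the library's implementation (the algorithm itself runs on the server). The function name `kmeans` is hypothetical, `k` is implied by the number of seed centroids, and reading `delta_threshold` as the share of points that may still change cluster before the loop stops is an assumption.

```python
import math


def kmeans(points, seed_centroids, max_iterations=10, delta_threshold=0.05):
    """Lloyd-style K-Means on a list of equal-length float vectors.

    Returns (assignments, centroids, distances).
    """
    centroids = [list(c) for c in seed_centroids]
    assignments = [-1] * len(points)

    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        changed = sum(a != b for a, b in zip(assignments, new_assignments))
        assignments = new_assignments

        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]

        # Assumed stopping rule: stop once the share of reassigned points
        # is at or below delta_threshold.
        if changed / len(points) <= delta_threshold:
            break

    distances = [math.dist(p, centroids[a]) for p, a in zip(points, assignments)]
    return assignments, centroids, distances


points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
assignments, centroids, distances = kmeans(
    points, seed_centroids=[[0.0, 0.0], [10.0, 10.0]]
)
```

With these four sample points the loop stops in the second iteration because no point changes cluster. The per-point values in `assignments` and `distances` play the role of the communityId and distanceFromCentroid columns in the stream result.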
- abstract write(G: GraphV2, node_property: str, write_property: str, *, compute_silhouette: bool = False, concurrency: int | None = None, delta_threshold: float = 0.05, initial_sampler: str = 'UNIFORM', job_id: str | None = None, k: int = 10, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], number_of_restarts: int = 1, random_seed: int | None = None, relationship_types: list[str] = ['*'], seed_centroids: list[list[float]] | None = None, sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) KMeansWriteResult¶
Executes the K-Means algorithm and writes the results back to the database.
- Parameters:
G (GraphV2) – Graph object to use
node_property (str) – The node property to use for clustering
write_property (str) – Name of the node property to store the results in.
compute_silhouette (bool, default=False) – Whether to compute the silhouette coefficient.
concurrency (int | None) – Number of concurrent threads to use.
delta_threshold (float) – Minimum change between iterations.
initial_sampler (str) – Sampling method for initial centroids.
job_id (str | None) – Identifier for the computation.
k (int) – Number of clusters to form.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
number_of_restarts (int, default=1) – The number of times the algorithm should be restarted with different initial centers.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
seed_centroids (list[list[float]] | None, default=None) – Initial centroids for the algorithm.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Algorithm metrics and statistics
- Return type:
KMeansWriteResult
- pydantic model graphdatascience.procedure_surface.api.community.KMeansMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.KMeansStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.KMeansWriteResult¶
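The `compute_silhouette` flag on the K-Means endpoints refers to the silhouette coefficient. As a rough illustration of the textbook definition, s = (b - a) / max(a, b), here is a small plain-Python sketch. The helper name `silhouette_scores` is hypothetical, and whether the server computes the value exactly this way (for example, how it treats single-member clusters) is an assumption.

```python
import math


def silhouette_scores(points, assignments):
    """Return the silhouette value s = (b - a) / max(a, b) for every point."""
    clusters = {}
    for point, label in zip(points, assignments):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, assignments):
        own = clusters[label]
        # Convention used here: single-member clusters and single-cluster
        # results get a score of 0.
        if len(own) == 1 or len(clusters) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so it drops out of the sum).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a, b)
        scores.append((b - a) / denominator if denominator > 0 else 0.0)
    return scores


points = [[0.0], [2.0], [10.0], [12.0]]
assignments = [0, 0, 1, 1]
scores = silhouette_scores(points, assignments)
```

Values close to 1 indicate points that sit well inside their cluster; values near 0 or below suggest that `k` or the initial centroids may be worth revisiting.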
- class graphdatascience.procedure_surface.api.community.LabelPropagationEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, consecutive_ids: bool = False, max_iterations: int = 10, node_labels: list[str] = ['*'], node_weight_property: str | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None) EstimationResult¶
Estimates the memory requirements for running the Label Propagation algorithm.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
node_weight_property (str | None, default=None) – The property name for node weights
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
- Returns:
The memory estimation result
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], node_weight_property: str | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, username: str | None = None) LabelPropagationMutateResult¶
Executes the Label Propagation algorithm and writes the results to the in-memory graph as node properties.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
node_weight_property (str | None, default=None) – The property name for node weights
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
LabelPropagationMutateResult
- abstract stats(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], node_weight_property: str | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, username: str | None = None) LabelPropagationStatsResult¶
Executes the Label Propagation algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
node_weight_property (str | None, default=None) – The property name for node weights
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
LabelPropagationStatsResult
- abstract stream(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], node_weight_property: str | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, username: str | None = None) DataFrame¶
Executes the Label Propagation algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
min_community_size (int | None) – Minimum size for communities to be included in results.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
node_weight_property (str | None, default=None) – The property name for node weights
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
DataFrame with the algorithm results containing nodeId and communityId
- Return type:
DataFrame
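To make the role of `max_iterations` concrete, the sketch below runs a simple label propagation loop in plain Python. It is only an illustration of the general technique, not the library's implementation: the function name `label_propagation` is hypothetical, weights and seed labels are left out, and breaking ties on the smallest label is an assumption made to keep the result deterministic.

```python
from collections import Counter


def label_propagation(adjacency, max_iterations=10):
    """adjacency maps each node ID to a list of neighbor IDs (undirected)."""
    labels = {node: node for node in adjacency}  # every node starts alone
    for _ in range(max_iterations):
        changed = False
        for node in sorted(adjacency):
            if not adjacency[node]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[n] for n in adjacency[node])
            best = max(counts.values())
            # Tie-break on the smallest label to keep the sketch deterministic.
            new_label = min(l for l, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break  # converged before reaching max_iterations
    return labels


# Two disjoint triangles: {0, 1, 2} and {3, 4, 5}.
adjacency = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
}
labels = label_propagation(adjacency)
```

Each triangle ends up sharing a single label. The (node, label) pairs correspond to the nodeId and communityId columns described for the stream result.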
- abstract write(G: GraphV2, write_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], node_weight_property: str | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) LabelPropagationWriteResult¶
Executes the Label Propagation algorithm and writes the results back to the database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
min_community_size (int | None) – Minimum size for communities to be included in results.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
node_weight_property (str | None, default=None) – The property name for node weights
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Algorithm metrics and statistics
- Return type:
LabelPropagationWriteResult
- pydantic model graphdatascience.procedure_surface.api.community.LabelPropagationMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.LabelPropagationStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.LabelPropagationWriteResult¶
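The `consecutive_ids` and `min_community_size` options both shape the community IDs that come back (the latter is only accepted by stream and write). The plain-Python sketch below shows one way to picture their effect on a list of (nodeId, communityId) rows. The helper `postprocess` is hypothetical, and the order of the two steps, as well as numbering communities by first appearance, are assumptions rather than documented behavior.

```python
from collections import Counter


def postprocess(rows, consecutive_ids=False, min_community_size=None):
    """rows is a list of (nodeId, communityId) tuples."""
    if min_community_size is not None:
        # Drop rows whose community has fewer members than the threshold.
        sizes = Counter(community for _, community in rows)
        rows = [(n, c) for n, c in rows if sizes[c] >= min_community_size]
    if consecutive_ids:
        # Renumber communities 0, 1, 2, ... in order of first appearance.
        mapping = {}
        rows = [(n, mapping.setdefault(c, len(mapping))) for n, c in rows]
    return rows


rows = [(0, 42), (1, 42), (2, 7), (3, 99), (4, 99), (5, 99)]
filtered = postprocess(rows, consecutive_ids=True, min_community_size=2)
```

In this example node 2 is dropped because its community has a single member, and the remaining communities 42 and 99 are renumbered to 0 and 1.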
- class graphdatascience.procedure_surface.api.community.LeidenEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, consecutive_ids: bool = False, gamma: float = 1.0, include_intermediate_communities: bool = False, max_levels: int = 10, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, theta: float = 0.01, tolerance: float = 0.0001) EstimationResult¶
Estimate the memory requirements for running the Leiden algorithm.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
gamma (float, default=1.0) – The gamma parameter for the Leiden algorithm
include_intermediate_communities (bool, default=False) – Whether to include intermediate communities
max_levels (int, default=10) – The maximum number of levels
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
theta (float, default=0.01) – The theta parameter for the Leiden algorithm
tolerance (float) – Minimum change in scores between iterations.
- Returns:
The memory estimation result
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, gamma: float = 1.0, include_intermediate_communities: bool = False, job_id: str | None = None, log_progress: bool = True, max_levels: int = 10, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, theta: float = 0.01, tolerance: float = 0.0001, username: str | None = None) LeidenMutateResult¶
Executes the Leiden community detection algorithm and writes the results to the in-memory graph as node properties.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
gamma (float, default=1.0) – The gamma parameter for the Leiden algorithm
include_intermediate_communities (bool, default=False) – Whether to include intermediate communities
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_levels (int, default=10) – The maximum number of levels
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
theta (float, default=0.01) – The theta parameter for the Leiden algorithm
tolerance (float) – Minimum change in scores between iterations.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
LeidenMutateResult
- abstract stats(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, gamma: float = 1.0, include_intermediate_communities: bool = False, job_id: str | None = None, log_progress: bool = True, max_levels: int = 10, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, theta: float = 0.01, tolerance: float = 0.0001, username: str | None = None) LeidenStatsResult¶
Executes the Leiden community detection algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
gamma (float, default=1.0) – The gamma parameter for the Leiden algorithm
include_intermediate_communities (bool, default=False) – Whether to include intermediate communities
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_levels (int, default=10) – The maximum number of levels
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
theta (float, default=0.01) – The theta parameter for the Leiden algorithm
tolerance (float) – Minimum change in scores between iterations.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
LeidenStatsResult
- abstract stream(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, gamma: float = 1.0, include_intermediate_communities: bool = False, job_id: str | None = None, log_progress: bool = True, max_levels: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, theta: float = 0.01, tolerance: float = 0.0001, username: str | None = None) DataFrame¶
Executes the Leiden community detection algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
gamma (float, default=1.0) – The gamma parameter for the Leiden algorithm
include_intermediate_communities (bool, default=False) – Whether to include intermediate communities
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_levels (int, default=10) – The maximum number of levels
min_community_size (int | None) – Minimum size for communities to be included in results.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
theta (float, default=0.01) – The theta parameter for the Leiden algorithm
tolerance (float) – Minimum change in scores between iterations.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
A DataFrame with columns: nodeId, communityId, intermediateCommunityIds
- Return type:
DataFrame
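Leiden optimizes a modularity-style quality score. Assuming `gamma` plays the role of the resolution parameter, as it commonly does in descriptions of the algorithm, the plain-Python sketch below evaluates that score for a fixed partition of an unweighted, undirected graph. The function name `modularity` is hypothetical, and the server may scale `gamma` differently.

```python
def modularity(edges, community_of, gamma=1.0):
    """Modularity of an undirected, unweighted graph given as (u, v) pairs."""
    m = len(edges)
    internal = {}  # community -> number of edges with both ends inside it
    degree = {}    # community -> summed degree of its nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    # Q = sum over communities of  L_c / m  -  gamma * (d_c / 2m)^2
    return sum(
        internal.get(c, 0) / m - gamma * (d / (2 * m)) ** 2
        for c, d in degree.items()
    )


# Two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
q_default = modularity(edges, community_of)
q_high_gamma = modularity(edges, community_of, gamma=2.0)
```

Raising `gamma` increases the penalty term, so the same two-community partition scores lower; under this formulation, larger values favor more, smaller communities.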
- abstract write(G: GraphV2, write_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, gamma: float = 1.0, include_intermediate_communities: bool = False, job_id: str | None = None, log_progress: bool = True, max_levels: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, theta: float = 0.01, tolerance: float = 0.0001, username: str | None = None, write_concurrency: int | None = None) LeidenWriteResult¶
Executes the Leiden community detection algorithm and writes the results back to the database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
gamma (float, default=1.0) – The gamma parameter for the Leiden algorithm
include_intermediate_communities (bool, default=False) – Whether to include intermediate communities
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_levels (int, default=10) – The maximum number of levels
min_community_size (int | None) – Minimum size for communities to be included in results.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
theta (float, default=0.01) – The theta parameter for the Leiden algorithm
tolerance (float) – Minimum change in scores between iterations.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Algorithm metrics and statistics
- Return type:
LeidenWriteResult
- pydantic model graphdatascience.procedure_surface.api.community.LeidenMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.LeidenStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.LeidenWriteResult¶
- class graphdatascience.procedure_surface.api.community.LocalClusteringCoefficientEndpoints¶
Interface for LocalClusteringCoefficient algorithm endpoints.
- abstract estimate(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool | None = False, triangle_count_property: str | None = None, username: str | None = None) EstimationResult¶
Estimates the LocalClusteringCoefficient algorithm memory requirements.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool | None) – Disable the memory guard.
triangle_count_property (str | None, default=None) – Property name for pre-computed triangle counts
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Memory estimation details
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, *, mutate_property: str, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, triangle_count_property: str | None = None, username: str | None = None) LocalClusteringCoefficientMutateResult¶
Executes the LocalClusteringCoefficient algorithm and writes results back to the graph.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
triangle_count_property (str | None, default=None) – Property name for pre-computed triangle counts
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Result containing clustering coefficient statistics and timing information
- Return type:
LocalClusteringCoefficientMutateResult
- abstract stats(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, triangle_count_property: str | None = None, username: str | None = None) LocalClusteringCoefficientStatsResult¶
Executes the LocalClusteringCoefficient algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
triangle_count_property (str | None, default=None) – Property name for pre-computed triangle counts
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Result containing clustering coefficient statistics and timing information
- Return type:
LocalClusteringCoefficientStatsResult
- abstract stream(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, triangle_count_property: str | None = None, username: str | None = None) DataFrame¶
Executes the LocalClusteringCoefficient algorithm and streams results.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
triangle_count_property (str | None, default=None) – Property name for pre-computed triangle counts
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
DataFrame containing nodeId and localClusteringCoefficient columns
- Return type:
DataFrame
- abstract write(G: GraphV2, *, write_property: str, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, triangle_count_property: str | None = None, username: str | None = None, write_concurrency: int | None = None) LocalClusteringCoefficientWriteResult¶
Executes the LocalClusteringCoefficient algorithm and writes results to the database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
triangle_count_property (str | None, default=None) – Property name for pre-computed triangle counts
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Result containing clustering coefficient statistics and timing information
- Return type:
LocalClusteringCoefficientWriteResult
- pydantic model graphdatascience.procedure_surface.api.community.LocalClusteringCoefficientMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.LocalClusteringCoefficientStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.LocalClusteringCoefficientWriteResult¶
- class graphdatascience.procedure_surface.api.community.LouvainEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], tolerance: float = 0.0001, max_levels: int = 10, include_intermediate_communities: bool = False, max_iterations: int = 10, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) EstimationResult¶
Estimate the memory consumption of a Louvain algorithm run.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
tolerance (float) – Minimum change in scores between iterations.
max_levels (int, default=10) – The maximum number of levels in the hierarchy
include_intermediate_communities (bool, default=False) – Whether to include intermediate community assignments
max_iterations (int) – Maximum number of iterations to run per level.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
concurrency (int | None) – Number of concurrent threads to use.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
consecutive_ids (bool) – Use consecutive IDs for the components.
relationship_weight_property (str | None) – Name of the property to be used as weights.
- Returns:
An object containing the result of the estimation
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, mutate_property: str, tolerance: float = 0.0001, max_levels: int = 10, include_intermediate_communities: bool = False, max_iterations: int = 10, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) LouvainMutateResult¶
Runs the Louvain algorithm and stores the results in the graph catalog as a new node property.
The Louvain method is an algorithm to detect communities in large networks. It maximizes a modularity score for each community, where the modularity quantifies the quality of an assignment of nodes to communities by evaluating how much more densely connected the nodes within a community are, compared to how connected they would be in a random network. The Louvain algorithm is a hierarchical clustering algorithm that recursively merges each community into a single node and runs the modularity clustering on the condensed graphs.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
tolerance (float) – Minimum change in scores between iterations.
max_levels (int, default=10) – The maximum number of levels in the hierarchy
include_intermediate_communities (bool, default=False) – Whether to include intermediate communities
max_iterations (int) – Maximum number of iterations to run per level.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None, default=None) – Identifier for the computation.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
consecutive_ids (bool) – Use consecutive IDs for the components.
relationship_weight_property (str | None) – Name of the property to be used as weights.
- Returns:
Algorithm metrics and statistics
- Return type:
LouvainMutateResult
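
The condensing step in the Louvain description (each community becomes a single node of a smaller graph) can be sketched in a few lines. This is an illustration under assumed inputs, not the library's code: the `condense` name, the weighted edge-list format and the `community_of` mapping are all hypothetical.

```python
from collections import defaultdict


def condense(edges, community_of):
    """Merge each community into one super-node and sum the weights between them.

    edges: iterable of (source, target, weight) for an undirected graph.
    community_of: dict mapping node id -> community id.
    Returns {(c1, c2): weight} with c1 <= c2; a (c, c) entry is a self-loop
    holding the total weight internal to community c.
    """
    condensed = defaultdict(float)
    for source, target, weight in edges:
        c1, c2 = community_of[source], community_of[target]
        key = (c1, c2) if c1 <= c2 else (c2, c1)
        condensed[key] += weight
    return dict(condensed)


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
two_triangles = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
level_one = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
condensed_graph = condense(two_triangles, level_one)
```

Running the modularity step again on `condensed_graph` would produce the next level of the hierarchy, which is roughly what max_levels bounds.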
- abstract stats(G: GraphV2, tolerance: float = 0.0001, max_levels: int = 10, include_intermediate_communities: bool = False, max_iterations: int = 10, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) LouvainStatsResult¶
Executes the Louvain algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
tolerance (float) – Minimum change in scores between iterations.
max_levels (int, default=10) – The maximum number of levels in the hierarchy
include_intermediate_communities (bool, default=False) – Whether to include intermediate community assignments
max_iterations (int) – Maximum number of iterations to run per level.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
consecutive_ids (bool) – Use consecutive IDs for the components.
relationship_weight_property (str | None) – Name of the property to be used as weights.
- Returns:
Algorithm metrics and statistics
- Return type:
LouvainStatsResult
- abstract stream(G: GraphV2, tolerance: float = 0.0001, max_levels: int = 10, include_intermediate_communities: bool = False, max_iterations: int = 10, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None, min_community_size: int | None = None) DataFrame¶
Executes the Louvain algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
tolerance (float) – Minimum change in scores between iterations.
max_levels (int, default=10) – The maximum number of levels in the hierarchy
include_intermediate_communities (bool, default=False) – Whether to include intermediate community assignments
max_iterations (int) – Maximum number of iterations to run per level.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
consecutive_ids (bool) – Use consecutive IDs for the components.
relationship_weight_property (str | None) – Name of the property to be used as weights.
min_community_size (int | None) – Minimum size for communities to be included in results.
- Returns:
DataFrame with the algorithm results
- Return type:
DataFrame
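
The effect of min_community_size on streamed results can be pictured as a filter over (node, community) rows. The sketch below uses plain tuples in place of DataFrame rows; the function name and the exact behavior for a `None` value (no filtering) are assumptions, not a statement about how the library applies the option.

```python
from collections import Counter


def filter_small_communities(rows, min_community_size):
    """Keep only rows whose community has at least min_community_size members.

    rows: list of (node_id, community_id) tuples standing in for DataFrame rows.
    """
    if min_community_size is None:
        return list(rows)  # assumed: no threshold means nothing is dropped
    sizes = Counter(community_id for _, community_id in rows)
    return [(node, comm) for node, comm in rows if sizes[comm] >= min_community_size]


streamed = [(0, 10), (1, 10), (2, 10), (3, 20)]
kept = filter_small_communities(streamed, min_community_size=2)
```

Here community 20 has a single member, so its row is dropped while the three rows of community 10 remain.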
- abstract write(G: GraphV2, write_property: str, tolerance: float = 0.0001, max_levels: int = 10, include_intermediate_communities: bool = False, max_iterations: int = 10, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None, write_concurrency: int | None = None, min_community_size: int | None = None) LouvainWriteResult¶
Executes the Louvain algorithm and writes the results to the Neo4j database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
tolerance (float) – Minimum change in scores between iterations.
max_levels (int, default=10) – The maximum number of levels in the hierarchy
include_intermediate_communities (bool, default=False) – Whether to include intermediate community assignments
max_iterations (int) – Maximum number of iterations to run per level.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
consecutive_ids (bool) – Use consecutive IDs for the components.
relationship_weight_property (str | None) – Name of the property to be used as weights.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
min_community_size (int | None) – Minimum size for communities to be included in results.
- Returns:
Algorithm metrics and statistics
- Return type:
LouvainWriteResult
- pydantic model graphdatascience.procedure_surface.api.community.LouvainMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.LouvainStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.LouvainWriteResult¶
- class graphdatascience.procedure_surface.api.community.MaxKCutEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, iterations: int = 8, k: int = 2, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, random_seed: int | None = None, vns_max_neighborhood_order: int = 0) EstimationResult¶
Estimate the memory requirements for running the Approximate Maximum k-cut algorithm.
This method provides memory estimates without actually running the algorithm, helping you determine if you have sufficient memory available.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
concurrency (int | None) – Number of concurrent threads to use.
iterations (int) – Number of iterations to run.
k (int) – Number of communities to detect.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
vns_max_neighborhood_order (int, default=0) – The maximum neighborhood order for the Variable Neighborhood Search. Higher values may lead to better results but increase computation time.
- Returns:
The memory estimation result including required memory in bytes and as heap percentage
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, iterations: int = 8, job_id: str | None = None, k: int = 2, log_progress: bool = True, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, sudo: bool = False, username: str | None = None, vns_max_neighborhood_order: int = 0) MaxKCutMutateResult¶
Executes the Approximate Maximum k-cut algorithm and writes the results to the in-memory graph as node properties.
The Approximate Maximum k-cut algorithm is a community detection algorithm that partitions a graph into k communities such that the sum of weights of edges between different communities is maximized. It uses a variable neighborhood search (VNS) approach to find high-quality cuts.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
iterations (int) – Number of iterations to run.
job_id (str | None) – Identifier for the computation.
k (int) – Number of communities to detect.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
vns_max_neighborhood_order (int, default=0) – The maximum neighborhood order for the Variable Neighborhood Search. Higher values may lead to better results but increase computation time.
- Returns:
Algorithm metrics and statistics including the cut cost and processing times
- Return type:
MaxKCutMutateResult
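
The objective described above, the sum of weights of edges between different communities, is easy to state in code. The sketch below computes that cut cost and pairs it with a deliberately naive random search to show how `k`, `iterations` and `random_seed` could interact. It is not the variable neighborhood search the documentation mentions, and all function names and input formats are assumptions.

```python
import random


def cut_cost(edges, community_of):
    """Sum the weights of edges whose endpoints fall in different communities."""
    return sum(w for s, t, w in edges if community_of[s] != community_of[t])


def best_random_cut(nodes, edges, k=2, iterations=8, random_seed=None):
    """Try `iterations` random k-way assignments and keep the costliest cut.

    This is a naive random search, NOT the variable neighborhood search (VNS)
    referred to in the documentation.
    """
    rng = random.Random(random_seed)  # a fixed seed makes the search reproducible
    best_assignment, best_cost = None, float("-inf")
    for _ in range(iterations):
        assignment = {node: rng.randrange(k) for node in nodes}
        cost = cut_cost(edges, assignment)
        if cost > best_cost:
            best_assignment, best_cost = assignment, cost
    return best_assignment, best_cost


# A 4-cycle with unit weights.
square = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
alternating = {0: 0, 1: 1, 2: 0, 3: 1}  # every edge crosses the cut
adjacent = {0: 0, 1: 0, 2: 1, 3: 1}     # only two edges cross the cut
```

On the 4-cycle, the alternating assignment cuts all four edges while the adjacent one cuts two, which is the difference the algorithm tries to maximize.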
- abstract stream(G: GraphV2, *, concurrency: int | None = None, iterations: int = 8, job_id: str | None = None, k: int = 2, log_progress: bool = True, min_community_size: int | None = None, node_labels: list[str] = ['*'], random_seed: int | None = None, relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, sudo: bool = False, username: str | None = None, vns_max_neighborhood_order: int = 0) DataFrame¶
Executes the Approximate Maximum k-cut algorithm and returns a stream of results.
The Approximate Maximum k-cut algorithm partitions a graph into k communities to maximize the cut cost. This method returns the community assignment for each node as a stream.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
iterations (int) – Number of iterations to run.
job_id (str | None) – Identifier for the computation.
k (int) – Number of communities to detect.
log_progress (bool) – Display progress logging.
min_community_size (int | None) – Minimum size for communities to be included in results.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
random_seed (int | None) – Seed for random number generation to ensure reproducible results.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
vns_max_neighborhood_order (int, default=0) – The maximum neighborhood order for the Variable Neighborhood Search. Higher values may lead to better results but increase computation time.
- Returns:
A DataFrame with two columns: nodeId (the node identifier) and communityId (the community assignment for the node)
- Return type:
DataFrame
- pydantic model graphdatascience.procedure_surface.api.community.MaxKCutMutateResult¶
- class graphdatascience.procedure_surface.api.community.ModularityEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], community_property: str, *, concurrency: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None) EstimationResult¶
Estimate the memory consumption of the modularity algorithm.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
community_property (str) – Name of the node property containing community assignments.
concurrency (int | None) – Number of concurrent threads to use.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
- Returns:
An object containing the result of the estimation
- Return type:
EstimationResult
- abstract stats(G: GraphV2, community_property: str, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, sudo: bool = False, username: str | None = None) ModularityStatsResult¶
Executes the Modularity algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
community_property (str) – Name of the node property containing community assignments.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm statistics including communityCount, modularity, nodeCount, and relationshipCount
- Return type:
ModularityStatsResult
- abstract stream(G: GraphV2, community_property: str, *, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, sudo: bool = False, username: str | None = None) DataFrame¶
Executes the Modularity algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
community_property (str) – Name of the node property containing community assignments.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
DataFrame with the algorithm results containing ‘communityId’ and ‘modularity’ columns
- Return type:
DataFrame
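
As a rough picture of what a per-community modularity value measures, the sketch below evaluates the standard undirected modularity term for each community from a community assignment, which is the role community_property plays above. Whether the library's ‘modularity’ column uses exactly this term is an assumption; the function name and input formats are hypothetical.

```python
from collections import defaultdict


def modularity_by_community(edges, community_of):
    """Return {community_id: modularity contribution} for an undirected graph.

    edges: iterable of (source, target, weight); community_of: node id -> community id.
    Uses Q_c = w_in(c) / m - (deg(c) / (2 * m)) ** 2, where m is the total edge weight.
    """
    total_weight = 0.0
    internal = defaultdict(float)  # weight of edges with both ends in the community
    degree = defaultdict(float)    # summed weighted degree of the community's nodes
    for source, target, weight in edges:
        total_weight += weight
        cs, ct = community_of[source], community_of[target]
        degree[cs] += weight
        degree[ct] += weight
        if cs == ct:
            internal[cs] += weight
    if total_weight == 0:
        return {c: 0.0 for c in set(community_of.values())}
    return {
        c: internal[c] / total_weight - (degree[c] / (2 * total_weight)) ** 2
        for c in degree
    }


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
graph_edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
assignment = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
per_community = modularity_by_community(graph_edges, assignment)
total_modularity = sum(per_community.values())
```

Summing the per-community terms gives the overall modularity of the assignment, here 5/14 for the two-triangle graph.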
- class graphdatascience.procedure_surface.api.community.ModularityOptimizationEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], *, batch_size: int = 10000, concurrency: int | None = None, consecutive_ids: bool = False, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, tolerance: float = 0.0001) EstimationResult¶
Estimates the memory consumption for running the Modularity Optimization algorithm.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
batch_size (int) – Number of nodes to process in each batch.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
tolerance (float) – Minimum change in scores between iterations.
- Returns:
Estimated memory consumption and other metrics
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, mutate_property: str, *, batch_size: int = 10000, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, tolerance: float = 0.0001, username: str | None = None) ModularityOptimizationMutateResult¶
Executes the Modularity Optimization algorithm and writes the results to the in-memory graph as node properties.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
batch_size (int) – Number of nodes to process in each batch.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
tolerance (float) – Minimum change in scores between iterations.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Result containing community statistics and timing information
- Return type:
ModularityOptimizationMutateResult
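
The tolerance and max_iterations parameters together describe a stopping rule: keep iterating while the score still improves by at least the tolerance, but never beyond the iteration cap. A generic sketch of such a loop follows. The function, its return shape and the `step` callback are hypothetical, and the exact comparison the library uses is an assumption.

```python
def iterate_until_converged(step, initial_score, max_iterations=10, tolerance=0.0001):
    """Call step(score) until the gain drops below tolerance or the cap is hit.

    Returns (final_score, iterations_run, did_converge).
    """
    score = initial_score
    for iteration in range(1, max_iterations + 1):
        new_score = step(score)
        gain = new_score - score
        score = new_score
        if gain < tolerance:
            return score, iteration, True
    return score, max_iterations, False


def halve_gap(score):
    """Toy update: move the score halfway towards 0.5 on every call."""
    return score + (0.5 - score) / 2


loose = iterate_until_converged(halve_gap, 0.0, max_iterations=10, tolerance=0.05)
strict = iterate_until_converged(halve_gap, 0.0, max_iterations=10, tolerance=0.0001)
```

With the loose tolerance the loop stops early and reports convergence; with the default tolerance the toy update is still improving after ten rounds, so the cap ends the run instead.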
- abstract stats(G: GraphV2, *, batch_size: int = 10000, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, tolerance: float = 0.0001, username: str | None = None) ModularityOptimizationStatsResult¶
Executes the Modularity Optimization algorithm and returns statistics about the communities.
- Parameters:
G (GraphV2) – Graph object to use
batch_size (int) – Number of nodes to process in each batch.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
tolerance (float) – Minimum change in scores between iterations.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Result containing community statistics and timing information
- Return type:
ModularityOptimizationStatsResult
- abstract stream(G: GraphV2, *, batch_size: int = 10000, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, tolerance: float = 0.0001, username: str | None = None) DataFrame¶
Executes the Modularity Optimization algorithm and returns the results as a DataFrame.
- Parameters:
G (GraphV2) – Graph object to use
batch_size (int) – Number of nodes to process in each batch.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
min_community_size (int | None) – Minimum size for communities to be included in results.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
tolerance (float) – Minimum change in scores between iterations.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
A DataFrame with columns ‘nodeId’ and ‘communityId’
- Return type:
DataFrame
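
The consecutive_ids option can be read as a relabeling of community identifiers into a dense range. The sketch below shows one way such a relabeling could work; numbering by order of first appearance is an assumption, since the documentation does not say which numbering the library picks, and the function name is hypothetical.

```python
def to_consecutive_ids(community_of):
    """Relabel community ids as 0, 1, 2, ... in order of first appearance.

    community_of: dict mapping node id -> arbitrary community id.
    """
    mapping = {}
    relabeled = {}
    for node, community in community_of.items():
        if community not in mapping:
            mapping[community] = len(mapping)  # next unused dense id
        relabeled[node] = mapping[community]
    return relabeled


sparse_ids = {0: 42, 1: 42, 2: 7, 3: 1000}
dense_ids = to_consecutive_ids(sparse_ids)
```

Nodes that shared a community before the relabeling still share one afterwards; only the labels change.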
- abstract write(G: GraphV2, write_property: str, *, batch_size: int = 10000, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, max_iterations: int = 10, min_community_size: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], relationship_weight_property: str | None = None, seed_property: str | None = None, sudo: bool = False, tolerance: float = 0.0001, username: str | None = None, write_concurrency: int | None = None) ModularityOptimizationWriteResult¶
Executes the Modularity Optimization algorithm and writes the results back to the database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
batch_size (int) – Number of nodes to process in each batch.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
max_iterations (int) – Maximum number of iterations to run.
min_community_size (int | None) – Minimum size for communities to be included in results.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_weight_property (str | None) – Name of the property to be used as weights.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
sudo (bool) – Disable the memory guard.
tolerance (float) – Minimum change in scores between iterations.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Result containing community statistics and timing information
- Return type:
ModularityOptimizationWriteResult
- pydantic model graphdatascience.procedure_surface.api.community.ModularityOptimizationMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.ModularityOptimizationStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.ModularityOptimizationWriteResult¶
- pydantic model graphdatascience.procedure_surface.api.community.ModularityStatsResult¶
Result object for the modularity stats algorithm.
- class graphdatascience.procedure_surface.api.community.SccEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, consecutive_ids: bool = False, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*']) EstimationResult¶
Estimate the memory consumption of an SCC algorithm run.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
- Returns:
Memory estimation details
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) SccMutateResult¶
Runs the Strongly Connected Components algorithm and stores the results in the graph catalog as a new node property.
The Strongly Connected Components (SCC) algorithm finds maximal sets of connected nodes in a directed graph. A set is considered a strongly connected component if every node in the set can reach every other node in the set along a directed path.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None, default=None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
- abstract stats(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) SccStatsResult¶
Executes the SCC algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics
- Return type:
- abstract stream(G: GraphV2, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) DataFrame¶
Executes the SCC algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
DataFrame with the algorithm results
- Return type:
DataFrame
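The `consecutive_ids` option can be pictured as a remapping of arbitrary component identifiers onto 0, 1, 2, and so on. The sketch below shows one such remapping in plain Python. The helper name `to_consecutive_ids` is hypothetical, and the numbering order (first appearance) is an assumption; the order the library uses is not stated here.

```python
def to_consecutive_ids(component_by_node):
    """Remap arbitrary component ids to 0, 1, 2, ... by first appearance."""
    remap = {}
    result = {}
    for node, component in component_by_node.items():
        if component not in remap:
            # The next unused consecutive id equals the number seen so far.
            remap[component] = len(remap)
        result[node] = remap[component]
    return result


# Hypothetical raw output: node id -> component id.
raw = {0: 17, 1: 17, 2: 42, 3: 5, 4: 42}
consecutive = to_consecutive_ids(raw)
```

Nodes that shared a component before the remapping still share one afterwards; only the identifiers change.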
- abstract write(G: GraphV2, write_property: str, *, concurrency: int | None = None, consecutive_ids: bool = False, job_id: str | None = None, log_progress: bool = True, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) SccWriteResult¶
Executes the SCC algorithm and writes the results to the Neo4j database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
consecutive_ids (bool) – Use consecutive IDs for the components.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Algorithm metrics and statistics
- Return type:
- pydantic model graphdatascience.procedure_surface.api.community.SccMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.SccStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.SccWriteResult¶
- class graphdatascience.procedure_surface.api.community.SllpaEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], *, max_iterations: int, concurrency: int | None = None, min_association_strength: float = 0.2, node_labels: list[str] = ['*'], partitioning: str = 'RANGE', relationship_types: list[str] = ['*']) EstimationResult¶
Estimates the memory consumption for running the Speaker-Listener Label Propagation algorithm (SLLPA).
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
concurrency (int | None) – Number of concurrent threads to use.
min_association_strength (float, default=0.2) – Minimum association strength for community assignment.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
partitioning (str, default='RANGE') – Partitioning configuration for the algorithm.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
max_iterations (int) – Maximum number of iterations to run.
- Returns:
An object containing the memory estimation
- Return type:
- abstract mutate(G: GraphV2, mutate_property: str, *, max_iterations: int, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, min_association_strength: float = 0.2, node_labels: list[str] = ['*'], partitioning: str = 'RANGE', relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) SllpaMutateResult¶
Executes the Speaker-Listener Label Propagation algorithm (SLLPA) and writes the results to the in-memory graph as node properties.
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
max_iterations (int) – Maximum number of iterations to run.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
min_association_strength (float, default=0.2) – Minimum association strength for community assignment.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
partitioning (str, default='RANGE') – Partitioning configuration for the algorithm.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
An object containing metadata about the algorithm execution and the mutation
- Return type:
- abstract stats(G: GraphV2, *, max_iterations: int, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, min_association_strength: float = 0.2, node_labels: list[str] = ['*'], partitioning: str = 'RANGE', relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) SllpaStatsResult¶
Executes the Speaker-Listener Label Propagation algorithm (SLLPA) and returns statistics about the communities.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
min_association_strength (float, default=0.2) – Minimum association strength for community assignment.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
partitioning (str, default='RANGE') – Partitioning configuration for the algorithm.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
max_iterations (int) – Maximum number of iterations to run.
- Returns:
An object containing statistics about the algorithm execution
- Return type:
- abstract stream(G: GraphV2, *, max_iterations: int, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, min_association_strength: float = 0.2, node_labels: list[str] = ['*'], partitioning: str = 'RANGE', relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) DataFrame¶
Executes the Speaker-Listener Label Propagation algorithm (SLLPA) and returns the results as a DataFrame.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
min_association_strength (float, default=0.2) – Minimum association strength for community assignment.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
partitioning (str, default='RANGE') – Partitioning configuration for the algorithm.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
max_iterations (int) – Maximum number of iterations to run.
- Returns:
DataFrame containing node IDs and their community values
- Return type:
DataFrame
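One way to read `min_association_strength` is as a cut-off on how often a label appears in a node's label memory: labels below the cut-off are discarded, and the remaining labels are the node's (possibly overlapping) communities. The sketch below follows that reading of speaker-listener label propagation post-processing. The function name, the list-based memory format, and the inclusive `>=` comparison are all assumptions for illustration, not the library's confirmed behaviour.

```python
from collections import Counter


def communities_above_threshold(label_memory, min_association_strength=0.2):
    """Keep labels whose share of a node's label memory reaches the threshold."""
    counts = Counter(label_memory)
    total = len(label_memory)
    return sorted(
        label
        for label, count in counts.items()
        if count / total >= min_association_strength
    )


# Hypothetical memory of one node after ten iterations: label 7 was heard
# six times, label 3 three times, label 9 once.
memory = [7, 7, 7, 7, 7, 7, 3, 3, 3, 9]
kept = communities_above_threshold(memory, min_association_strength=0.2)
```

Raising the threshold makes community assignment stricter; with 0.5 only label 7 would remain in this example.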
- abstract write(G: GraphV2, write_property: str, *, max_iterations: int, concurrency: int | None = None, job_id: str | None = None, log_progress: bool = True, min_association_strength: float = 0.2, node_labels: list[str] = ['*'], partitioning: str = 'RANGE', relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) SllpaWriteResult¶
Executes the Speaker-Listener Label Propagation algorithm (SLLPA) and writes the results back to the database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
log_progress (bool) – Display progress logging.
min_association_strength (float, default=0.2) – Minimum association strength for community assignment.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
partitioning (str, default='RANGE') – Partitioning configuration for the algorithm.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
max_iterations (int) – Maximum number of iterations to run.
- Returns:
An object containing metadata about the algorithm execution and the write operation
- Return type:
- pydantic model graphdatascience.procedure_surface.api.community.SllpaMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.SllpaStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.SllpaWriteResult¶
- class graphdatascience.procedure_surface.api.community.TriangleCountEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], *, concurrency: int | None = None, label_filter: list[str] | None = None, max_degree: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*']) EstimationResult¶
Estimate the memory requirements for running the Triangle Count algorithm.
This method provides memory estimates without actually running the algorithm, helping you determine if you have sufficient memory available.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
concurrency (int | None) – Number of concurrent threads to use.
label_filter (list[str] | None, default=None) – Filter triangles by node labels. Only triangles where all nodes have one of the specified labels will be counted.
max_degree (int | None, default=None) – Maximum degree of nodes to consider. Nodes with higher degrees will be excluded from triangle counting to improve performance.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
- Returns:
The memory estimation result including required memory in bytes and as heap percentage
- Return type:
- abstract mutate(G: GraphV2, mutate_property: str, *, concurrency: int | None = None, job_id: str | None = None, label_filter: list[str] | None = None, log_progress: bool = True, max_degree: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) TriangleCountMutateResult¶
Executes the Triangle Count algorithm and writes the results to the in-memory graph as node properties.
The Triangle Count algorithm computes the number of triangles each node participates in.
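As a rough illustration of per-node triangle counting and of the `max_degree` cut-off, the following pure-Python sketch counts, for each node, the pairs of its neighbours that are themselves connected. It assumes an undirected graph given as an edge list and simply omits nodes above the degree cap from the result; how the library reports such nodes is not covered here. The function name `triangle_counts` is hypothetical.

```python
from itertools import combinations


def triangle_counts(edges, max_degree=None):
    """Count triangles per node in an undirected graph given as (u, v) pairs."""
    neighbours = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops cannot form triangles
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    # Nodes above the degree cap are left out of the computation entirely.
    skipped = {
        node
        for node, adjacent in neighbours.items()
        if max_degree is not None and len(adjacent) > max_degree
    }

    counts = {node: 0 for node in neighbours if node not in skipped}
    for node in counts:
        # Every connected pair of remaining neighbours closes one triangle.
        for a, b in combinations(neighbours[node] - skipped, 2):
            if b in neighbours[a]:
                counts[node] += 1
    return counts


# Two triangles that share node 3: (1, 2, 3) and (3, 4, 5).
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (3, 5), (4, 5)]
per_node = triangle_counts(edges)
global_count = sum(per_node.values()) // 3
capped = triangle_counts(edges, max_degree=3)
```

Each triangle is seen once from each of its three corners, which is why the global count divides the per-node sum by three. With `max_degree=3`, node 3 (degree 4) is excluded and no triangles remain.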
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
label_filter (list[str] | None, default=None) – Filter triangles by node labels. Only triangles where all nodes have one of the specified labels will be counted.
log_progress (bool) – Display progress logging.
max_degree (int | None, default=None) – Maximum degree of nodes to consider. Nodes with higher degrees will be excluded from triangle counting to improve performance.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm metrics and statistics including the global triangle count and processing times
- Return type:
- abstract stats(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, label_filter: list[str] | None = None, log_progress: bool = True, max_degree: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) TriangleCountStatsResult¶
Executes the Triangle Count algorithm and returns statistics about the computation.
This method computes triangle counts without storing results in the graph, providing aggregate statistics about the triangle structure of the graph.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
label_filter (list[str] | None, default=None) – Filter triangles by node labels. Only triangles where all nodes have one of the specified labels will be counted.
log_progress (bool) – Display progress logging.
max_degree (int | None, default=None) – Maximum degree of nodes to consider. Nodes with higher degrees will be excluded from triangle counting to improve performance.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Algorithm statistics including the global triangle count and processing times
- Return type:
- abstract stream(G: GraphV2, *, concurrency: int | None = None, job_id: str | None = None, label_filter: list[str] | None = None, log_progress: bool = True, max_degree: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None) DataFrame¶
Executes the Triangle Count algorithm and returns a stream of results.
The Triangle Count algorithm computes the number of triangles each node participates in. This method returns the triangle count for each node as a stream.
- Parameters:
G (GraphV2) – Graph object to use
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
label_filter (list[str] | None, default=None) – Filter triangles by node labels. Only triangles where all nodes have one of the specified labels will be counted.
log_progress (bool) – Display progress logging.
max_degree (int | None, default=None) – Maximum degree of nodes to consider. Nodes with higher degrees will be excluded from triangle counting to improve performance.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
A DataFrame with two columns: nodeId (the node identifier) and triangleCount (the number of triangles the node participates in).
- Return type:
DataFrame
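Reading the `label_filter` description literally, a triangle counts only if each of its three nodes carries at least one of the listed labels. The sketch below applies that rule to a small undirected edge list. The function name, the `labels_by_node` mapping, and the label names are hypothetical, and the interpretation of the filter is an assumption based on the parameter text above.

```python
from itertools import combinations


def filtered_triangles(edges, labels_by_node, label_filter):
    """List triangles whose three nodes each carry at least one allowed label."""
    allowed = {
        node
        for node, labels in labels_by_node.items()
        if set(labels) & set(label_filter)
    }
    # Only edges between allowed nodes can contribute to a counted triangle.
    neighbours = {}
    for u, v in edges:
        if u in allowed and v in allowed and u != v:
            neighbours.setdefault(u, set()).add(v)
            neighbours.setdefault(v, set()).add(u)

    triangles = set()
    for node, adjacent in neighbours.items():
        for a, b in combinations(adjacent, 2):
            if b in neighbours[a]:
                triangles.add(frozenset((node, a, b)))
    return triangles


edges = [(1, 2), (2, 3), (1, 3), (3, 4), (3, 5), (4, 5)]
labels_by_node = {
    1: ["Person"],
    2: ["Person"],
    3: ["Person"],
    4: ["Company"],
    5: ["Person"],
}
people_only = filtered_triangles(edges, labels_by_node, ["Person"])
```

The triangle (3, 4, 5) is dropped because node 4 has no label in the filter, leaving only (1, 2, 3).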
- abstract write(G: GraphV2, write_property: str, *, concurrency: int | None = None, job_id: str | None = None, label_filter: list[str] | None = None, log_progress: bool = True, max_degree: int | None = None, node_labels: list[str] = ['*'], relationship_types: list[str] = ['*'], sudo: bool = False, username: str | None = None, write_concurrency: int | None = None) TriangleCountWriteResult¶
Executes the Triangle Count algorithm and writes the results back to the database.
This method computes triangle counts and writes the results directly to the Neo4j database as node properties, making them available for subsequent Cypher queries.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
label_filter (list[str] | None, default=None) – Filter triangles by node labels. Only triangles where all nodes have one of the specified labels will be counted.
log_progress (bool) – Display progress logging.
max_degree (int | None, default=None) – Maximum degree of nodes to consider. Nodes with higher degrees will be excluded from triangle counting to improve performance.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
sudo (bool) – Disable the memory guard.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Algorithm metrics and statistics including the global triangle count and processing times
- Return type:
- pydantic model graphdatascience.procedure_surface.api.community.TriangleCountMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.TriangleCountStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.TriangleCountWriteResult¶
- class graphdatascience.procedure_surface.api.community.WccEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], threshold: float = 0.0, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) EstimationResult¶
Estimate the memory consumption of an algorithm run.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
threshold (float, default=0.0) – The minimum required weight to consider a relationship during traversal
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
concurrency (int | None) – Number of concurrent threads to use.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
consecutive_ids (bool) – Use consecutive IDs for the components.
relationship_weight_property (str | None) – Name of the property to be used as weights.
- Returns:
Memory estimation details
- Return type:
- abstract mutate(G: GraphV2, mutate_property: str, threshold: float = 0.0, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) WccMutateResult¶
Runs the Weakly Connected Components algorithm and stores the results in the graph catalog as a new node property.
The Weakly Connected Components (WCC) algorithm finds sets of connected nodes in directed and undirected graphs where two nodes are connected if there exists a path between them. In contrast to Strongly Connected Components (SCC), the direction of relationships on the path between two nodes is not considered.
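The idea described above, including the effect of `threshold`, can be sketched with a small union-find structure that merges the endpoints of every relationship regardless of its direction. This is an illustrative sketch rather than the library's implementation. The function name and the `(source, target, weight)` triple format are assumptions, and treating the threshold as a strict lower bound (`weight > threshold`) is also an assumption.

```python
def weakly_connected_components(nodes, relationships, threshold=0.0):
    """Union-find over (source, target, weight) triples, ignoring direction."""
    parent = {node: node for node in nodes}

    def find(node):
        # Path halving keeps the trees shallow.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for source, target, weight in relationships:
        # Assumption: only relationships strictly above the threshold count.
        if weight > threshold:
            root_a, root_b = find(source), find(target)
            if root_a != root_b:
                # Use the smaller root as the component id.
                parent[max(root_a, root_b)] = min(root_a, root_b)

    return {node: find(node) for node in nodes}


relationships = [(0, 1, 1.0), (2, 1, 0.5), (3, 4, 0.1)]
all_edges = weakly_connected_components(range(5), relationships)
strong_only = weakly_connected_components(range(5), relationships, threshold=0.2)
```

With the default threshold the graph splits into two components, {0, 1, 2} and {3, 4}. Raising the threshold to 0.2 ignores the weak relationship between 3 and 4, so they become separate components.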
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
threshold (float, default=0.0) – The minimum required weight to consider a relationship during traversal
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None, default=None) – Identifier for the computation.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
consecutive_ids (bool) – Use consecutive IDs for the components.
relationship_weight_property (str | None) – Name of the property to be used as weights.
- Returns:
Algorithm metrics and statistics
- Return type:
- abstract stats(G: GraphV2, threshold: float = 0.0, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) WccStatsResult¶
Executes the WCC algorithm and returns statistics.
- Parameters:
G (GraphV2) – Graph object to use
threshold (float, default=0.0) – The minimum required weight to consider a relationship during traversal
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
consecutive_ids (bool) – Use consecutive IDs for the components.
relationship_weight_property (str | None) – Name of the property to be used as weights.
- Returns:
Algorithm metrics and statistics
- Return type:
- abstract stream(G: GraphV2, min_component_size: int | None = None, threshold: float = 0.0, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None) DataFrame¶
Executes the WCC algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
min_component_size (int | None, default=None) – Minimum number of nodes a component must contain to be streamed; smaller components are omitted.
threshold (float, default=0.0) – The minimum required weight to consider a relationship during traversal
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
consecutive_ids (bool) – Use consecutive IDs for the components.
relationship_weight_property (str | None) – Name of the property to be used as weights.
- Returns:
DataFrame with the algorithm results
- Return type:
DataFrame
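The effect of `min_component_size` on streamed results can be approximated as a filter on component sizes. The sketch below works on plain `(node_id, component_id)` tuples rather than a DataFrame to stay dependency-free; the row format and the helper name `drop_small_components` are assumptions for illustration.

```python
from collections import Counter


def drop_small_components(rows, min_component_size):
    """Keep (node_id, component_id) rows whose component has enough members."""
    sizes = Counter(component for _, component in rows)
    return [
        (node, component)
        for node, component in rows
        if sizes[component] >= min_component_size
    ]


# Hypothetical results: component 0 has three nodes, 3 has two, 5 has one.
rows = [(0, 0), (1, 0), (2, 0), (3, 3), (4, 3), (5, 5)]
large = drop_small_components(rows, min_component_size=2)
```

Here the single-node component 5 is removed, while components 0 and 3 are kept.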
- abstract write(G: GraphV2, write_property: str, min_component_size: int | None = None, threshold: float = 0.0, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, seed_property: str | None = None, consecutive_ids: bool = False, relationship_weight_property: str | None = None, write_concurrency: int | None = None) WccWriteResult¶
Executes the WCC algorithm and writes the results to the Neo4j database.
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
min_component_size (int | None, default=None) – Minimum number of nodes a component must contain to be written; smaller components are omitted.
threshold (float, default=0.0) – The minimum required weight to consider a relationship during traversal
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
seed_property (str | None) – Name of the property to be used for the initial value of a node.
consecutive_ids (bool) – Use consecutive IDs for the components.
relationship_weight_property (str | None) – Name of the property to be used as weights.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Algorithm metrics and statistics
- Return type:
- pydantic model graphdatascience.procedure_surface.api.community.WccMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.community.WccStatsResult¶
- pydantic model graphdatascience.procedure_surface.api.community.WccWriteResult¶