Centrality Algorithms

class graphdatascience.procedure_surface.api.centrality.ArticleRankEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node ids to use as starting points. Can be:

    • A single node id (e.g., 42)

    • A list of node ids (e.g., [42, 43, 44])

    • A list of (node id, bias) tuples with bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

Memory estimation details

Return type:

EstimationResult
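
The `source_nodes` and `scaler` parameters accept several Python shapes. The following sketch illustrates the accepted literals (the node ids are the examples from above) together with a hypothetical shape check mirroring the type union in the signature; it is not part of the library:

```python
# Accepted shapes for the `source_nodes` parameter (ids are illustrative):
single_source = 42                       # a single node id
several_sources = [42, 43, 44]           # a list of node ids
biased_sources = [(42, 0.5), (43, 1.0)]  # (node id, bias) pairs; bias must be > 0

# Accepted shapes for the `scaler` parameter:
scaler_by_name = "Log"  # one of 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE'
scaler_with_config = {"type": "Log", "offset": 1.0}  # scaler with extra options

def is_valid_source_nodes(value) -> bool:
    """Hypothetical helper: check a value against the source_nodes type union."""
    if isinstance(value, int):
        return True
    if isinstance(value, list):
        return all(
            isinstance(v, int)
            or (isinstance(v, tuple) and len(v) == 2 and v[1] > 0)
            for v in value
        )
    return False
```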

abstract mutate(G: GraphV2, mutate_property: str, *, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) ArticleRankMutateResult

Runs the Article Rank algorithm and stores the results in the graph catalog as a new node property.

ArticleRank is a variant of the Page Rank algorithm, which measures the transitive influence of nodes. Page Rank follows the assumption that relationships originating from low-degree nodes have a higher influence than relationships from high-degree nodes. Article Rank lowers the influence of low-degree nodes by lowering the scores being sent to their neighbors in each iteration.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node ids to use as starting points. Can be:

    • A single node id (e.g., 42)

    • A list of node ids (e.g., [42, 43, 44])

    • A list of (node id, bias) tuples with bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

Algorithm metrics and statistics

Return type:

ArticleRankMutateResult
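
The update rule described above can be sketched in plain Python. This is an illustrative toy re-implementation of the idea, not the library's code: each neighbour's score is divided by its out-degree plus the average out-degree, which is what dampens the contribution of low-degree nodes relative to plain Page Rank.

```python
def article_rank(adj, damping=0.85, max_iterations=20, tolerance=1e-7):
    """Toy ArticleRank on an adjacency dict {node: [out-neighbours]}."""
    nodes = list(adj)
    n = len(nodes)
    out_deg = {v: len(adj[v]) for v in nodes}
    avg_deg = sum(out_deg.values()) / n
    score = {v: 1.0 / n for v in nodes}
    for _ in range(max_iterations):
        new = {}
        for v in nodes:
            # Sum contributions from nodes linking to v, damped by the
            # sender's out-degree plus the average out-degree.
            incoming = sum(
                score[u] / (out_deg[u] + avg_deg)
                for u in nodes if v in adj[u]
            )
            new[v] = (1 - damping) / n + damping * incoming
        delta = max(abs(new[v] - score[v]) for v in nodes)
        score = new
        if delta < tolerance:  # converged: mirrors the `tolerance` parameter
            break
    return score
```

On a small graph where two nodes point at `c` and `c` points back at `a`, the node with the most incoming links ends up with the highest score.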

abstract stats(G: GraphV2, *, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) ArticleRankStatsResult

Runs the Article Rank algorithm and returns result statistics without storing the results.

ArticleRank is a variant of the Page Rank algorithm, which measures the transitive influence of nodes. Page Rank follows the assumption that relationships originating from low-degree nodes have a higher influence than relationships from high-degree nodes. Article Rank lowers the influence of low-degree nodes by lowering the scores being sent to their neighbors in each iteration.

Parameters:
  • G (GraphV2) – Graph object to use

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node ids to use as starting points. Can be:

    • A single node id (e.g., 42)

    • A list of node ids (e.g., [42, 43, 44])

    • A list of (node id, bias) tuples with bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

Algorithm statistics

Return type:

ArticleRankStatsResult

abstract stream(G: GraphV2, *, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) DataFrame

Executes the ArticleRank algorithm and returns the results as a stream.

Parameters:
  • G (GraphV2) – Graph object to use

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node ids to use as starting points. Can be:

    • A single node id (e.g., 42)

    • A list of node ids (e.g., [42, 43, 44])

    • A list of (node id, bias) tuples with bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

DataFrame with node IDs and their ArticleRank scores

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, *, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None, write_concurrency: int | None = None) ArticleRankWriteResult

Runs the Article Rank algorithm and stores the result in the Neo4j database as a new node property.

ArticleRank is a variant of the Page Rank algorithm, which measures the transitive influence of nodes. Page Rank follows the assumption that relationships originating from low-degree nodes have a higher influence than relationships from high-degree nodes. Article Rank lowers the influence of low-degree nodes by lowering the scores being sent to their neighbors in each iteration.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node ids to use as starting points. Can be:

    • A single node id (e.g., 42)

    • A list of node ids (e.g., [42, 43, 44])

    • A list of (node id, bias) tuples with bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

ArticleRankWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.ArticleRankMutateResult

Result of running ArticleRank algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.ArticleRankStatsResult

Result of running ArticleRank algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.ArticleRankWriteResult

Result of running ArticleRank algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.ArticulationPointsEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

An object containing the result of the estimation including memory requirements

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ArticulationPointsMutateResult

Runs the Articulation Points algorithm and stores the results in the graph catalog as a new node property.

Given a graph, an articulation point is a node whose removal increases the number of connected components in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the count of articulation points found

Return type:

ArticulationPointsMutateResult
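
The definition above (a node whose removal increases the number of connected components) can be checked directly on a small undirected graph. A brute-force illustration of the definition, not the library's implementation:

```python
def count_components(adj, excluded=None):
    """Count connected components of an undirected adjacency dict,
    optionally ignoring one excluded node."""
    excluded = excluded or set()
    seen, components = set(), 0
    for start in adj:
        if start in seen or start in excluded:
            continue
        components += 1
        stack = [start]
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            stack.extend(w for w in adj[v] if w not in seen and w not in excluded)
    return components

def articulation_points(adj):
    """Nodes whose removal increases the component count."""
    base = count_components(adj)
    return [v for v in adj if count_components(adj, excluded={v}) > base]
```

On the path graph a–b–c, removing the middle node splits the graph in two, so `b` is the only articulation point.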

abstract stats(G: GraphV2, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ArticulationPointsStatsResult

Runs the Articulation Points algorithm and returns result statistics without storing the results.

Given a graph, an articulation point is a node whose removal increases the number of connected components in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the count of articulation points found

Return type:

ArticulationPointsStatsResult

abstract stream(G: GraphV2, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the ArticulationPoints algorithm and returns results as a stream.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

A DataFrame containing articulation points with columns:

• nodeId: The ID of the articulation point

• resultingComponents: Information about the resulting components

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) ArticulationPointsWriteResult

Runs the Articulation Points algorithm and stores the result in the Neo4j database as a new node property.

Given a graph, an articulation point is a node whose removal increases the number of connected components in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the count of articulation points found

Return type:

ArticulationPointsWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.ArticulationPointsMutateResult

Result of running ArticulationPoints algorithm with mutate mode.

field articulation_point_count: int
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
pydantic model graphdatascience.procedure_surface.api.centrality.ArticulationPointsStatsResult

Result of running ArticulationPoints algorithm with stats mode.

field articulation_point_count: int
field compute_millis: int
field configuration: dict[str, Any]
pydantic model graphdatascience.procedure_surface.api.centrality.ArticulationPointsWriteResult

Result of running ArticulationPoints algorithm with write mode.

field articulation_point_count: int
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.BetweennessEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], sampling_size: int | None = None, sampling_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, relationship_weight_property: str | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • sampling_size (int | None, default=None) – The number of nodes to use for sampling.

  • sampling_seed (int | None, default=None) – The seed value for sampling randomization.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, sampling_size: int | None = None, sampling_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) BetweennessMutateResult

Runs the Betweenness Centrality algorithm and stores the results in the graph catalog as a new node property.

Betweenness centrality is a way of detecting the amount of influence a node has over the flow of information in a graph. It is often used to find nodes that serve as a bridge from one part of a graph to another. The algorithm calculates shortest paths between all pairs of nodes in a graph. Each node receives a score, based on the number of shortest paths that pass through the node.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • sampling_size (int | None, default=None) – Number of source nodes to consider for computing centrality scores.

  • sampling_seed (int | None, default=None) – Seed value for the random number generator that selects source nodes.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm metrics and statistics including centrality distribution

Return type:

BetweennessMutateResult
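
The score described above (the number of shortest paths that pass through a node) can be computed by brute force on a tiny unweighted, undirected graph. This naive sketch illustrates the definition only; it is not the optimized algorithm the library runs:

```python
from collections import deque
from itertools import permutations

def shortest_path_counts(adj, s):
    """BFS from s: distance and number of shortest paths to every reachable node."""
    dist, paths = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                paths[w] = paths[u]
                queue.append(w)
            elif dist[w] == dist[u] + 1:
                paths[w] += paths[u]
    return dist, paths

def betweenness(adj):
    """For each node v, sum over ordered pairs (s, t) the fraction of
    s-t shortest paths that pass through v (undirected adjacency dict)."""
    score = {v: 0.0 for v in adj}
    for s, t in permutations(adj, 2):
        dist_s, paths_s = shortest_path_counts(adj, s)
        dist_t, paths_t = shortest_path_counts(adj, t)
        if t not in dist_s:
            continue
        d = dist_s[t]
        for v in adj:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            # v lies on an s-t shortest path iff the distances add up.
            if dist_s[v] + dist_t[v] == d:
                score[v] += paths_s[v] * paths_t[v] / paths_s[t]
    return score
```

On the path graph a–b–c, every shortest path between the endpoints runs through the middle node, which therefore collects all the score.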

abstract stats(G: GraphV2, sampling_size: int | None = None, sampling_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) BetweennessStatsResult

Runs the Betweenness Centrality algorithm and returns result statistics without storing the results.

Betweenness centrality is a way of detecting the amount of influence a node has over the flow of information in a graph. It is often used to find nodes that serve as a bridge from one part of a graph to another. The algorithm calculates shortest paths between all pairs of nodes in a graph. Each node receives a score, based on the number of shortest paths that pass through the node.

Parameters:
  • G (GraphV2) – Graph object to use

  • sampling_size (int | None, default=None) – Number of source nodes to consider for computing centrality scores.

  • sampling_seed (int | None, default=None) – Seed value for the random number generator that selects source nodes.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm statistics including centrality distribution

Return type:

BetweennessStatsResult

abstract stream(G: GraphV2, sampling_size: int | None = None, sampling_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) DataFrame

Executes the Betweenness Centrality algorithm and returns the results as a stream.

Parameters:
  • G (GraphV2) – Graph object to use

  • sampling_size (int | None, default=None) – The number of nodes to use for sampling.

  • sampling_seed (int | None, default=None) – The seed value for sampling randomization.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

DataFrame with nodeId and score columns containing betweenness centrality results

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, sampling_size: int | None = None, sampling_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, write_concurrency: int | None = None) BetweennessWriteResult

Runs the Betweenness Centrality algorithm and stores the result in the Neo4j database as a new node property.

Betweenness centrality is a way of detecting the amount of influence a node has over the flow of information in a graph. It is often used to find nodes that serve as a bridge from one part of a graph to another. The algorithm calculates shortest paths between all pairs of nodes in a graph. Each node receives a score, based on the number of shortest paths that pass through the node.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • sampling_size (int | None, default=None) – Number of source nodes to consider for computing centrality scores.

  • sampling_seed (int | None, default=None) – Seed value for the random number generator that selects source nodes.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including centrality distribution

Return type:

BetweennessWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.BetweennessMutateResult

Result of running Betweenness Centrality algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.BetweennessStatsResult

Result of running Betweenness Centrality algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.BetweennessWriteResult

Result of running Betweenness Centrality algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.BridgesEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract stream(G: GraphV2, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the Bridges algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with ‘from’, ‘to’ and ‘remainingSizes’ columns. The remainingSizes column contains the sizes of the remaining connected components after removing the bridge relationship.

Return type:

DataFrame

class graphdatascience.procedure_surface.api.centrality.CelfEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], seed_set_size: int, propagation_probability: float = 0.1, monte_carlo_simulations: int = 100, random_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • seed_set_size (int) – The number of nodes to select as the seed set for influence maximization.

  • propagation_probability (float) – The probability that influence spreads from one node to another.

  • monte_carlo_simulations (int) – The number of Monte-Carlo simulations.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

An object containing the result of the estimation including memory requirements

Return type:

EstimationResult

abstract mutate(G: GraphV2, seed_set_size: int, mutate_property: str, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], monte_carlo_simulations: int = 100, propagation_probability: float = 0.1, random_seed: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) CelfMutateResult

Runs the CELF algorithm and stores the results in the graph catalog as a new node property.

The influence maximization problem asks for a set of k nodes that maximize the expected spread of influence in the network.

Parameters:
  • G (GraphV2) – Graph object to use

  • seed_set_size (int) – The number of nodes to select as the seed set for influence maximization

  • mutate_property (str) – Name of the node property to store the results in.

  • propagation_probability (float, default=0.1) – Probability of a node being activated by an active neighbour node.

  • monte_carlo_simulations (int, default=100) – Number of Monte-Carlo simulations.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the total influence spread

Return type:

CelfMutateResult

abstract stats(G: GraphV2, seed_set_size: int, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], monte_carlo_simulations: int = 100, propagation_probability: float = 0.1, random_seed: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) CelfStatsResult

Runs the CELF algorithm and returns result statistics without storing the results.

The influence maximization problem asks for a set of k nodes that maximize the expected spread of influence in the network.

Parameters:
  • G (GraphV2) – Graph object to use

  • seed_set_size (int) – The number of nodes to select as the seed set for influence maximization

  • propagation_probability (float, default=0.1) – Probability of a node being activated by an active neighbour node.

  • monte_carlo_simulations (int, default=100) – Number of Monte-Carlo simulations.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the total influence spread

Return type:

CelfStatsResult

abstract stream(G: GraphV2, seed_set_size: int, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], monte_carlo_simulations: int = 100, propagation_probability: float = 0.1, random_seed: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the CELF algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • seed_set_size (int) – The number of nodes to select as the seed set for influence maximization

  • propagation_probability (float, default=0.1) – The probability that influence spreads from one node to another.

  • monte_carlo_simulations (int, default=100) – The number of Monte-Carlo simulations.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with nodeId and spread columns containing CELF results. Each row represents a selected node with its corresponding influence spread value.

Return type:

DataFrame
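The spread column reported for each seed node can be understood through the quantity CELF optimizes: the expected number of nodes activated under a cascade model, estimated by Monte-Carlo simulation. The sketch below estimates that spread for a fixed seed set under the independent cascade model; it is an assumption-laden illustration (the `adj` graph and `estimate_spread` helper are hypothetical), not the CELF selection algorithm itself, which greedily grows the seed set with lazy evaluations.

```python
import random

def estimate_spread(adj: dict[int, list[int]], seeds: list[int],
                    propagation_probability: float = 0.1,
                    monte_carlo_simulations: int = 100,
                    random_seed: int = 42) -> float:
    """Monte-Carlo estimate of expected influence spread under the
    independent cascade model."""
    rng = random.Random(random_seed)
    total = 0
    for _ in range(monte_carlo_simulations):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v in adj.get(u, []):
                    # Each newly active node gets one chance to activate
                    # each inactive out-neighbour.
                    if v not in active and rng.random() < propagation_probability:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / monte_carlo_simulations

adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
spread = estimate_spread(adj, seeds=[0], propagation_probability=0.5)
# The seed itself is always active, so 1.0 <= spread <= 4.0.
```

Raising `propagation_probability` or `monte_carlo_simulations` trades accuracy of the estimate against runtime, which is why both appear as tuning parameters above.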

abstract write(G: GraphV2, seed_set_size: int, write_property: str, propagation_probability: float = 0.1, monte_carlo_simulations: int = 100, random_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) CelfWriteResult

Runs the CELF algorithm and stores the result in the Neo4j database as a new node property.

The influence maximization problem asks for a set of k nodes that maximize the expected spread of influence in the network.

Parameters:
  • G (GraphV2) – Graph object to use

  • seed_set_size (int) – The number of nodes to select as the seed set for influence maximization

  • write_property (str) – Name of the node property to store the results in.

  • propagation_probability (float, default=0.1) – Probability of a node being activated by an active neighbour node.

  • monte_carlo_simulations (int, default=100) – Number of Monte-Carlo simulations.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the total influence spread and write timing

Return type:

CelfWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.CelfMutateResult

Result of running CELF algorithm with mutate mode.

field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_count: int
field node_properties_written: int
field total_spread: float
pydantic model graphdatascience.procedure_surface.api.centrality.CelfStatsResult

Result of running CELF algorithm with stats mode.

field compute_millis: int
field configuration: dict[str, Any]
field node_count: int
field total_spread: float
pydantic model graphdatascience.procedure_surface.api.centrality.CelfWriteResult

Result of running CELF algorithm with write mode.

field compute_millis: int
field configuration: dict[str, Any]
field node_count: int
field node_properties_written: int
field total_spread: float
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.ClosenessEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], use_wasserman_faust: bool = False, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • use_wasserman_faust (bool, default=False) – Use the improved Wasserman-Faust formula for closeness computation.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

An object containing the result of the estimation

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, use_wasserman_faust: bool = False, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ClosenessMutateResult

Runs the Closeness Centrality algorithm and stores the results in the graph catalog as a new node property.

Closeness centrality is a way of detecting nodes that are able to spread information very efficiently through a graph. The closeness centrality of a node measures its average farness (inverse distance) to all other nodes. Nodes with a high closeness score have the shortest distances to all other nodes.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • use_wasserman_faust (bool) – Use the improved Wasserman-Faust formula for closeness computation.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

ClosenessMutateResult

abstract stats(G: GraphV2, use_wasserman_faust: bool = False, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ClosenessStatsResult

Runs the Closeness Centrality algorithm and returns result statistics without storing the results.

Closeness centrality is a way of detecting nodes that are able to spread information very efficiently through a graph. The closeness centrality of a node measures its average farness (inverse distance) to all other nodes. Nodes with a high closeness score have the shortest distances to all other nodes.

Parameters:
  • G (GraphV2) – Graph object to use

  • use_wasserman_faust (bool) – Use the improved Wasserman-Faust formula for closeness computation.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the centrality distribution

Return type:

ClosenessStatsResult

abstract stream(G: GraphV2, use_wasserman_faust: bool = False, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the Closeness Centrality algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • use_wasserman_faust (bool) – Use the improved Wasserman-Faust formula for closeness computation.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with nodeId and score columns containing closeness centrality results. Each row represents a node with its corresponding closeness centrality score.

Return type:

DataFrame
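The scores in the stream can be illustrated with a small pure-Python sketch of the two closeness formulas: the standard score divides the number of reachable nodes minus one by the sum of shortest-path distances (farness), and the Wasserman-Faust variant additionally scales by the fraction of the graph that is reachable, which keeps scores comparable on disconnected graphs. This is a sketch of the formulas only, not the GDS implementation.

```python
from collections import deque

def closeness(adj: dict[int, list[int]], use_wasserman_faust: bool = False) -> dict[int, float]:
    """Closeness centrality via per-node BFS on an unweighted graph."""
    n = len(adj)
    scores: dict[int, float] = {}
    for s in adj:
        # BFS shortest-path distances from s.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        farness = sum(dist.values())
        r = len(dist)  # reachable nodes, including s
        if farness == 0:
            scores[s] = 0.0
        elif use_wasserman_faust:
            # Scale by the fraction of reachable nodes.
            scores[s] = ((r - 1) / (n - 1)) * ((r - 1) / farness)
        else:
            scores[s] = (r - 1) / farness
    return scores

# Path graph 0 - 1 - 2: the middle node is closest to everything.
path = {0: [1], 1: [0, 2], 2: [1]}
print(closeness(path))  # node 1 scores 1.0, the endpoints 2/3
```

On a connected graph both formulas agree; they diverge only when some nodes are unreachable.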

abstract write(G: GraphV2, write_property: str, use_wasserman_faust: bool = False, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) ClosenessWriteResult

Runs the Closeness Centrality algorithm and stores the result in the Neo4j database as a new node property.

Closeness centrality is a way of detecting nodes that are able to spread information very efficiently through a graph. The closeness centrality of a node measures its average farness (inverse distance) to all other nodes. Nodes with a high closeness score have the shortest distances to all other nodes.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • use_wasserman_faust (bool, default=False) – Use the improved Wasserman-Faust formula for closeness computation.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the number of properties written

Return type:

ClosenessWriteResult

class graphdatascience.procedure_surface.api.centrality.ClosenessHarmonicEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of a Harmonic Closeness Centrality algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

An object containing the result of the estimation

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ClosenessHarmonicMutateResult

Runs the Harmonic Centrality algorithm and stores the results in the graph catalog as a new node property.

Harmonic centrality (also known as valued centrality) is a variant of closeness centrality, that was invented to solve the problem the original formula had when dealing with unconnected graphs.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

ClosenessHarmonicMutateResult

abstract stats(G: GraphV2, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ClosenessHarmonicStatsResult

Runs the Harmonic Centrality algorithm and returns result statistics without storing the results.

Harmonic centrality was proposed by Marchiori and Latora while trying to come up with a sensible notion of “average shortest path”. They suggested a different way of calculating the average distance from that used in the Closeness Centrality algorithm. Rather than summing the distances of a node to all other nodes, the harmonic centrality algorithm sums the inverse of those distances. This enables it to deal with infinite values.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the centrality distribution

Return type:

ClosenessHarmonicStatsResult

abstract stream(G: GraphV2, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the Harmonic Closeness Centrality algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with the algorithm results containing nodeId and score columns

Return type:

DataFrame
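Summing inverse distances, as described above, can be sketched in a few lines of pure Python. Unreachable nodes contribute an inverse distance of 0, which is how the formula sidesteps infinite distances; note the GDS score may additionally be normalized (e.g. by n − 1), so only the relative ordering is directly comparable to this sketch.

```python
from collections import deque

def harmonic(adj: dict[int, list[int]]) -> dict[int, float]:
    """Raw harmonic centrality: sum of 1/d over shortest-path
    distances d to every reachable node."""
    scores: dict[int, float] = {}
    for s in adj:
        # BFS shortest-path distances from s.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Unreachable nodes simply do not appear in dist, contributing 0.
        scores[s] = sum(1.0 / d for d in dist.values() if d > 0)
    return scores

# Path graph 0 - 1 - 2
path = {0: [1], 1: [0, 2], 2: [1]}
print(harmonic(path))  # {0: 1.5, 1: 2.0, 2: 1.5}
```

For node 0 the sum is 1/1 + 1/2 = 1.5; for the middle node it is 1/1 + 1/1 = 2.0, so the middle node again scores highest.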

abstract write(G: GraphV2, write_property: str, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) ClosenessHarmonicWriteResult

Runs the Harmonic Centrality algorithm and stores the result in the Neo4j database as a new node property.

Harmonic centrality was proposed by Marchiori and Latora while trying to come up with a sensible notion of “average shortest path”. They suggested a different way of calculating the average distance from that used in the Closeness Centrality algorithm. Rather than summing the distances of a node to all other nodes, the harmonic centrality algorithm sums the inverse of those distances. This enables it to deal with infinite values.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

ClosenessHarmonicWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessHarmonicMutateResult

Result of running Harmonic Closeness Centrality algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessHarmonicStatsResult

Result of running Harmonic Closeness Centrality algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessHarmonicWriteResult

Result of running Harmonic Closeness Centrality algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessMutateResult

Result of running Closeness Centrality algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessStatsResult

Result of running Closeness Centrality algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessWriteResult

Result of running Closeness Centrality algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.DegreeEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], orientation: str = 'NATURAL', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, relationship_weight_property: str | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • orientation (str, default='NATURAL') – The orientation of relationships to consider. Can be ‘NATURAL’, ‘REVERSE’, or ‘UNDIRECTED’.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, orientation: str = 'NATURAL', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) DegreeMutateResult

Runs the Degree Centrality algorithm and stores the results in the graph catalog as a new node property.

The Degree Centrality algorithm can be used to find popular nodes within a graph. The degree centrality measures the number of incoming or outgoing (or both) relationships from a node, which can be defined by the orientation of a relationship projection. It can be applied to either weighted or unweighted graphs. In the weighted case the algorithm computes the sum of all positive weights of adjacent relationships of a node, for each node in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • orientation (str, default='NATURAL') – The orientation of relationships to consider. Can be ‘NATURAL’, ‘REVERSE’, or ‘UNDIRECTED’.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

DegreeMutateResult

abstract stats(G: GraphV2, orientation: str = 'NATURAL', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) DegreeStatsResult

Runs the Degree Centrality algorithm and returns result statistics without storing the results.

The Degree Centrality algorithm can be used to find popular nodes within a graph. The degree centrality measures the number of incoming or outgoing (or both) relationships from a node, which can be defined by the orientation of a relationship projection. It can be applied to either weighted or unweighted graphs. In the weighted case the algorithm computes the sum of all positive weights of adjacent relationships of a node, for each node in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • orientation (str, default='NATURAL') – The orientation of relationships to consider. Can be ‘NATURAL’, ‘REVERSE’, or ‘UNDIRECTED’.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm statistics including the centrality distribution

Return type:

DegreeStatsResult
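For intuition, the degree computation described above can be sketched in plain Python. This is a local illustration on a made-up relationship list, not the GDS implementation; the relationship data and function name are invented for the example:

```python
from collections import defaultdict

# (source, target, weight) relationships; a made-up toy graph
rels = [("a", "b", 2.0), ("a", "c", 1.0), ("b", "c", 3.0)]

def degree_centrality(rels, orientation="NATURAL", weighted=True):
    scores = defaultdict(float)
    for src, tgt, w in rels:
        value = w if weighted else 1.0
        if orientation in ("NATURAL", "UNDIRECTED"):
            scores[src] += value  # count/weigh outgoing relationships
        if orientation in ("REVERSE", "UNDIRECTED"):
            scores[tgt] += value  # count/weigh incoming relationships
    return dict(scores)

print(degree_centrality(rels))  # weighted out-degree: {'a': 3.0, 'b': 3.0}
print(degree_centrality(rels, orientation="UNDIRECTED", weighted=False))
```

Note how ‘NATURAL’ only credits the source of each relationship, ‘REVERSE’ only the target, and ‘UNDIRECTED’ both, matching the orientation parameter above.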

abstract stream(G: GraphV2, orientation: str = 'NATURAL', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) DataFrame

Executes the Degree Centrality algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • orientation (str) – The orientation of relationships to consider. Can be ‘NATURAL’, ‘REVERSE’, or ‘UNDIRECTED’. ‘NATURAL’ (default) respects the direction of relationships as they are stored in the graph. ‘REVERSE’ treats each relationship as if it were directed in the opposite direction. ‘UNDIRECTED’ treats all relationships as undirected, effectively counting both directions.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

DataFrame with nodeId and score columns containing degree centrality results. Each row represents a node with its corresponding degree centrality score.

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, orientation: str = 'NATURAL', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, write_concurrency: int | None = None) DegreeWriteResult

Runs the Degree Centrality algorithm and stores the result in the Neo4j database as a new node property.

The Degree Centrality algorithm can be used to find popular nodes within a graph. Degree centrality measures the number of incoming, outgoing, or both relationships of a node, depending on the orientation of the relationship projection. It can be applied to either weighted or unweighted graphs. In the weighted case, the algorithm computes, for each node, the sum of the positive weights of its adjacent relationships.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • orientation (str) – The orientation of relationships to consider. Can be ‘NATURAL’, ‘REVERSE’, or ‘UNDIRECTED’.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the centrality distribution and write timing

Return type:

DegreeWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.DegreeMutateResult

Result of running Degree Centrality algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.DegreeStatsResult

Result of running Degree Centrality algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.DegreeWriteResult

Result of running Degree Centrality algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.EigenvectorEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], max_iterations: int = 20, tolerance: float = 1e-07, source_nodes: int | list[int] | None = None, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • max_iterations (int) – Maximum number of iterations to run.

  • tolerance (float) – Minimum change in scores between iterations.

  • source_nodes (int | list[int] | None, default=None) – Node IDs to use as starting points. Can be a single node ID (e.g., 42) or a list of node IDs (e.g., [42, 43, 44]).

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

An object containing the result of the estimation

Return type:

EstimationResult
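For intuition about the scaler option: the sketch below shows MinMax scaling, which maps scores into [0, 1]. This is a local illustration, not the server-side scaler (configured with scaler='MinMax'), though that behaves analogously:

```python
def min_max_scale(scores):
    """Map a list of scores linearly onto [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]  # degenerate case: all scores equal
    return [(s - lo) / (hi - lo) for s in scores]

print(min_max_scale([1.0, 2.0, 5.0]))  # -> [0.0, 0.25, 1.0]
```

The other named scalers (‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’) apply different normalizations in the same post-processing step; ‘NONE’ leaves scores untouched.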

abstract mutate(G: GraphV2, mutate_property: str, max_iterations: int = 20, tolerance: float = 1e-07, source_nodes: int | list[int] | None = None, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) EigenvectorMutateResult

Runs the Eigenvector Centrality algorithm and stores the results in the graph catalog as a new node property.

Eigenvector Centrality is an algorithm that measures the transitive influence of nodes. Relationships originating from high-scoring nodes contribute more to the score of a node than connections from low-scoring nodes. A high eigenvector score means that a node is connected to many nodes that themselves have high scores. The algorithm computes the eigenvector associated with the largest absolute eigenvalue.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • max_iterations (int) – Maximum number of iterations to run.

  • tolerance (float) – Minimum change in scores between iterations.

  • source_nodes (int | list[int] | None, default=None) – Node IDs to use as starting points. Can be a single node ID (e.g., 42) or a list of node IDs (e.g., [42, 43, 44]).

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

EigenvectorMutateResult

abstract stats(G: GraphV2, max_iterations: int = 20, tolerance: float = 1e-07, source_nodes: int | list[int] | None = None, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) EigenvectorStatsResult

Runs the Eigenvector Centrality algorithm and returns result statistics without storing the results.

Eigenvector Centrality is an algorithm that measures the transitive influence of nodes. Relationships originating from high-scoring nodes contribute more to the score of a node than connections from low-scoring nodes. A high eigenvector score means that a node is connected to many nodes that themselves have high scores. The algorithm computes the eigenvector associated with the largest absolute eigenvalue.

Parameters:
  • G (GraphV2) – Graph object to use

  • max_iterations (int) – Maximum number of iterations to run.

  • tolerance (float) – Minimum change in scores between iterations.

  • source_nodes (int | list[int] | None, default=None) – Node IDs to use as starting points. Can be a single node ID (e.g., 42) or a list of node IDs (e.g., [42, 43, 44]).

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the centrality distribution

Return type:

EigenvectorStatsResult
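The power-iteration idea behind Eigenvector Centrality, together with the roles of max_iterations and tolerance, can be sketched locally. This is an illustration on a made-up adjacency list; the GDS implementation differs in details such as weighting and source-node handling:

```python
import math

adj = {"a": ["b"], "b": ["c"], "c": ["a", "b"]}  # made-up directed graph

def eigenvector_centrality(adj, max_iterations=20, tolerance=1e-7):
    scores = {n: 1.0 for n in adj}
    for _ in range(max_iterations):
        nxt = {n: 0.0 for n in adj}
        for src, targets in adj.items():
            for tgt in targets:
                nxt[tgt] += scores[src]  # influence flows along relationships
        norm = math.sqrt(sum(v * v for v in nxt.values()))
        nxt = {n: v / norm for n, v in nxt.items()}  # L2-normalize each round
        if max(abs(nxt[n] - scores[n]) for n in adj) < tolerance:
            return nxt, True  # change fell below tolerance: converged
        scores = nxt
    return scores, False  # max_iterations exhausted without converging

scores, converged = eigenvector_centrality(adj, max_iterations=100)
```

Repeated multiplication by the adjacency structure converges to the eigenvector of the largest absolute eigenvalue, which is what the did_converge and ran_iterations result fields report on.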

abstract stream(G: GraphV2, max_iterations: int = 20, tolerance: float = 1e-07, source_nodes: int | list[int] | None = None, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the Eigenvector Centrality algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • max_iterations (int) – Maximum number of iterations to run.

  • tolerance (float) – Minimum change in scores between iterations.

  • source_nodes (int | list[int] | None, default=None) – Node IDs to use as starting points. Can be a single node ID (e.g., 42) or a list of node IDs (e.g., [42, 43, 44]).

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with the algorithm results containing nodeId and score columns

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, max_iterations: int = 20, tolerance: float = 1e-07, source_nodes: int | list[int] | None = None, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) EigenvectorWriteResult

Runs the Eigenvector Centrality algorithm and stores the result in the Neo4j database as a new node property.

Eigenvector Centrality is an algorithm that measures the transitive influence of nodes. Relationships originating from high-scoring nodes contribute more to the score of a node than connections from low-scoring nodes. A high eigenvector score means that a node is connected to many nodes that themselves have high scores. The algorithm computes the eigenvector associated with the largest absolute eigenvalue.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • max_iterations (int) – Maximum number of iterations to run.

  • tolerance (float) – Minimum change in scores between iterations.

  • source_nodes (int | list[int] | None, default=None) – Node IDs to use as starting points. Can be a single node ID (e.g., 42) or a list of node IDs (e.g., [42, 43, 44]).

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

EigenvectorWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.EigenvectorMutateResult

Result of running Eigenvector Centrality algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.EigenvectorStatsResult

Result of running Eigenvector Centrality algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.EigenvectorWriteResult

Result of running Eigenvector Centrality algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.PageRankEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • damping_factor (float) – The damping factor controls the probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) – Node IDs to use as starting points for personalized PageRank. Can be a single node ID (e.g., 42), a list of node IDs (e.g., [42, 43, 44]), or a list of tuples associating each node with a bias > 0 (e.g., [(42, 0.5), (43, 1.0)]).

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) PageRankMutateResult

Runs the PageRank algorithm and stores the results in the graph catalog as a new node property.

The PageRank algorithm measures the importance of each node within the graph, based on the number of incoming relationships and the importance of the corresponding source nodes. The underlying assumption, roughly speaking, is that a page is only as important as the pages that link to it.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node IDs to use as starting points. Can be:

    • a single node ID (e.g., 42)

    • a list of node IDs (e.g., [42, 43, 44])

    • a list of tuples to associate each node with a bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

Algorithm metrics and statistics

Return type:

PageRankMutateResult

abstract stats(G: GraphV2, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) PageRankStatsResult

Runs the PageRank algorithm and returns result statistics without storing the results.

The PageRank algorithm measures the importance of each node within the graph, based on the number of incoming relationships and the importance of the corresponding source nodes. The underlying assumption, roughly speaking, is that a page is only as important as the pages that link to it.

Parameters:
  • G (GraphV2) – Graph object to use

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node IDs to use as starting points. Can be:

    • a single node ID (e.g., 42)

    • a list of node IDs (e.g., [42, 43, 44])

    • a list of tuples to associate each node with a bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

Algorithm statistics

Return type:

PageRankStatsResult
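The PageRank iteration and the roles of damping_factor, tolerance, and max_iterations can be sketched in plain Python. Illustrative only: the graph is made up and has no dangling nodes, which the real implementation must also handle:

```python
adj = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # made-up directed graph

def pagerank(adj, damping_factor=0.85, tolerance=1e-7, max_iterations=20):
    n = len(adj)
    ranks = {node: 1.0 / n for node in adj}
    for _ in range(max_iterations):
        # (1 - d)/n is the "random jump" share every node receives
        nxt = {node: (1.0 - damping_factor) / n for node in adj}
        for src, targets in adj.items():
            share = damping_factor * ranks[src] / len(targets)
            for tgt in targets:
                nxt[tgt] += share  # importance flows to linked nodes
        if max(abs(nxt[node] - ranks[node]) for node in adj) < tolerance:
            return nxt, True  # change fell below tolerance: converged
        ranks = nxt
    return ranks, False

ranks, converged = pagerank(adj, max_iterations=200)
```

A higher damping_factor follows links more and jumps less; a looser tolerance or smaller max_iterations stops the iteration earlier, which the did_converge and ran_iterations result fields report on.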

abstract stream(G: GraphV2, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) DataFrame

Executes the PageRank algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • damping_factor (float) – The damping factor controls the probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) – Node IDs to use as starting points for personalized PageRank. Can be a single node ID (e.g., 42), a list of node IDs (e.g., [42, 43, 44]), or a list of tuples associating each node with a bias > 0 (e.g., [(42, 0.5), (43, 1.0)]).

Returns:

DataFrame with node IDs and their PageRank scores

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None, write_concurrency: int | None = None) PageRankWriteResult

Runs the PageRank algorithm and stores the result in the Neo4j database as a new node property.

The PageRank algorithm measures the importance of each node within the graph, based on the number of incoming relationships and the importance of the corresponding source nodes. The underlying assumption, roughly speaking, is that a page is only as important as the pages that link to it.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node IDs to use as starting points. Can be:

    • a single node ID (e.g., 42)

    • a list of node IDs (e.g., [42, 43, 44])

    • a list of tuples to associate each node with a bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

PageRankWriteResult
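The three accepted shapes for source_nodes, shown as plain Python literals (the node IDs here are arbitrary examples):

```python
# One starting node
single = 42

# A list of starting nodes
several = [42, 43, 44]

# Each starting node paired with a bias; every bias must be > 0
biased = [(42, 0.5), (43, 1.0)]
```

Any of these values can be passed directly as the source_nodes argument to the estimate, mutate, stats, stream, or write methods above.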

pydantic model graphdatascience.procedure_surface.api.centrality.PageRankMutateResult

Result of running PageRank algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.PageRankStatsResult

Result of running PageRank algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.PageRankWriteResult

Result of running PageRank algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
field write_millis: int