Centrality Algorithms

class graphdatascience.procedure_surface.api.centrality.ArticleRankEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], *, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node ids to use as starting points. Can be:

    • A single node id (e.g., 42)

    • A list of node ids (e.g., [42, 43, 44])

    • A list of (node id, bias) tuples with bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

Memory estimation details

Return type:

EstimationResult
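
The `source_nodes` and `scaler` parameters accept several Python shapes. The following sketch illustrates the accepted literals (the node ids are the examples from above) together with a hypothetical shape check mirroring the type union in the signature; it is not part of the library:

```python
# Accepted shapes for the `source_nodes` parameter (ids are illustrative):
single_source = 42                       # a single node id
several_sources = [42, 43, 44]           # a list of node ids
biased_sources = [(42, 0.5), (43, 1.0)]  # (node id, bias) pairs; bias must be > 0

# Accepted shapes for the `scaler` parameter:
scaler_by_name = "Log"  # one of 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE'
scaler_with_config = {"type": "Log", "offset": 1.0}  # scaler with extra options

def is_valid_source_nodes(value) -> bool:
    """Hypothetical helper: check a value against the source_nodes type union."""
    if isinstance(value, int):
        return True
    if isinstance(value, list):
        return all(
            isinstance(v, int)
            or (isinstance(v, tuple) and len(v) == 2 and v[1] > 0)
            for v in value
        )
    return False
```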

abstract mutate(G: GraphV2, mutate_property: str, *, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) ArticleRankMutateResult

Runs the Article Rank algorithm and stores the results in the graph catalog as a new node property.

ArticleRank is a variant of the Page Rank algorithm, which measures the transitive influence of nodes. Page Rank follows the assumption that relationships originating from low-degree nodes have a higher influence than relationships from high-degree nodes. Article Rank lowers the influence of low-degree nodes by lowering the scores being sent to their neighbors in each iteration.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node ids to use as starting points. Can be:

    • A single node id (e.g., 42)

    • A list of node ids (e.g., [42, 43, 44])

    • A list of (node id, bias) tuples with bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

Algorithm metrics and statistics

Return type:

ArticleRankMutateResult
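
The update rule described above can be sketched in plain Python. This is an illustrative toy re-implementation of the idea, not the library's code: each neighbour's score is divided by its out-degree plus the average out-degree, which is what dampens the contribution of low-degree nodes relative to plain Page Rank.

```python
def article_rank(adj, damping=0.85, max_iterations=20, tolerance=1e-7):
    """Toy ArticleRank on an adjacency dict {node: [out-neighbours]}."""
    nodes = list(adj)
    n = len(nodes)
    out_deg = {v: len(adj[v]) for v in nodes}
    avg_deg = sum(out_deg.values()) / n
    score = {v: 1.0 / n for v in nodes}
    for _ in range(max_iterations):
        new = {}
        for v in nodes:
            # Sum contributions from nodes linking to v, damped by the
            # sender's out-degree plus the average out-degree.
            incoming = sum(
                score[u] / (out_deg[u] + avg_deg)
                for u in nodes if v in adj[u]
            )
            new[v] = (1 - damping) / n + damping * incoming
        delta = max(abs(new[v] - score[v]) for v in nodes)
        score = new
        if delta < tolerance:  # converged: mirrors the `tolerance` parameter
            break
    return score
```

On a small graph where two nodes point at `c` and `c` points back at `a`, the node with the most incoming links ends up with the highest score.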

abstract stats(G: GraphV2, *, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) ArticleRankStatsResult

Runs the Article Rank algorithm and returns result statistics without storing the results.

ArticleRank is a variant of the Page Rank algorithm, which measures the transitive influence of nodes. Page Rank follows the assumption that relationships originating from low-degree nodes have a higher influence than relationships from high-degree nodes. Article Rank lowers the influence of low-degree nodes by lowering the scores being sent to their neighbors in each iteration.

Parameters:
  • G (GraphV2) – Graph object to use

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node ids to use as starting points. Can be:

    • A single node id (e.g., 42)

    • A list of node ids (e.g., [42, 43, 44])

    • A list of (node id, bias) tuples with bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

Algorithm statistics

Return type:

ArticleRankStatsResult

abstract stream(G: GraphV2, *, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) DataFrame

Executes the ArticleRank algorithm and returns the results as a stream.

Parameters:
  • G (GraphV2) – Graph object to use

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node ids to use as starting points. Can be:

    • A single node id (e.g., 42)

    • A list of node ids (e.g., [42, 43, 44])

    • A list of (node id, bias) tuples with bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

DataFrame with node IDs and their ArticleRank scores

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, *, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None, write_concurrency: int | None = None) ArticleRankWriteResult

Runs the Article Rank algorithm and stores the result in the Neo4j database as a new node property.

ArticleRank is a variant of the Page Rank algorithm, which measures the transitive influence of nodes. Page Rank follows the assumption that relationships originating from low-degree nodes have a higher influence than relationships from high-degree nodes. Article Rank lowers the influence of low-degree nodes by lowering the scores being sent to their neighbors in each iteration.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., 'MinMax', 'Mean', 'Max', 'Log', 'StdScore', 'Center', 'NONE')

    • A dictionary with scaler configuration (e.g., {'type': 'Log', 'offset': 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node ids to use as starting points. Can be:

    • A single node id (e.g., 42)

    • A list of node ids (e.g., [42, 43, 44])

    • A list of (node id, bias) tuples with bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

ArticleRankWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.ArticleRankMutateResult

Result of running ArticleRank algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.ArticleRankStatsResult

Result of running ArticleRank algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.ArticleRankWriteResult

Result of running ArticleRank algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.ArticulationPointsEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

An object containing the result of the estimation including memory requirements

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ArticulationPointsMutateResult

Runs the Articulation Points algorithm and stores the results in the graph catalog as a new node property.

Given a graph, an articulation point is a node whose removal increases the number of connected components in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the count of articulation points found

Return type:

ArticulationPointsMutateResult
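
The definition above (a node whose removal increases the number of connected components) can be checked directly on a small undirected graph. A brute-force illustration of the definition, not the library's implementation:

```python
def count_components(adj, excluded=None):
    """Count connected components of an undirected adjacency dict,
    optionally ignoring one excluded node."""
    excluded = excluded or set()
    seen, components = set(), 0
    for start in adj:
        if start in seen or start in excluded:
            continue
        components += 1
        stack = [start]
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            stack.extend(w for w in adj[v] if w not in seen and w not in excluded)
    return components

def articulation_points(adj):
    """Nodes whose removal increases the component count."""
    base = count_components(adj)
    return [v for v in adj if count_components(adj, excluded={v}) > base]
```

On the path graph a–b–c, removing the middle node splits the graph in two, so `b` is the only articulation point.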

abstract stats(G: GraphV2, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ArticulationPointsStatsResult

Runs the Articulation Points algorithm and returns result statistics without storing the results.

Given a graph, an articulation point is a node whose removal increases the number of connected components in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the count of articulation points found

Return type:

ArticulationPointsStatsResult

abstract stream(G: GraphV2, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the ArticulationPoints algorithm and returns results as a stream.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

A DataFrame containing articulation points with columns:

• nodeId: The ID of the articulation point

• resultingComponents: Information about the resulting components

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) ArticulationPointsWriteResult

Runs the Articulation Points algorithm and stores the result in the Neo4j database as a new node property.

Given a graph, an articulation point is a node whose removal increases the number of connected components in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the count of articulation points found

Return type:

ArticulationPointsWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.ArticulationPointsMutateResult

Result of running ArticulationPoints algorithm with mutate mode.

field articulation_point_count: int
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
pydantic model graphdatascience.procedure_surface.api.centrality.ArticulationPointsStatsResult

Result of running ArticulationPoints algorithm with stats mode.

field articulation_point_count: int
field compute_millis: int
field configuration: dict[str, Any]
pydantic model graphdatascience.procedure_surface.api.centrality.ArticulationPointsWriteResult

Result of running ArticulationPoints algorithm with write mode.

field articulation_point_count: int
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.BetweennessEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], sampling_size: int | None = None, sampling_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, relationship_weight_property: str | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • sampling_size (int | None, default=None) – The number of nodes to use for sampling.

  • sampling_seed (int | None, default=None) – The seed value for sampling randomization.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, sampling_size: int | None = None, sampling_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) BetweennessMutateResult

Runs the Betweenness Centrality algorithm and stores the results in the graph catalog as a new node property.

Betweenness centrality is a way of detecting the amount of influence a node has over the flow of information in a graph. It is often used to find nodes that serve as a bridge from one part of a graph to another. The algorithm calculates shortest paths between all pairs of nodes in a graph. Each node receives a score, based on the number of shortest paths that pass through the node.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • sampling_size (int | None, default=None) – Number of source nodes to consider for computing centrality scores.

  • sampling_seed (int | None, default=None) – Seed value for the random number generator that selects source nodes.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm metrics and statistics including centrality distribution

Return type:

BetweennessMutateResult
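
The score described above (the number of shortest paths that pass through a node) can be computed by brute force on a tiny unweighted, undirected graph. This naive sketch illustrates the definition only; it is not the optimized algorithm the library runs:

```python
from collections import deque
from itertools import permutations

def shortest_path_counts(adj, s):
    """BFS from s: distance and number of shortest paths to every reachable node."""
    dist, paths = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                paths[w] = paths[u]
                queue.append(w)
            elif dist[w] == dist[u] + 1:
                paths[w] += paths[u]
    return dist, paths

def betweenness(adj):
    """For each node v, sum over ordered pairs (s, t) the fraction of
    s-t shortest paths that pass through v (undirected adjacency dict)."""
    score = {v: 0.0 for v in adj}
    for s, t in permutations(adj, 2):
        dist_s, paths_s = shortest_path_counts(adj, s)
        dist_t, paths_t = shortest_path_counts(adj, t)
        if t not in dist_s:
            continue
        d = dist_s[t]
        for v in adj:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            # v lies on an s-t shortest path iff the distances add up.
            if dist_s[v] + dist_t[v] == d:
                score[v] += paths_s[v] * paths_t[v] / paths_s[t]
    return score
```

On the path graph a–b–c, every shortest path between the endpoints runs through the middle node, which therefore collects all the score.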

abstract stats(G: GraphV2, sampling_size: int | None = None, sampling_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) BetweennessStatsResult

Runs the Betweenness Centrality algorithm and returns result statistics without storing the results.

Betweenness centrality is a way of detecting the amount of influence a node has over the flow of information in a graph. It is often used to find nodes that serve as a bridge from one part of a graph to another. The algorithm calculates shortest paths between all pairs of nodes in a graph. Each node receives a score, based on the number of shortest paths that pass through the node.

Parameters:
  • G (GraphV2) – Graph object to use

  • sampling_size (int | None, default=None) – Number of source nodes to consider for computing centrality scores.

  • sampling_seed (int | None, default=None) – Seed value for the random number generator that selects source nodes.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm statistics including centrality distribution

Return type:

BetweennessStatsResult

abstract stream(G: GraphV2, sampling_size: int | None = None, sampling_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) DataFrame

Executes the Betweenness Centrality algorithm and returns the results as a stream.

Parameters:
  • G (GraphV2) – Graph object to use

  • sampling_size (int | None, default=None) – The number of nodes to use for sampling.

  • sampling_seed (int | None, default=None) – The seed value for sampling randomization.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

DataFrame with nodeId and score columns containing betweenness centrality results

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, sampling_size: int | None = None, sampling_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, write_concurrency: int | None = None) BetweennessWriteResult

Runs the Betweenness Centrality algorithm and stores the result in the Neo4j database as a new node property.

Betweenness centrality is a way of detecting the amount of influence a node has over the flow of information in a graph. It is often used to find nodes that serve as a bridge from one part of a graph to another. The algorithm calculates shortest paths between all pairs of nodes in a graph. Each node receives a score, based on the number of shortest paths that pass through the node.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • sampling_size (int | None, default=None) – Number of source nodes to consider for computing centrality scores.

  • sampling_seed (int | None, default=None) – Seed value for the random number generator that selects source nodes.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including centrality distribution

Return type:

BetweennessWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.BetweennessMutateResult

Result of running Betweenness Centrality algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.BetweennessStatsResult

Result of running Betweenness Centrality algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.BetweennessWriteResult

Result of running Betweenness Centrality algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.BridgesEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract stream(G: GraphV2, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the Bridges algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with ‘from’, ‘to’ and ‘remainingSizes’ columns. The remainingSizes column contains the sizes of the remaining connected components after removing the bridge relationship.

Return type:

DataFrame

class graphdatascience.procedure_surface.api.centrality.CelfEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], seed_set_size: int, propagation_probability: float = 0.1, monte_carlo_simulations: int = 100, random_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • seed_set_size (int) – The number of nodes to select as the seed set for influence maximization.

  • propagation_probability (float) – The probability that influence spreads from one node to another.

  • monte_carlo_simulations (int) – The number of Monte-Carlo simulations.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

An object containing the result of the estimation including memory requirements

Return type:

EstimationResult

abstract mutate(G: GraphV2, seed_set_size: int, mutate_property: str, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], monte_carlo_simulations: int = 100, propagation_probability: float = 0.1, random_seed: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) CelfMutateResult

Runs the CELF algorithm and stores the results in the graph catalog as a new node property.

The influence maximization problem asks for a set of k nodes that maximize the expected spread of influence in the network.

Parameters:
  • G (GraphV2) – Graph object to use

  • seed_set_size (int) – The number of nodes to select as the seed set for influence maximization

  • mutate_property (str) – Name of the node property to store the results in.

  • propagation_probability (float, default=0.1) – Probability of a node being activated by an active neighbour node.

  • monte_carlo_simulations (int, default=100) – Number of Monte-Carlo simulations.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the total influence spread

Return type:

CelfMutateResult

abstract stats(G: GraphV2, seed_set_size: int, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], monte_carlo_simulations: int = 100, propagation_probability: float = 0.1, random_seed: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) CelfStatsResult

Runs the CELF algorithm and returns result statistics without storing the results.

The influence maximization problem asks for a set of k nodes that maximize the expected spread of influence in the network.

Parameters:
  • G (GraphV2) – Graph object to use

  • seed_set_size (int) – The number of nodes to select as the seed set for influence maximization

  • propagation_probability (float, default=0.1) – Probability of a node being activated by an active neighbour node.

  • monte_carlo_simulations (int, default=100) – Number of Monte-Carlo simulations.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the total influence spread

Return type:

CelfStatsResult

abstract stream(G: GraphV2, seed_set_size: int, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], monte_carlo_simulations: int = 100, propagation_probability: float = 0.1, random_seed: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the CELF algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • seed_set_size (int) – The number of nodes to select as the seed set for influence maximization

  • propagation_probability (float, default=0.1) – The probability that influence spreads from one node to another.

  • monte_carlo_simulations (int, default=100) – The number of Monte-Carlo simulations.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with nodeId and spread columns containing CELF results. Each row represents a selected node with its corresponding influence spread value.

Return type:

DataFrame
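The spread column reported for each seed node can be understood through the quantity CELF optimizes: the expected number of nodes activated under a cascade model, estimated by Monte-Carlo simulation. The sketch below estimates that spread for a fixed seed set under the independent cascade model; it is an assumption-laden illustration (the `adj` graph and `estimate_spread` helper are hypothetical), not the CELF selection algorithm itself, which greedily grows the seed set with lazy evaluations.

```python
import random

def estimate_spread(adj: dict[int, list[int]], seeds: list[int],
                    propagation_probability: float = 0.1,
                    monte_carlo_simulations: int = 100,
                    random_seed: int = 42) -> float:
    """Monte-Carlo estimate of expected influence spread under the
    independent cascade model."""
    rng = random.Random(random_seed)
    total = 0
    for _ in range(monte_carlo_simulations):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v in adj.get(u, []):
                    # Each newly active node gets one chance to activate
                    # each inactive out-neighbour.
                    if v not in active and rng.random() < propagation_probability:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / monte_carlo_simulations

adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
spread = estimate_spread(adj, seeds=[0], propagation_probability=0.5)
# The seed itself is always active, so 1.0 <= spread <= 4.0.
```

Raising `propagation_probability` or `monte_carlo_simulations` trades accuracy of the estimate against runtime, which is why both appear as tuning parameters above.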

abstract write(G: GraphV2, seed_set_size: int, write_property: str, propagation_probability: float = 0.1, monte_carlo_simulations: int = 100, random_seed: int | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) CelfWriteResult

Runs the CELF algorithm and stores the result in the Neo4j database as a new node property.

The influence maximization problem asks for a set of k nodes that maximize the expected spread of influence in the network.

Parameters:
  • G (GraphV2) – Graph object to use

  • seed_set_size (int) – The number of nodes to select as the seed set for influence maximization

  • write_property (str) – Name of the node property to store the results in.

  • propagation_probability (float, default=0.1) – Probability of a node being activated by an active neighbour node.

  • monte_carlo_simulations (int, default=100) – Number of Monte-Carlo simulations.

  • random_seed (int | None) – Seed for random number generation to ensure reproducible results.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the total influence spread and write timing

Return type:

CelfWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.CelfMutateResult

Result of running CELF algorithm with mutate mode.

field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_count: int
field node_properties_written: int
field total_spread: float
pydantic model graphdatascience.procedure_surface.api.centrality.CelfStatsResult

Result of running CELF algorithm with stats mode.

field compute_millis: int
field configuration: dict[str, Any]
field node_count: int
field total_spread: float
pydantic model graphdatascience.procedure_surface.api.centrality.CelfWriteResult

Result of running CELF algorithm with write mode.

field compute_millis: int
field configuration: dict[str, Any]
field node_count: int
field node_properties_written: int
field total_spread: float
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.ClosenessEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], use_wasserman_faust: bool = False, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • use_wasserman_faust (bool, default=False) – Use the improved Wasserman-Faust formula for closeness computation.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

An object containing the result of the estimation

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, use_wasserman_faust: bool = False, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ClosenessMutateResult

Runs the Closeness Centrality algorithm and stores the results in the graph catalog as a new node property.

Closeness centrality is a way of detecting nodes that are able to spread information very efficiently through a graph. The closeness centrality of a node measures its average farness (inverse distance) to all other nodes. Nodes with a high closeness score have the shortest distances to all other nodes.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • use_wasserman_faust (bool) – Use the improved Wasserman-Faust formula for closeness computation.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

ClosenessMutateResult

abstract stats(G: GraphV2, use_wasserman_faust: bool = False, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ClosenessStatsResult

Runs the Closeness Centrality algorithm and returns result statistics without storing the results.

Closeness centrality is a way of detecting nodes that are able to spread information very efficiently through a graph. The closeness centrality of a node measures its average farness (inverse distance) to all other nodes. Nodes with a high closeness score have the shortest distances to all other nodes.

Parameters:
  • G (GraphV2) – Graph object to use

  • use_wasserman_faust (bool) – Use the improved Wasserman-Faust formula for closeness computation.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the centrality distribution

Return type:

ClosenessStatsResult

abstract stream(G: GraphV2, use_wasserman_faust: bool = False, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the Closeness Centrality algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • use_wasserman_faust (bool) – Use the improved Wasserman-Faust formula for closeness computation.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with nodeId and score columns containing closeness centrality results. Each row represents a node with its corresponding closeness centrality score.

Return type:

DataFrame
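The scores in the stream can be illustrated with a small pure-Python sketch of the two closeness formulas: the standard score divides the number of reachable nodes minus one by the sum of shortest-path distances (farness), and the Wasserman-Faust variant additionally scales by the fraction of the graph that is reachable, which keeps scores comparable on disconnected graphs. This is a sketch of the formulas only, not the GDS implementation.

```python
from collections import deque

def closeness(adj: dict[int, list[int]], use_wasserman_faust: bool = False) -> dict[int, float]:
    """Closeness centrality via per-node BFS on an unweighted graph."""
    n = len(adj)
    scores: dict[int, float] = {}
    for s in adj:
        # BFS shortest-path distances from s.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        farness = sum(dist.values())
        r = len(dist)  # reachable nodes, including s
        if farness == 0:
            scores[s] = 0.0
        elif use_wasserman_faust:
            # Scale by the fraction of reachable nodes.
            scores[s] = ((r - 1) / (n - 1)) * ((r - 1) / farness)
        else:
            scores[s] = (r - 1) / farness
    return scores

# Path graph 0 - 1 - 2: the middle node is closest to everything.
path = {0: [1], 1: [0, 2], 2: [1]}
print(closeness(path))  # node 1 scores 1.0, the endpoints 2/3
```

On a connected graph both formulas agree; they diverge only when some nodes are unreachable.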

abstract write(G: GraphV2, write_property: str, use_wasserman_faust: bool = False, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) ClosenessWriteResult

Runs the Closeness Centrality algorithm and stores the result in the Neo4j database as a new node property.

Closeness centrality is a way of detecting nodes that are able to spread information very efficiently through a graph. The closeness centrality of a node measures its average farness (inverse distance) to all other nodes. Nodes with a high closeness score have the shortest distances to all other nodes.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • use_wasserman_faust (bool, default=False) – Use the improved Wasserman-Faust formula for closeness computation.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the number of properties written

Return type:

ClosenessWriteResult

class graphdatascience.procedure_surface.api.centrality.ClosenessHarmonicEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of a Harmonic Closeness Centrality algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

An object containing the result of the estimation

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ClosenessHarmonicMutateResult

Runs the Harmonic Centrality algorithm and stores the results in the graph catalog as a new node property.

Harmonic centrality (also known as valued centrality) is a variant of closeness centrality, that was invented to solve the problem the original formula had when dealing with unconnected graphs.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

ClosenessHarmonicMutateResult

abstract stats(G: GraphV2, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ClosenessHarmonicStatsResult

Runs the Harmonic Centrality algorithm and returns result statistics without storing the results.

Harmonic centrality was proposed by Marchiori and Latora while trying to come up with a sensible notion of “average shortest path”. They suggested a different way of calculating the average distance from that used in the Closeness Centrality algorithm. Rather than summing the distances of a node to all other nodes, the harmonic centrality algorithm sums the inverse of those distances. This enables it to deal with infinite values.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the centrality distribution

Return type:

ClosenessHarmonicStatsResult

abstract stream(G: GraphV2, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the Harmonic Closeness Centrality algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with the algorithm results containing nodeId and score columns

Return type:

DataFrame
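Summing inverse distances, as described above, can be sketched in a few lines of pure Python. Unreachable nodes contribute an inverse distance of 0, which is how the formula sidesteps infinite distances; note the GDS score may additionally be normalized (e.g. by n − 1), so only the relative ordering is directly comparable to this sketch.

```python
from collections import deque

def harmonic(adj: dict[int, list[int]]) -> dict[int, float]:
    """Raw harmonic centrality: sum of 1/d over shortest-path
    distances d to every reachable node."""
    scores: dict[int, float] = {}
    for s in adj:
        # BFS shortest-path distances from s.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Unreachable nodes simply do not appear in dist, contributing 0.
        scores[s] = sum(1.0 / d for d in dist.values() if d > 0)
    return scores

# Path graph 0 - 1 - 2
path = {0: [1], 1: [0, 2], 2: [1]}
print(harmonic(path))  # {0: 1.5, 1: 2.0, 2: 1.5}
```

For node 0 the sum is 1/1 + 1/2 = 1.5; for the middle node it is 1/1 + 1/1 = 2.0, so the middle node again scores highest.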

abstract write(G: GraphV2, write_property: str, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) ClosenessHarmonicWriteResult

Runs the Harmonic Centrality algorithm and stores the result in the Neo4j database as a new node property.

Harmonic centrality was proposed by Marchiori and Latora while trying to come up with a sensible notion of “average shortest path”. They suggested a different way of calculating the average distance from that used in the Closeness Centrality algorithm. Rather than summing the distances of a node to all other nodes, the harmonic centrality algorithm sums the inverse of those distances. This enables it to deal with infinite values.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

ClosenessHarmonicWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessHarmonicMutateResult

Result of running Harmonic Closeness Centrality algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessHarmonicStatsResult

Result of running Harmonic Closeness Centrality algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessHarmonicWriteResult

Result of running Harmonic Closeness Centrality algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessMutateResult

Result of running Closeness Centrality algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessStatsResult

Result of running Closeness Centrality algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.ClosenessWriteResult

Result of running Closeness Centrality algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.DegreeEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], orientation: str = 'NATURAL', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, relationship_weight_property: str | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • orientation (str, default='NATURAL') – The orientation of relationships to consider. Can be ‘NATURAL’, ‘REVERSE’, or ‘UNDIRECTED’.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, orientation: str = 'NATURAL', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) DegreeMutateResult

Runs the Degree Centrality algorithm and stores the results in the graph catalog as a new node property.

The Degree Centrality algorithm can be used to find popular nodes within a graph. The degree centrality measures the number of incoming or outgoing (or both) relationships from a node, which can be defined by the orientation of a relationship projection. It can be applied to either weighted or unweighted graphs. In the weighted case the algorithm computes the sum of all positive weights of adjacent relationships of a node, for each node in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • orientation (str, default='NATURAL') – The orientation of relationships to consider. Can be ‘NATURAL’, ‘REVERSE’, or ‘UNDIRECTED’.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

DegreeMutateResult

abstract stats(G: GraphV2, orientation: str = 'NATURAL', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) DegreeStatsResult

Runs the Degree Centrality algorithm and returns result statistics without storing the results.

The Degree Centrality algorithm can be used to find popular nodes within a graph. The degree centrality measures the number of incoming or outgoing (or both) relationships from a node, which can be defined by the orientation of a relationship projection. It can be applied to either weighted or unweighted graphs. In the weighted case the algorithm computes the sum of all positive weights of adjacent relationships of a node, for each node in the graph.

Parameters:
  • G (GraphV2) – Graph object to use

  • orientation (str, default='NATURAL') – The orientation of relationships to consider. Can be ‘NATURAL’, ‘REVERSE’, or ‘UNDIRECTED’.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

Algorithm statistics including the centrality distribution

Return type:

DegreeStatsResult
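For intuition, the degree computation described above can be sketched in plain Python. This is a local illustration on a made-up relationship list, not the GDS implementation; the relationship data and function name are invented for the example:

```python
from collections import defaultdict

# (source, target, weight) relationships; a made-up toy graph
rels = [("a", "b", 2.0), ("a", "c", 1.0), ("b", "c", 3.0)]

def degree_centrality(rels, orientation="NATURAL", weighted=True):
    scores = defaultdict(float)
    for src, tgt, w in rels:
        value = w if weighted else 1.0
        if orientation in ("NATURAL", "UNDIRECTED"):
            scores[src] += value  # count/weigh outgoing relationships
        if orientation in ("REVERSE", "UNDIRECTED"):
            scores[tgt] += value  # count/weigh incoming relationships
    return dict(scores)

print(degree_centrality(rels))  # weighted out-degree: {'a': 3.0, 'b': 3.0}
print(degree_centrality(rels, orientation="UNDIRECTED", weighted=False))
```

Note how ‘NATURAL’ only credits the source of each relationship, ‘REVERSE’ only the target, and ‘UNDIRECTED’ both, matching the orientation parameter above.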

abstract stream(G: GraphV2, orientation: str = 'NATURAL', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None) DataFrame

Executes the Degree Centrality algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • orientation (str) – The orientation of relationships to consider. Can be ‘NATURAL’, ‘REVERSE’, or ‘UNDIRECTED’. ‘NATURAL’ (default) respects the direction of relationships as they are stored in the graph. ‘REVERSE’ treats each relationship as if it were directed in the opposite direction. ‘UNDIRECTED’ treats all relationships as undirected, effectively counting both directions.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

Returns:

DataFrame with nodeId and score columns containing degree centrality results. Each row represents a node with its corresponding degree centrality score.

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, orientation: str = 'NATURAL', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, write_concurrency: int | None = None) DegreeWriteResult

Runs the Degree Centrality algorithm and stores the result in the Neo4j database as a new node property.

The Degree Centrality algorithm can be used to find popular nodes within a graph. Degree centrality measures the number of incoming, outgoing, or both relationships of a node, depending on the orientation of the relationship projection. It can be applied to either weighted or unweighted graphs. In the weighted case, the algorithm computes, for each node, the sum of the positive weights of its adjacent relationships.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • orientation (str) – The orientation of relationships to consider. Can be ‘NATURAL’, ‘REVERSE’, or ‘UNDIRECTED’.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the centrality distribution and write timing

Return type:

DegreeWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.DegreeMutateResult

Result of running Degree Centrality algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.DegreeStatsResult

Result of running Degree Centrality algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field post_processing_millis: int
field pre_processing_millis: int
pydantic model graphdatascience.procedure_surface.api.centrality.DegreeWriteResult

Result of running Degree Centrality algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.EigenvectorEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], max_iterations: int = 20, tolerance: float = 1e-07, source_nodes: int | list[int] | None = None, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • max_iterations (int) – Maximum number of iterations to run.

  • tolerance (float) – Minimum change in scores between iterations.

  • source_nodes (int | list[int] | None, default=None) – Node IDs to use as starting points. Can be a single node ID (e.g., 42) or a list of node IDs (e.g., [42, 43, 44]).

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

Returns:

An object containing the result of the estimation

Return type:

EstimationResult
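For intuition about the scaler option: the sketch below shows MinMax scaling, which maps scores into [0, 1]. This is a local illustration, not the server-side scaler (configured with scaler='MinMax'), though that behaves analogously:

```python
def min_max_scale(scores):
    """Map a list of scores linearly onto [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]  # degenerate case: all scores equal
    return [(s - lo) / (hi - lo) for s in scores]

print(min_max_scale([1.0, 2.0, 5.0]))  # -> [0.0, 0.25, 1.0]
```

The other named scalers (‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’) apply different normalizations in the same post-processing step; ‘NONE’ leaves scores untouched.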

abstract mutate(G: GraphV2, mutate_property: str, max_iterations: int = 20, tolerance: float = 1e-07, source_nodes: int | list[int] | None = None, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) EigenvectorMutateResult

Runs the Eigenvector Centrality algorithm and stores the results in the graph catalog as a new node property.

Eigenvector Centrality is an algorithm that measures the transitive influence of nodes. Relationships originating from high-scoring nodes contribute more to the score of a node than connections from low-scoring nodes. A high eigenvector score means that a node is connected to many nodes that themselves have high scores. The algorithm computes the eigenvector associated with the largest absolute eigenvalue.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • max_iterations (int) – Maximum number of iterations to run.

  • tolerance (float) – Minimum change in scores between iterations.

  • source_nodes (int | list[int] | None, default=None) – Node IDs to use as starting points. Can be a single node ID (e.g., 42) or a list of node IDs (e.g., [42, 43, 44]).

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

EigenvectorMutateResult

abstract stats(G: GraphV2, max_iterations: int = 20, tolerance: float = 1e-07, source_nodes: int | list[int] | None = None, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) EigenvectorStatsResult

Runs the Eigenvector Centrality algorithm and returns result statistics without storing the results.

Eigenvector Centrality is an algorithm that measures the transitive influence of nodes. Relationships originating from high-scoring nodes contribute more to the score of a node than connections from low-scoring nodes. A high eigenvector score means that a node is connected to many nodes that themselves have high scores. The algorithm computes the eigenvector associated with the largest absolute eigenvalue.

Parameters:
  • G (GraphV2) – Graph object to use

  • max_iterations (int) – Maximum number of iterations to run.

  • tolerance (float) – Minimum change in scores between iterations.

  • source_nodes (int | list[int] | None, default=None) – Node IDs to use as starting points. Can be a single node ID (e.g., 42) or a list of node IDs (e.g., [42, 43, 44]).

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

Algorithm statistics including the centrality distribution

Return type:

EigenvectorStatsResult
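The power-iteration idea behind Eigenvector Centrality, together with the roles of max_iterations and tolerance, can be sketched locally. This is an illustration on a made-up adjacency list; the GDS implementation differs in details such as weighting and source-node handling:

```python
import math

adj = {"a": ["b"], "b": ["c"], "c": ["a", "b"]}  # made-up directed graph

def eigenvector_centrality(adj, max_iterations=20, tolerance=1e-7):
    scores = {n: 1.0 for n in adj}
    for _ in range(max_iterations):
        nxt = {n: 0.0 for n in adj}
        for src, targets in adj.items():
            for tgt in targets:
                nxt[tgt] += scores[src]  # influence flows along relationships
        norm = math.sqrt(sum(v * v for v in nxt.values()))
        nxt = {n: v / norm for n, v in nxt.items()}  # L2-normalize each round
        if max(abs(nxt[n] - scores[n]) for n in adj) < tolerance:
            return nxt, True  # change fell below tolerance: converged
        scores = nxt
    return scores, False  # max_iterations exhausted without converging

scores, converged = eigenvector_centrality(adj, max_iterations=100)
```

Repeated multiplication by the adjacency structure converges to the eigenvector of the largest absolute eigenvalue, which is what the did_converge and ran_iterations result fields report on.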

abstract stream(G: GraphV2, max_iterations: int = 20, tolerance: float = 1e-07, source_nodes: int | list[int] | None = None, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame

Executes the Eigenvector Centrality algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • max_iterations (int) – Maximum number of iterations to run.

  • tolerance (float) – Minimum change in scores between iterations.

  • source_nodes (int | list[int] | None, default=None) – Node IDs to use as starting points. Can be a single node ID (e.g., 42) or a list of node IDs (e.g., [42, 43, 44]).

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

Returns:

DataFrame with the algorithm results containing nodeId and score columns

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, max_iterations: int = 20, tolerance: float = 1e-07, source_nodes: int | list[int] | None = None, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) EigenvectorWriteResult

Runs the Eigenvector Centrality algorithm and stores the result in the Neo4j database as a new node property.

Eigenvector Centrality is an algorithm that measures the transitive influence of nodes. Relationships originating from high-scoring nodes contribute more to the score of a node than connections from low-scoring nodes. A high eigenvector score means that a node is connected to many nodes that themselves have high scores. The algorithm computes the eigenvector associated with the largest absolute eigenvalue.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • max_iterations (int) – Maximum number of iterations to run.

  • tolerance (float) – Minimum change in scores between iterations.

  • source_nodes (int | list[int] | None, default=None) – Node IDs to use as starting points. Can be a single node ID (e.g., 42) or a list of node IDs (e.g., [42, 43, 44]).

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics including the centrality distribution

Return type:

EigenvectorWriteResult

pydantic model graphdatascience.procedure_surface.api.centrality.EigenvectorMutateResult

Result of running Eigenvector Centrality algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.EigenvectorStatsResult

Result of running Eigenvector Centrality algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.EigenvectorWriteResult

Result of running Eigenvector Centrality algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
field write_millis: int
class graphdatascience.procedure_surface.api.centrality.PageRankEndpoints
abstract estimate(G: GraphV2 | dict[str, Any], damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) EstimationResult

Estimate the memory consumption of an algorithm run.

Parameters:
  • G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.

  • damping_factor (float) – The damping factor controls the probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • concurrency (int | None) – Number of concurrent threads to use.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) – Node IDs to use as starting points for personalized PageRank. Can be a single node ID (e.g., 42), a list of node IDs (e.g., [42, 43, 44]), or a list of tuples associating each node with a bias > 0 (e.g., [(42, 0.5), (43, 1.0)]).

Returns:

Memory estimation details

Return type:

EstimationResult

abstract mutate(G: GraphV2, mutate_property: str, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) PageRankMutateResult

Runs the PageRank algorithm and stores the results in the graph catalog as a new node property.

The PageRank algorithm measures the importance of each node within the graph, based on the number of incoming relationships and the importance of the corresponding source nodes. The underlying assumption, roughly speaking, is that a page is only as important as the pages that link to it.

Parameters:
  • G (GraphV2) – Graph object to use

  • mutate_property (str) – Name of the node property to store the results in.

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node IDs to use as starting points. Can be:

    • a single node ID (e.g., 42)

    • a list of node IDs (e.g., [42, 43, 44])

    • a list of tuples to associate each node with a bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

Algorithm metrics and statistics

Return type:

PageRankMutateResult

abstract stats(G: GraphV2, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) PageRankStatsResult

Runs the PageRank algorithm and returns result statistics without storing the results.

The PageRank algorithm measures the importance of each node within the graph, based on the number of incoming relationships and the importance of the corresponding source nodes. The underlying assumption, roughly speaking, is that a page is only as important as the pages that link to it.

Parameters:
  • G (GraphV2) – Graph object to use

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node IDs to use as starting points. Can be:

    • a single node ID (e.g., 42)

    • a list of node IDs (e.g., [42, 43, 44])

    • a list of tuples to associate each node with a bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

Returns:

Algorithm statistics

Return type:

PageRankStatsResult
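The PageRank iteration and the roles of damping_factor, tolerance, and max_iterations can be sketched in plain Python. Illustrative only: the graph is made up and has no dangling nodes, which the real implementation must also handle:

```python
adj = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # made-up directed graph

def pagerank(adj, damping_factor=0.85, tolerance=1e-7, max_iterations=20):
    n = len(adj)
    ranks = {node: 1.0 / n for node in adj}
    for _ in range(max_iterations):
        # (1 - d)/n is the "random jump" share every node receives
        nxt = {node: (1.0 - damping_factor) / n for node in adj}
        for src, targets in adj.items():
            share = damping_factor * ranks[src] / len(targets)
            for tgt in targets:
                nxt[tgt] += share  # importance flows to linked nodes
        if max(abs(nxt[node] - ranks[node]) for node in adj) < tolerance:
            return nxt, True  # change fell below tolerance: converged
        ranks = nxt
    return ranks, False

ranks, converged = pagerank(adj, max_iterations=200)
```

A higher damping_factor follows links more and jumps less; a looser tolerance or smaller max_iterations stops the iteration earlier, which the did_converge and ran_iterations result fields report on.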

abstract stream(G: GraphV2, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None) DataFrame

Executes the PageRank algorithm and returns a stream of results.

Parameters:
  • G (GraphV2) – Graph object to use

  • damping_factor (float) – The damping factor controls the probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) – Node IDs to use as starting points for personalized PageRank. Can be a single node ID (e.g., 42), a list of node IDs (e.g., [42, 43, 44]), or a list of tuples associating each node with a bias > 0 (e.g., [(42, 0.5), (43, 1.0)]).

Returns:

DataFrame with node IDs and their PageRank scores

Return type:

DataFrame

abstract write(G: GraphV2, write_property: str, damping_factor: float = 0.85, tolerance: float = 1e-07, max_iterations: int = 20, scaler: str | dict[str, str | int | float] | ScalerConfig = 'NONE', relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, relationship_weight_property: str | None = None, source_nodes: int | list[int] | list[tuple[int, float]] | None = None, write_concurrency: int | None = None) PageRankWriteResult

Runs the PageRank algorithm and stores the result in the Neo4j database as a new node property.

The PageRank algorithm measures the importance of each node within the graph, based on the number of incoming relationships and the importance of the corresponding source nodes. The underlying assumption, roughly speaking, is that a page is only as important as the pages that link to it.

Parameters:
  • G (GraphV2) – Graph object to use

  • write_property (str) – Name of the node property to store the results in.

  • damping_factor (float) – Probability of a jump to a random node.

  • tolerance (float) – Minimum change in scores between iterations.

  • max_iterations (int) – Maximum number of iterations to run.

  • scaler (str | dict[str, str | int | float] | ScalerConfig) –

    The scaler to use. Can be:

    • A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)

    • A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})

  • relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.

  • node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.

  • sudo (bool) – Disable the memory guard.

  • log_progress (bool) – Display progress logging.

  • username (str | None) – As an administrator, impersonate a different user for accessing their graphs.

  • concurrency (int | None) – Number of concurrent threads to use.

  • job_id (str | None) – Identifier for the computation.

  • relationship_weight_property (str | None) – Name of the property to be used as weights.

  • source_nodes (int | list[int] | list[tuple[int, float]] | None, default=None) –

    Node IDs to use as starting points. Can be:

    • a single node ID (e.g., 42)

    • a list of node IDs (e.g., [42, 43, 44])

    • a list of tuples to associate each node with a bias > 0 (e.g., [(42, 0.5), (43, 1.0)])

  • write_concurrency (int | None) – Number of concurrent threads to use for writing.

Returns:

Algorithm metrics and statistics

Return type:

PageRankWriteResult
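The three accepted shapes for source_nodes, shown as plain Python literals (the node IDs here are arbitrary examples):

```python
# One starting node
single = 42

# A list of starting nodes
several = [42, 43, 44]

# Each starting node paired with a bias; every bias must be > 0
biased = [(42, 0.5), (43, 1.0)]
```

Any of these values can be passed directly as the source_nodes argument to the estimate, mutate, stats, stream, or write methods above.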

pydantic model graphdatascience.procedure_surface.api.centrality.PageRankMutateResult

Result of running PageRank algorithm with mutate mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field mutate_millis: int
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.PageRankStatsResult

Result of running PageRank algorithm with stats mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
pydantic model graphdatascience.procedure_surface.api.centrality.PageRankWriteResult

Result of running PageRank algorithm with write mode.

field centrality_distribution: dict[str, Any]
field compute_millis: int
field configuration: dict[str, Any]
field did_converge: bool
field node_properties_written: int
field post_processing_millis: int
field pre_processing_millis: int
field ran_iterations: int
field write_millis: int