Catalog Endpoints¶
- enum graphdatascience.procedure_surface.api.catalog.Aggregation(value)¶
- Member Type:
Valid values are as follows:
- NONE = <Aggregation.NONE: 'NONE'>¶
- SINGLE = <Aggregation.SINGLE: 'SINGLE'>¶
- SUM = <Aggregation.SUM: 'SUM'>¶
- MIN = <Aggregation.MIN: 'MIN'>¶
- MAX = <Aggregation.MAX: 'MAX'>¶
- COUNT = <Aggregation.COUNT: 'COUNT'>¶
The Enum and its members also have the following methods:
- __new__(value)¶
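The value-based lookup suggested by the member listing above can be illustrated with a local stand-in enum. This sketch does not import the library: the class below only mirrors the member names and values shown in the listing and assumes standard `enum.Enum` behaviour.

```python
from enum import Enum


class Aggregation(Enum):
    """Local stand-in mirroring the members listed above (not the library class)."""

    NONE = "NONE"
    SINGLE = "SINGLE"
    SUM = "SUM"
    MIN = "MIN"
    MAX = "MAX"
    COUNT = "COUNT"


# Look a member up by its string value, as Aggregation(value) suggests.
by_value = Aggregation("SUM")

# Members expose .name and .value; in this listing both are the same string.
member_names = [member.name for member in Aggregation]
```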
- class graphdatascience.procedure_surface.api.catalog.CatalogEndpoints¶
- abstract construct(graph_name: str, nodes: DataFrame | list[DataFrame], relationships: DataFrame | list[DataFrame] | None = None, concurrency: int | None = None, undirected_relationship_types: list[str] | None = None) GraphV2¶
Construct a graph from a list of node and relationship dataframes.
- Parameters:
graph_name (str) – Name of the graph to construct
nodes (DataFrame | list[DataFrame]) –
Node dataframes. A dataframe should follow the schema:
nodeId to uniquely identify the node across all dataframes
labels to specify the labels of the node as a list of strings (optional)
other columns are treated as node properties
relationships (DataFrame | list[DataFrame] | None) –
Relationship dataframes. A dataframe should follow the schema:
sourceNodeId to identify the start node of the relationship
targetNodeId to identify the end node of the relationship
relationshipType to specify the type of the relationship (optional)
other columns are treated as relationship properties
concurrency (int | None) – Number of concurrent threads to use.
undirected_relationship_types (list[str] | None) – List of relationship types to treat as undirected.
- Returns:
Constructed graph object.
- Return type:
GraphV2
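The column schema described for `construct` can be checked before any data is sent. The following is a minimal sketch that uses plain dicts of column lists as a stand-in for DataFrames; the helper `property_columns` and the sample column names `age` and `since` are hypothetical and not part of the library.

```python
NODE_REQUIRED = {"nodeId"}
NODE_OPTIONAL = {"labels"}
REL_REQUIRED = {"sourceNodeId", "targetNodeId"}
REL_OPTIONAL = {"relationshipType"}


def property_columns(columns, required, optional):
    """Return the columns that would be treated as properties.

    Raises ValueError if a required column is missing.
    """
    missing = required - set(columns)
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    reserved = required | optional
    return [c for c in columns if c not in reserved]


# Plain dicts of column -> values stand in for DataFrames here.
nodes = {
    "nodeId": [0, 1, 2],
    "labels": [["Person"], ["Person"], ["City"]],
    "age": [31, 45, None],
}
relationships = {
    "sourceNodeId": [0, 1],
    "targetNodeId": [1, 2],
    "relationshipType": ["KNOWS", "LIVES_IN"],
    "since": [2019, 2021],
}

node_props = property_columns(nodes, NODE_REQUIRED, NODE_OPTIONAL)
rel_props = property_columns(relationships, REL_REQUIRED, REL_OPTIONAL)
```

With real pandas DataFrames the same check could be run on `list(df.columns)`.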
- property datasets: DatasetEndpoints¶
Endpoints for loading predefined datasets into the graph catalog.
- abstract drop(G: GraphV2 | str, fail_if_missing: bool = True) GraphInfo | None¶
Drop a graph from the graph catalog.
- abstract filter(G: GraphV2, graph_name: str, node_filter: str, relationship_filter: str, concurrency: int | None = None, job_id: str | None = None) GraphWithFilterResult¶
Create a subgraph of a graph based on a filter expression.
- Parameters:
G (GraphV2) – Graph object to use
graph_name (str) – Name of subgraph to create
node_filter (str) – Filter expression for nodes
relationship_filter (str) – Filter expression for relationships
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
- Returns:
Tuple of the filtered graph object and a result object with information such as graph name, node count, and relationship count.
- Return type:
GraphWithFilterResult
- abstract generate(graph_name: str, node_count: int, average_degree: float, *, relationship_distribution: str | None = None, relationship_seed: int | None = None, relationship_property: RelationshipPropertySpec | None = None, orientation: str | None = None, allow_self_loops: bool | None = None, read_concurrency: int | None = None, job_id: str | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None) GraphWithGenerationStats¶
Generates a random graph and stores it in the graph catalog.
- Parameters:
graph_name (str) – Name of the generated graph.
node_count (int) – The number of nodes in the generated graph
average_degree (float) – The average out-degree of the generated nodes
relationship_distribution (str | None, default=None) – Determines the relationship distribution strategy.
relationship_seed (int | None, default=None) – Seed value for generating deterministic relationships.
relationship_property (RelationshipPropertySpec | None, default=None) – Configure generated relationship properties.
orientation (str | None, default=None) – Specifies the orientation of the generated relationships.
allow_self_loops (bool | None, default=None) – Whether nodes in the graph can have relationships where start and end nodes are the same.
read_concurrency (int | None, default=None) – Number of concurrent threads/processes to use during graph generation.
job_id (str | None) – Identifier for the computation.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
tuple of the generated graph object and the result object containing stats about the generation.
- Return type:
GraphWithGenerationStats
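To make the roles of `node_count`, `average_degree`, `relationship_seed`, and `allow_self_loops` concrete, here is a toy generator in plain Python. It is only an illustration of the parameters under a uniform target choice; the function name `generate_uniform` is hypothetical and the library's actual generation strategy is not described in this reference.

```python
import random


def generate_uniform(node_count, average_degree, seed=None, allow_self_loops=False):
    """Toy generator: every node gets round(average_degree) outgoing
    relationships to targets chosen uniformly at random."""
    out_degree = round(average_degree)
    if out_degree > 0 and node_count < 2 and not allow_self_loops:
        raise ValueError("need at least two nodes when self-loops are disallowed")
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    relationships = []
    for source in range(node_count):
        for _ in range(out_degree):
            target = rng.randrange(node_count)
            # Redraw until the target differs from the source if needed.
            while not allow_self_loops and target == source:
                target = rng.randrange(node_count)
            relationships.append((source, target))
    return relationships


first = generate_uniform(10, 3.0, seed=42)
second = generate_uniform(10, 3.0, seed=42)
```

Running the function twice with the same seed yields the same relationship list, which is the behaviour the seed parameter is meant to convey.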
- abstract list(G: GraphV2 | str | None = None) list[GraphInfoWithDegrees]¶
List graphs in the graph catalog.
- abstract property node_labels: NodeLabelEndpoints¶
Endpoints for node label operations.
- abstract property node_properties: NodePropertiesEndpoints¶
Endpoints for node property operations.
- abstract property relationships: RelationshipsEndpoints¶
Endpoints for relationship operations.
- abstract property sample: GraphSamplingEndpoints¶
Endpoints for graph sampling.
- class graphdatascience.procedure_surface.api.catalog.GraphBackend¶
- pydantic model graphdatascience.procedure_surface.api.catalog.GraphFilterResult¶
- pydantic model graphdatascience.procedure_surface.api.catalog.GraphGenerationStats¶
- Validators:
check_empty_property»relationship_property
- field relationship_property: RelationshipPropertySpec | None¶
- Validated by:
check_empty_property
- pydantic model graphdatascience.procedure_surface.api.catalog.GraphInfo¶
- Validators:
strip_timezone»creation_time
strip_timezone»modification_time
- field creation_time: datetime¶
- Validated by:
strip_timezone
- field modification_time: datetime¶
- Validated by:
strip_timezone
- pydantic model graphdatascience.procedure_surface.api.catalog.GraphInfoWithDegrees¶
- Validators:
- class graphdatascience.procedure_surface.api.catalog.GraphSamplingEndpoints¶
- abstract cnarw(G: GraphV2, graph_name: str, start_nodes: list[int] | None = None, restart_probability: float = 0.1, sampling_ratio: float = 0.15, node_label_stratification: bool = False, relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) GraphWithSamplingResult¶
Common Neighbour Aware Random Walk (CNARW) samples the graph by taking random walks from a set of start nodes.
CNARW is a graph sampling technique that involves optimizing the selection of the next-hop node. It takes into account the number of common neighbours between the current node and the next-hop candidates. On each step of a random walk, there is a probability that the walk stops, and a new walk from one of the start nodes starts instead (i.e. the walk restarts). Each node visited on these walks will be part of the sampled subgraph. The resulting subgraph is stored as a new graph in the Graph Catalog.
- Parameters:
G (GraphV2) – Graph object to use
graph_name (str) – The name of the new graph that is stored in the graph catalog.
start_nodes (list of int, optional) – IDs of the initial set of nodes in the original graph from which the sampling random walks will start. By default, a single node is chosen uniformly at random.
restart_probability (float, optional) – The probability that a sampling random walk restarts from one of the start nodes. Default is 0.1.
sampling_ratio (float, optional) – The fraction of nodes in the original graph to be sampled. Default is 0.15.
node_label_stratification (bool, optional) – If true, preserves the node label distribution of the original graph. Default is False.
relationship_weight_property (str | None) – Name of the property to be used as weights.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
- Returns:
tuple of the graph object and the result of the Common Neighbour Aware Random Walk (CNARW), including the dimensions of the sampled graph.
- Return type:
GraphWithSamplingResult
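The idea of taking common neighbours into account when choosing the next hop can be sketched as a per-candidate weight. The weighting used below, `1 - common / min(degree)`, is an assumption chosen for illustration; this reference does not state the formula the library uses, and the function name `next_hop_weights` is hypothetical.

```python
def next_hop_weights(adjacency, current):
    """Weight each neighbour of `current`: fewer shared neighbours, higher weight.

    `adjacency` maps a node to the set of its neighbours (undirected graph).
    """
    current_neighbours = adjacency[current]
    weights = {}
    for candidate in current_neighbours:
        candidate_neighbours = adjacency[candidate]
        common = len(current_neighbours & candidate_neighbours)
        smaller_degree = min(len(current_neighbours), len(candidate_neighbours))
        # Assumed weighting, not taken from the library.
        weights[candidate] = 1.0 - common / smaller_degree
    return weights


# Nodes 0, 1, 2 form a triangle; node 3 leads away from it towards node 4.
adjacency = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
weights = next_hop_weights(adjacency, 0)
```

Under this weighting the walk from node 0 favours node 3, which shares no neighbours with it, over the two triangle members.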
- abstract estimate(G: GraphV2, start_nodes: list[int] | None = None, restart_probability: float = 0.1, sampling_ratio: float = 0.15, node_label_stratification: bool = False, relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult¶
Estimate the memory consumption of a CNARW algorithm run.
- Parameters:
G (GraphV2) – Graph object to use
start_nodes (list of int, optional) – IDs of the initial set of nodes in the original graph from which the sampling random walks will start. By default, a single node is chosen uniformly at random.
restart_probability (float, optional) – The probability that a sampling random walk restarts from one of the start nodes. Default is 0.1.
sampling_ratio (float, optional) – The fraction of nodes in the original graph to be sampled. Default is 0.15.
node_label_stratification (bool, optional) – If true, preserves the node label distribution of the original graph. Default is False.
relationship_weight_property (str | None) – Name of the property to be used as weights.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
concurrency (int | None) – Number of concurrent threads to use.
- Returns:
Memory estimation details
- Return type:
EstimationResult
- abstract rwr(G: GraphV2, graph_name: str, start_nodes: list[int] | None = None, restart_probability: float = 0.1, sampling_ratio: float = 0.15, node_label_stratification: bool = False, relationship_weight_property: str | None = None, relationship_types: list[str] = ['*'], node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) GraphWithSamplingResult¶
Random walk with restarts (RWR) samples the graph by taking random walks from a set of start nodes.
On each step of a random walk, there is a probability that the walk stops, and a new walk from one of the start nodes starts instead (i.e. the walk restarts). Each node visited on these walks will be part of the sampled subgraph. The resulting subgraph is stored as a new graph in the Graph Catalog.
- Parameters:
G (GraphV2) – Graph object to use
graph_name (str) – The name of the new graph that is stored in the graph catalog.
start_nodes (list of int, optional) – IDs of the initial set of nodes in the original graph from which the sampling random walks will start. By default, a single node is chosen uniformly at random.
restart_probability (float, optional) – The probability that a sampling random walk restarts from one of the start nodes. Default is 0.1.
sampling_ratio (float, optional) – The fraction of nodes in the original graph to be sampled. Default is 0.15.
node_label_stratification (bool, optional) – If true, preserves the node label distribution of the original graph. Default is False.
relationship_weight_property (str | None) – Name of the property to be used as weights.
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
- Returns:
tuple of the graph object and the result of the Random Walk with Restart (RWR), including the dimensions of the sampled graph.
- Return type:
GraphWithSamplingResult
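The random-walk-with-restarts procedure described above can be sketched in plain Python on an adjacency dict. This is a simplified illustration, not the library's implementation: the function name `rwr_sample`, the `max_steps` safety limit, and the rule of stopping once `sampling_ratio` of the nodes has been visited are assumptions.

```python
import random


def rwr_sample(adjacency, start_nodes, restart_probability=0.1,
               sampling_ratio=0.15, seed=None, max_steps=10_000):
    """Collect nodes visited by random walks that restart with a fixed probability."""
    rng = random.Random(seed)
    target_size = max(1, round(sampling_ratio * len(adjacency)))
    current = rng.choice(start_nodes)
    sampled = {current}
    for _ in range(max_steps):  # safety limit so the sketch always terminates
        if len(sampled) >= target_size:
            break
        neighbours = adjacency[current]
        if not neighbours or rng.random() < restart_probability:
            # The walk stops and a new one begins at a start node.
            current = rng.choice(start_nodes)
        else:
            current = rng.choice(neighbours)
        sampled.add(current)
    return sampled


# A ring of 20 nodes, each connected to its two neighbours.
ring = {i: [(i + 1) % 20, (i - 1) % 20] for i in range(20)}
sample = rwr_sample(ring, start_nodes=[0], sampling_ratio=0.25, seed=7)
```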
- pydantic model graphdatascience.procedure_surface.api.catalog.GraphSamplingResult¶
- class graphdatascience.procedure_surface.api.catalog.GraphWithFilterResult¶
GraphWithFilterResult(graph, result)
- static __new__(_cls, graph: GraphV2, result: GraphFilterResult)¶
Create new instance of GraphWithFilterResult(graph, result)
- Parameters:
graph (GraphV2)
result (GraphFilterResult)
- result: GraphFilterResult¶
Alias for field number 1
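The "Alias for field number" wording indicates a named-tuple layout, so the pair can be unpacked or accessed by name. A minimal sketch with a hypothetical stand-in type follows; the placeholder field types and the `nodeCount` key are illustrative only.

```python
from typing import NamedTuple


class GraphWithResult(NamedTuple):
    """Hypothetical stand-in with the same (graph, result) field layout."""

    graph: str    # placeholder for a GraphV2 object
    result: dict  # placeholder for a GraphFilterResult model


pair = GraphWithResult(graph="my-subgraph", result={"nodeCount": 3})

# Tuple unpacking and attribute access refer to the same two fields.
G, result = pair
```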
- class graphdatascience.procedure_surface.api.catalog.GraphWithGenerationStats¶
GraphWithGenerationStats(graph, result)
- static __new__(_cls, graph: GraphV2, result: GraphGenerationStats)¶
Create new instance of GraphWithGenerationStats(graph, result)
- Parameters:
graph (GraphV2)
result (GraphGenerationStats)
- result: GraphGenerationStats¶
Alias for field number 1
- class graphdatascience.procedure_surface.api.catalog.GraphWithSamplingResult¶
GraphWithSamplingResult(graph, result)
- static __new__(_cls, graph: GraphV2, result: GraphSamplingResult)¶
Create new instance of GraphWithSamplingResult(graph, result)
- Parameters:
graph (GraphV2)
result (GraphSamplingResult)
- result: GraphSamplingResult¶
Alias for field number 1
- class graphdatascience.procedure_surface.api.catalog.NodeLabelEndpoints¶
- abstract mutate(G: GraphV2, node_label: str, *, node_filter: str, sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, write_concurrency: int | None = None, job_id: str | None = None) NodeLabelMutateResult¶
Attaches the specified node label to the filtered nodes in the graph.
- Parameters:
G (GraphV2) – Graph object to use
node_label (str) – The node label to write back.
node_filter (str) – A Cypher predicate for filtering nodes in the input graph.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
job_id (str | None) – Identifier for the computation.
- Returns:
Execution metrics and statistics
- Return type:
NodeLabelMutateResult
- abstract write(G: GraphV2, node_label: str, *, node_filter: str, sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, write_concurrency: int | None = None, job_id: str | None = None) NodeLabelWriteResult¶
Writes the specified node label to the filtered nodes in the database.
- Parameters:
G (GraphV2) – Graph object to use
node_label (str) – The node label to write back.
node_filter (str) – A Cypher predicate for filtering nodes in the input graph.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
job_id (str | None) – Identifier for the computation.
- Returns:
Execution metrics and statistics
- Return type:
NodeLabelWriteResult
- pydantic model graphdatascience.procedure_surface.api.catalog.NodeLabelMutateResult¶
- pydantic model graphdatascience.procedure_surface.api.catalog.NodeLabelPersistenceResult¶
- pydantic model graphdatascience.procedure_surface.api.catalog.NodeLabelWriteResult¶
- pydantic model graphdatascience.procedure_surface.api.catalog.NodePropertiesDropResult¶
- class graphdatascience.procedure_surface.api.catalog.NodePropertiesEndpoints¶
- abstract drop(G: GraphV2, node_properties: list[str], *, fail_if_missing: bool | None = None, concurrency: int | None = None, username: str | None = None) NodePropertiesDropResult¶
Drops the specified node properties from the graph.
- Parameters:
- Returns:
Execution metrics and statistics
- Return type:
NodePropertiesDropResult
- abstract stream(G: GraphV2, node_properties: str | list[str], *, list_node_labels: bool | None = None, node_labels: list[str] = ['*'], concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, job_id: str | None = None, db_node_properties: list[str] | None = None) DataFrame¶
Streams the specified node properties from the graph.
- Parameters:
G (GraphV2) – Graph object to use
node_properties (str | list[str]) – The node properties to stream
list_node_labels (boolean | None, default=None) – Whether to include node labels in the stream
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
concurrency (int | None) – Number of concurrent threads to use.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
job_id (str | None) – Identifier for the computation.
db_node_properties (list[str] | None, default=None) – Retrieves additional node properties from the database and attaches them to the stream.
- Returns:
The streamed node properties
- Return type:
DataFrame
- abstract write(G: GraphV2, node_properties: str | list[str] | dict[str, str], *, node_labels: list[str] = ['*'], concurrency: int | None = None, write_concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, job_id: str | None = None) NodePropertiesWriteResult¶
Writes the specified node properties from the graph to the database.
- Parameters:
G (GraphV2) – Graph object to use
node_properties (str | list[str] | dict[str, str]) –
Node properties to write. Can be:
A string representing a single property name.
A list of strings representing multiple property names.
A dictionary mapping from property names in the GDS graph to property names in the database.
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
concurrency (int | None) – Number of concurrent threads to use.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
job_id (str | None) – Identifier for the computation.
- Returns:
Execution metrics and statistics
- Return type:
NodePropertiesWriteResult
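The three accepted shapes of `node_properties` can all be reduced to one mapping from graph property name to database property name. The helper below is a hypothetical sketch of that normalisation; how the library handles it internally is not shown in this reference.

```python
def to_property_mapping(node_properties):
    """Normalise str | list[str] | dict[str, str] into {graph_name: db_name}."""
    if isinstance(node_properties, str):
        return {node_properties: node_properties}
    if isinstance(node_properties, dict):
        return dict(node_properties)
    # A list keeps each property under its own name in the database.
    return {name: name for name in node_properties}


single = to_property_mapping("pagerank")
several = to_property_mapping(["pagerank", "community"])
renamed = to_property_mapping({"pagerank": "score"})
```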
- pydantic model graphdatascience.procedure_surface.api.catalog.NodePropertiesWriteResult¶
- class graphdatascience.procedure_surface.api.catalog.NodePropertySpec¶
NodePropertySpec(node_properties: 'str | list[str] | dict[str, str]') -> 'None'
- pydantic model graphdatascience.procedure_surface.api.catalog.RelationshipPropertySpec¶
- static fixed(name: str, value: float) RelationshipPropertySpec¶
- Parameters:
- Return type:
RelationshipPropertySpec
- static random(name: str, min: float, max: float) RelationshipPropertySpec¶
- Parameters:
- Return type:
RelationshipPropertySpec
- pydantic model graphdatascience.procedure_surface.api.catalog.RelationshipsDropResult¶
- class graphdatascience.procedure_surface.api.catalog.RelationshipsEndpoints¶
- abstract collapse_path(G: GraphV2, path_templates: list[list[str]], mutate_relationship_type: str, *, allow_self_loops: bool = False, concurrency: int | None = None, job_id: str | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None) CollapsePathResult¶
Collapse each existing path in the graph into a single relationship.
- Parameters:
G (GraphV2) – Graph object to use
path_templates (list[list[str]]) – A path template is an ordered list of relationship types used for the traversal. The same relationship type can appear multiple times, in which case it is traversed once per occurrence. Several path templates can be specified to process them in one go.
mutate_relationship_type (str) – Name of the relationship type to store the results in.
allow_self_loops (bool, default=False) – Whether nodes in the graph can have relationships where start and end nodes are the same.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
Metadata about the generated relationships.
- Return type:
CollapsePathResult
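How a path template turns multi-hop paths into single relationships can be sketched in plain Python. The function `collapse_paths`, the in-memory relationship format, and the sample data are hypothetical; this illustrates the idea rather than the library's algorithm.

```python
def collapse_paths(relationships, path_templates, allow_self_loops=False):
    """Return (start, end) pairs for every path matching a template.

    `relationships` maps a relationship type to a list of (source, target) pairs.
    """
    collapsed = set()
    for template in path_templates:
        if not template:
            continue
        # Each frontier entry is (start node, node reached so far).
        frontier = {(s, s) for s, _ in relationships.get(template[0], [])}
        for rel_type in template:
            step = relationships.get(rel_type, [])
            frontier = {
                (start, target)
                for start, current in frontier
                for source, target in step
                if source == current
            }
        for start, end in frontier:
            if allow_self_loops or start != end:
                collapsed.add((start, end))
    return collapsed


rels = {
    "TOOK": [("Dan", "Insulin"), ("Ann", "Insulin")],
    "TAKEN_BY": [("Insulin", "Dan"), ("Insulin", "Ann")],
}
same_drug = collapse_paths(rels, [["TOOK", "TAKEN_BY"]])
with_loops = collapse_paths(rels, [["TOOK", "TAKEN_BY"]], allow_self_loops=True)
```

With self-loops disallowed, the paths that lead from a person back to the same person are dropped.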
- abstract drop(G: GraphV2, relationship_type: str, *, fail_if_missing: bool = True) RelationshipsDropResult¶
Drops all relationships of the specified relationship type, including all their properties, from the graph.
- Parameters:
- Returns:
Execution metrics and statistics
- Return type:
RelationshipsDropResult
- abstract index_inverse(G: GraphV2, relationship_types: list[str], *, concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, job_id: str | None = None) RelationshipsInverseIndexResult¶
Creates an index of the specified relationships that indexes the reverse direction of each relationship. Some algorithms can use this index to speed up computation.
- Parameters:
G (GraphV2) – Graph object to use
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
concurrency (int | None) – Number of concurrent threads to use.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
job_id (str | None) – Identifier for the computation.
- Returns:
Execution metrics and statistics
- Return type:
RelationshipsInverseIndexResult
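Conceptually, an inverse index answers "which nodes point at this node?" without scanning every relationship. A minimal in-memory sketch follows; the function name `build_inverse_index` and the pair-list input format are assumptions, and the library's internal index structure is not described here.

```python
from collections import defaultdict


def build_inverse_index(relationships):
    """Map each target node to the list of sources that point at it."""
    inverse = defaultdict(list)
    for source, target in relationships:
        inverse[target].append(source)
    return dict(inverse)


directed = [(0, 1), (2, 1), (1, 3)]
incoming = build_inverse_index(directed)
```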
- abstract stream(G: GraphV2, relationship_types: list[str] = ['*'], relationship_properties: list[str] | None = None, *, concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None) DataFrame¶
Streams all relationships of the specified types with the specified properties.
- Parameters:
G (GraphV2) – Graph object to use
relationship_types (list[str]) – Filter the graph using the given relationship types. Relationships with any of the given types will be included.
relationship_properties (list[str] | None, default = None) – The relationship properties to stream. If not specified, no properties will be streamed.
concurrency (int | None) – Number of concurrent threads to use.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
- Returns:
The streamed relationships [sourceId, targetId, relationshipType] with a column for each property
- Return type:
DataFrame
- abstract to_undirected(G: GraphV2, relationship_type: str, mutate_relationship_type: str, *, aggregation: Aggregation | dict[str, Aggregation] | None = None, concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, job_id: str | None = None) RelationshipsToUndirectedResult¶
Creates a new relationship type in the graph. The new relationships are based on an existing relationship type but are stored as undirected.
- Parameters:
G (GraphV2) – Graph object to use
relationship_type (str) – The input relationship type
mutate_relationship_type (str) – Name of the relationship type to store the results in.
aggregation (Aggregation | dict[str, Aggregation] | None) – Specifies how to aggregate parallel relationships in the graph. If a single aggregation is provided, it will be used for all properties of the specified relationships. A dictionary can be provided to specify property-specific aggregations.
concurrency (int | None) – Number of concurrent threads to use.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
job_id (str | None) – Identifier for the computation.
- Returns:
Execution metrics and statistics
- Return type:
RelationshipsToUndirectedResult
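When directed relationships become undirected, a pair such as (0, 1) and (1, 0) ends up as parallel relationships between the same two nodes, which is where the aggregation comes in. The sketch below shows one reading of the aggregation names on a single weight property. The function, the tuple format, and the choice of "first value" for SINGLE are assumptions, and NONE (no aggregation) is left out.

```python
def to_undirected(relationships, aggregation="SUM"):
    """Merge (source, target, weight) triples into undirected pairs.

    Returns {(a, b): aggregated_weight} with a <= b.
    """
    reducers = {
        "SUM": sum,
        "MIN": min,
        "MAX": max,
        "COUNT": len,
        "SINGLE": lambda weights: weights[0],  # assumed: keep one arbitrary value
    }
    reduce_weights = reducers[aggregation]
    grouped = {}
    for source, target, weight in relationships:
        key = (min(source, target), max(source, target))
        grouped.setdefault(key, []).append(weight)
    return {key: reduce_weights(weights) for key, weights in grouped.items()}


directed = [(0, 1, 2.0), (1, 0, 3.0), (1, 2, 1.0)]
summed = to_undirected(directed, "SUM")
largest = to_undirected(directed, "MAX")
counted = to_undirected(directed, "COUNT")
```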
- abstract write(G: GraphV2, relationship_type: str, relationship_properties: list[str] | None = None, *, concurrency: int | None = None, write_concurrency: int | None = None, sudo: bool = False, log_progress: bool = True, username: str | None = None, job_id: str | None = None) RelationshipsWriteResult¶
Writes all relationships of the specified relationship type with the specified properties from the graph to the database.
- Parameters:
G (GraphV2) – Graph object to use
relationship_type (str) – The relationship type to write to the database
relationship_properties (list[str] | None, default = None) – The relationship properties to write. If not specified, no properties will be written.
concurrency (int | None) – Number of concurrent threads to use.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
job_id (str | None) – Identifier for the computation.
- Returns:
Execution metrics and statistics
- Return type:
RelationshipsWriteResult
- pydantic model graphdatascience.procedure_surface.api.catalog.RelationshipsInverseIndexResult¶
- pydantic model graphdatascience.procedure_surface.api.catalog.RelationshipsToUndirectedResult¶
- pydantic model graphdatascience.procedure_surface.api.catalog.RelationshipsWriteResult¶
- Validators:
coerce_relationship_properties»relationship_properties
- class graphdatascience.procedure_surface.api.catalog.ScalePropertiesEndpoints¶
- abstract estimate(G: GraphV2 | dict[str, Any], node_properties: list[str], scaler: str | dict[str, str | int | float] | ScalerConfig, node_labels: list[str] = ['*'], concurrency: int | None = None) EstimationResult¶
Estimate the memory consumption of an algorithm run.
- Parameters:
G (GraphV2 | dict[str, Any]) – Graph object to use or a dictionary representing the graph dimensions.
node_properties (list[str]) – The node properties to scale. Can be a list of property names or a dictionary mapping property names to configurations.
scaler (str | dict[str, str | int | float] | ScalerConfig) –
The scaler to use. Can be:
A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)
A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})
A ScalerConfig instance
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
concurrency (int | None) – Number of concurrent threads to use.
- Returns:
Memory estimation details
- Return type:
EstimationResult
- abstract mutate(G: GraphV2, mutate_property: str, node_properties: list[str], scaler: str | dict[str, str | int | float] | ScalerConfig, node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ScalePropertiesMutateResult¶
Runs the Scale Properties algorithm and stores the results in the graph catalog as a new node property.
Scale Properties scales node properties using a specified scaler (e.g., MinMax, Mean, Max, Log, StdScore, Center).
- Parameters:
G (GraphV2) – Graph object to use
mutate_property (str) – Name of the node property to store the results in.
node_properties (list[str]) – The node properties to scale. Can be a list of property names or a dictionary mapping property names to configurations.
scaler (str | dict[str, str | int | float] | ScalerConfig) –
The scaler to use. Can be:
A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)
A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})
A ScalerConfig instance
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
- Returns:
Algorithm metrics and statistics including the scaler statistics
- Return type:
ScalePropertiesMutateResult
- abstract stats(G: GraphV2, node_properties: list[str], scaler: str | dict[str, str | int | float] | ScalerConfig, node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) ScalePropertiesStatsResult¶
Runs the Scale Properties algorithm and returns result statistics without storing the results.
Scale Properties scales node properties using a specified scaler (e.g., MinMax, Mean, Max, Log, StdScore, Center).
- Parameters:
G (GraphV2) – Graph object to use
node_properties (list[str]) – The node properties to scale. Can be a list of property names or a dictionary mapping property names to configurations.
scaler (str | dict[str, str | int | float] | ScalerConfig) –
The scaler to use. Can be:
A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)
A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})
A ScalerConfig instance
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
- Returns:
Algorithm statistics including the scaler statistics
- Return type:
ScalePropertiesStatsResult
- abstract stream(G: GraphV2, node_properties: list[str], scaler: str | dict[str, str | int | float] | ScalerConfig, node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None) DataFrame¶
Executes the Scale Properties algorithm and returns a stream of results.
- Parameters:
G (GraphV2) – Graph object to use
node_properties (list[str]) – The node properties to scale. Can be a list of property names or a dictionary mapping property names to configurations.
scaler (str | dict[str, str | int | float] | ScalerConfig) –
The scaler to use. Can be:
A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)
A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})
A ScalerConfig instance
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
- Returns:
DataFrame with nodeId and scaledProperty columns containing scaled property values. Each row represents a node with its corresponding scaled property values.
- Return type:
DataFrame
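To show what the named scalers do to a property and what the streamed rows might look like, here is a plain-Python sketch. The formulas are the conventional definitions of these scalers and are assumed rather than taken from this reference; StdScore is omitted, and the function `scale` and the sample values are hypothetical.

```python
import math
from statistics import mean


def scale(values, scaler="MinMax", offset=0.0):
    """Scale a list of numbers; assumes the values are not all equal."""
    lo, hi, avg = min(values), max(values), mean(values)
    if scaler == "MinMax":
        return [(v - lo) / (hi - lo) for v in values]
    if scaler == "Max":
        max_abs = max(abs(v) for v in values)
        return [v / max_abs for v in values]
    if scaler == "Mean":
        return [(v - avg) / (hi - lo) for v in values]
    if scaler == "Center":
        return [v - avg for v in values]
    if scaler == "Log":
        return [math.log(v + offset) for v in values]
    raise ValueError(f"unsupported scaler in this sketch: {scaler}")


node_ids = [0, 1, 2, 3]
ages = [1.0, 2.0, 3.0, 5.0]
scaled = scale(ages, "MinMax")

# One row per node: nodeId plus a list with one scaled value per input property.
rows = [{"nodeId": n, "scaledProperty": [s]} for n, s in zip(node_ids, scaled)]
```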
- abstract write(G: GraphV2, write_property: str, node_properties: list[str], scaler: str | dict[str, str | int | float] | ScalerConfig, node_labels: list[str] = ['*'], sudo: bool = False, log_progress: bool = True, username: str | None = None, concurrency: int | None = None, job_id: str | None = None, write_concurrency: int | None = None) ScalePropertiesWriteResult¶
Runs the Scale Properties algorithm and stores the result in the Neo4j database as a new node property.
Scale Properties scales node properties using a specified scaler (e.g., MinMax, Mean, Max, Log, StdScore, Center).
- Parameters:
G (GraphV2) – Graph object to use
write_property (str) – Name of the node property to store the results in.
node_properties (list[str]) – The node properties to scale. Can be a list of property names or a dictionary mapping property names to configurations.
scaler (str | dict[str, str | int | float] | ScalerConfig) –
The scaler to use. Can be:
A string (e.g., ‘MinMax’, ‘Mean’, ‘Max’, ‘Log’, ‘StdScore’, ‘Center’, ‘NONE’)
A dictionary with scaler configuration (e.g., {‘type’: ‘Log’, ‘offset’: 1.0})
A ScalerConfig instance
node_labels (list[str]) – Filter the graph using the given node labels. Nodes with any of the given labels will be included.
sudo (bool) – Disable the memory guard.
log_progress (bool) – Display progress logging.
username (str | None) – As an administrator, impersonate a different user for accessing their graphs.
concurrency (int | None) – Number of concurrent threads to use.
job_id (str | None) – Identifier for the computation.
write_concurrency (int | None) – Number of concurrent threads to use for writing.
- Returns:
Algorithm metrics and statistics including the scaler statistics and write timing
- Return type:
ScalePropertiesWriteResult
- pydantic model graphdatascience.procedure_surface.api.catalog.ScalePropertiesMutateResult¶
Result of running Scale Properties algorithm with mutate mode.
- pydantic model graphdatascience.procedure_surface.api.catalog.ScalePropertiesStatsResult¶
Result of running Scale Properties algorithm with stats mode.
- pydantic model graphdatascience.procedure_surface.api.catalog.ScalePropertiesWriteResult¶
Result of running Scale Properties algorithm with write mode.
- pydantic model graphdatascience.procedure_surface.api.catalog.ScalerConfig¶
- class graphdatascience.graph.v2.graph_api.GraphV2¶
A graph object that represents a graph in the graph catalog. It can be passed into algorithm endpoints to compute over the corresponding graph. It contains summary information about the graph.