Machine learning procedures

This page lists all machine learning procedures in the Neo4j Graph Data Science Python Client API, including the embedding algorithms and the procedures for creating and managing pipelines. All signatures assume that a GraphDataScience object is available as gds.

gds.pipeline.get(pipeline_name: str) → TrainingPipeline[PipelineModel]: Gets a pipeline object representing a pipeline in the pipeline catalog.

gds.alpha.ml.splitRelationships.mutate(G: Graph, **config: Any) → Series[Any]: Splits a graph into holdout and remaining relationship types and adds them to the graph.

gds.alpha.ml.splitRelationships.mutate.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.
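As an illustration, a relationship split might be requested as in the sketch below. The helper name split_for_link_prediction and the relationship type names are hypothetical. The configuration keys are assumed to match the GDS server documentation for this procedure and should be checked against the installed version.

```python
from typing import Any


def split_for_link_prediction(gds: Any, G: Any, holdout_fraction: float = 0.2) -> Any:
    """Split the relationships of the projected graph G into two new types."""
    # Assumed configuration keys; the relationship type names are placeholders.
    config = {
        "holdoutRelationshipType": "TEST_REL",
        "remainingRelationshipType": "TRAIN_REL",
        "holdoutFraction": holdout_fraction,
        "negativeSamplingRatio": 1.0,
    }
    # The call mutates the in-memory graph and returns a result Series.
    return gds.alpha.ml.splitRelationships.mutate(G, **config)
```

The function only wraps the listed endpoint, so gds and G must come from an existing GraphDataScience session and graph projection.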

gds.alpha.pipeline.nodeRegression.create(name: str) → Tuple[NRTrainingPipeline, Series[Any]]: Creates a node regression training pipeline in the pipeline catalog.

gds.beta.graphSage.mutate(G: Graph, **config: Any) → Series[Any]: The GraphSage algorithm inductively computes embeddings for nodes based on their features and neighborhoods.

gds.beta.graphSage.mutate.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

gds.beta.graphSage.stream(G: Graph, **config: Any) → DataFrame: The GraphSage algorithm inductively computes embeddings for nodes based on their features and neighborhoods.

gds.beta.graphSage.stream.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

gds.beta.graphSage.train(G: Graph, **config: Any) → Tuple[MODEL_TYPE, Series[Any]]: The GraphSage algorithm inductively computes embeddings for nodes based on their features and neighborhoods.

gds.beta.graphSage.train.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

gds.beta.graphSage.write(G: Graph, **config: Any) → Series[Any]: The GraphSage algorithm inductively computes embeddings for nodes based on their features and neighborhoods.

gds.beta.graphSage.write.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.
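The GraphSage endpoints are typically used in two steps: train a named model, then apply it in one of the execution modes. The sketch below shows that flow under stated assumptions. The helper name train_and_stream_graphsage is hypothetical, and the configuration keys modelName and featureProperties are assumed to match the GDS server documentation.

```python
from typing import Any, Sequence, Tuple


def train_and_stream_graphsage(
    gds: Any, G: Any, model_name: str, feature_properties: Sequence[str]
) -> Tuple[Any, Any]:
    """Train a GraphSage model on G, then stream embeddings from it."""
    # train returns a (model, result) tuple, as in the signature above.
    _model, train_result = gds.beta.graphSage.train(
        G,
        modelName=model_name,  # assumed config key
        featureProperties=list(feature_properties),  # assumed config key
    )
    # stream returns a DataFrame of embeddings computed by the stored model.
    embeddings = gds.beta.graphSage.stream(G, modelName=model_name)
    return train_result, embeddings
```

The model name ties the two calls together, which is why the same value is passed to both.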

gds.beta.hashgnn.mutate(G: Graph, **config: Any) → Series[Any]: HashGNN creates node embeddings by hashing and message passing.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.hashgnn.mutate() instead.

gds.beta.hashgnn.mutate.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.hashgnn.mutate.estimate() instead.

gds.beta.hashgnn.stream(G: Graph, **config: Any) → DataFrame: HashGNN creates node embeddings by hashing and message passing.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.hashgnn.stream() instead.

gds.beta.hashgnn.stream.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.hashgnn.stream.estimate() instead.

gds.beta.node2vec.mutate(G: Graph, **config: Any) → Series[Any]: The Node2Vec algorithm computes embeddings for nodes based on random walks.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.node2vec.mutate() instead.

gds.beta.node2vec.mutate.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.node2vec.mutate.estimate() instead.

gds.beta.node2vec.stream(G: Graph, **config: Any) → DataFrame: The Node2Vec algorithm computes embeddings for nodes based on random walks.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.node2vec.stream() instead.

gds.beta.node2vec.stream.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.node2vec.stream.estimate() instead.

gds.beta.node2vec.write(G: Graph, **config: Any) → Series[Any]: The Node2Vec algorithm computes embeddings for nodes based on random walks.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.node2vec.write() instead.

gds.beta.node2vec.write.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.node2vec.write.estimate() instead.

gds.beta.pipeline.drop(pipeline: TrainingPipeline[PipelineModel]) → Series[Any]: Drops a pipeline and frees up the resources it occupies.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.pipeline.drop() instead.

gds.beta.pipeline.exists(pipeline_name: str) → Series[Any]: Checks if a given pipeline exists in the pipeline catalog.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.pipeline.exists() instead.

gds.beta.pipeline.list(pipeline: TrainingPipeline[PipelineModel] | None = None) → DataFrame: Lists all pipelines contained in the pipeline catalog.

Deprecated since version 2.5.0: Since GDS server version 2.5.0 you should use the endpoint gds.pipeline.list() instead.
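The deprecation notices in this listing all follow one rule: from GDS server version 2.5.0 onward, the endpoint without the beta prefix is preferred. A small helper can encode that rule for code that must support several server versions. This is only a sketch: the names DEPRECATED_BETA, parse_version, and endpoint_for are hypothetical, and falling back to the beta endpoint on servers older than 2.5.0 is an assumption rather than something this listing states.

```python
from typing import Tuple

# Endpoint suffixes that this listing marks as deprecated under gds.beta.
DEPRECATED_BETA = {
    "hashgnn.mutate", "hashgnn.mutate.estimate",
    "hashgnn.stream", "hashgnn.stream.estimate",
    "node2vec.mutate", "node2vec.mutate.estimate",
    "node2vec.stream", "node2vec.stream.estimate",
    "node2vec.write", "node2vec.write.estimate",
    "pipeline.drop", "pipeline.exists", "pipeline.list",
}


def parse_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.5.0'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def endpoint_for(server_version: str, name: str) -> str:
    """Return the endpoint path to prefer for the given server version."""
    if name not in DEPRECATED_BETA:
        raise ValueError(f"no deprecation rule known for {name!r}")
    if parse_version(server_version) >= (2, 5):
        return "gds." + name
    # Assumption: older servers only expose the beta endpoint.
    return "gds.beta." + name
```

For example, endpoint_for("2.4.6", "node2vec.stream") yields the beta path, while the same call with "2.5.0" yields gds.node2vec.stream.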

gds.pipeline.drop(pipeline: TrainingPipeline[PipelineModel]) → Series[Any]: Drops a pipeline and frees up the resources it occupies.

gds.pipeline.exists(pipeline_name: str) → Series[Any]: Checks if a given pipeline exists in the pipeline catalog.

gds.pipeline.list(pipeline: TrainingPipeline[PipelineModel] | None = None) → DataFrame: Lists all pipelines contained in the pipeline catalog.

gds.beta.pipeline.linkPrediction.create(name: str) → Tuple[LPTrainingPipeline, Series[Any]]: Creates a link prediction pipeline in the pipeline catalog.

gds.beta.pipeline.nodeClassification.create(name: str) → Tuple[NCTrainingPipeline, Series[Any]]: Creates a node classification training pipeline in the pipeline catalog.
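The catalog endpoints above combine naturally when a pipeline must be rebuilt from scratch. The following sketch drops any existing pipeline with the given name before creating a fresh node classification pipeline. The helper name recreate_nc_pipeline is hypothetical, and the "exists" field of the returned Series is an assumption based on the GDS server documentation.

```python
from typing import Any


def recreate_nc_pipeline(gds: Any, name: str) -> Any:
    """Return a fresh node classification pipeline registered under name."""
    # Assumed result field: "exists" is truthy when the pipeline is in the catalog.
    if gds.pipeline.exists(name)["exists"]:
        # drop expects a pipeline object, so fetch it by name first.
        gds.pipeline.drop(gds.pipeline.get(name))
    # create returns a (pipeline, result) tuple, as in the signature above.
    pipeline, _result = gds.beta.pipeline.nodeClassification.create(name)
    return pipeline
```

Fetching the object with gds.pipeline.get is needed because gds.pipeline.drop takes a pipeline object rather than a name.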

gds.fastRP.mutate(G: Graph, **config: Any) → Series[Any]: Random Projection produces node embeddings via the FastRP algorithm.

gds.fastRP.mutate.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

gds.fastRP.stats(G: Graph, **config: Any) → Series[Any]: Random Projection produces node embeddings via the FastRP algorithm.

gds.fastRP.stats.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

gds.fastRP.stream(G: Graph, **config: Any) → DataFrame: Random Projection produces node embeddings via the FastRP algorithm.

gds.fastRP.stream.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

gds.fastRP.write(G: Graph, **config: Any) → Series[Any]: Random Projection produces node embeddings via the FastRP algorithm.

gds.fastRP.write.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.
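Each execution mode has a matching estimate endpoint that accepts the same configuration, so a memory check can precede the actual run. The sketch below illustrates this for FastRP. The helper name fastrp_embeddings and the max_bytes guard are hypothetical, and the embeddingDimension key and the bytesMax result field are assumed from the GDS server documentation.

```python
from typing import Any, Optional


def fastrp_embeddings(
    gds: Any, G: Any, dimension: int = 128, max_bytes: Optional[int] = None
) -> Any:
    """Stream FastRP embeddings for G, optionally guarded by a memory estimate."""
    config = {"embeddingDimension": dimension}  # assumed config key
    # The estimate endpoint takes the same arguments as the procedure itself.
    estimate = gds.fastRP.stream.estimate(G, **config)
    # Assumed result field: "bytesMax" is the upper bound of the estimate.
    if max_bytes is not None and estimate["bytesMax"] > max_bytes:
        raise MemoryError("estimated memory exceeds the allowed budget")
    return gds.fastRP.stream(G, **config)
```

Sharing one config dictionary between the two calls keeps the estimate and the run consistent.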

gds.alpha.ml.oneHotEncoding(available_values: List[Any], selected_values: List[Any]) → List[int]: Returns a list of selected values in a one-hot encoding format.
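To make the result shape concrete, the encoding can be reproduced locally in plain Python. This is a reference sketch of the expected behavior, not the library's implementation. The function name one_hot_encoding is hypothetical, and the output is assumed to contain one entry per available value, in order.

```python
from typing import Any, List


def one_hot_encoding(available_values: List[Any], selected_values: List[Any]) -> List[int]:
    """Mark each available value with 1 if it was selected, otherwise 0."""
    return [1 if value in selected_values else 0 for value in available_values]


cuisines = ["Chinese", "Indian", "Italian"]
print(one_hot_encoding(cuisines, ["Italian"]))  # prints [0, 0, 1]
```

The output always has the same length as available_values, which makes it usable as a fixed-size feature vector.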

gds.hashgnn.mutate(G: Graph, **config: Any) → Series[Any]: HashGNN creates node embeddings by hashing and message passing.

gds.hashgnn.mutate.estimate(G: Graph, **config: Any) → DataFrame: Returns an estimation of the memory consumption for that procedure.

gds.hashgnn.stream(G: Graph, **config: Any) → DataFrame: HashGNN creates node embeddings by hashing and message passing.

gds.hashgnn.stream.estimate(G: Graph, **config: Any) → DataFrame: Returns an estimation of the memory consumption for that procedure.

gds.hashgnn.write(G: Graph, **config: Any) → DataFrame: HashGNN creates node embeddings by hashing and message passing.

gds.hashgnn.write.estimate(G: Graph, **config: Any) → DataFrame: Returns an estimation of the memory consumption for that procedure.

gds.node2vec.mutate(G: Graph, **config: Any) → Series[Any]: The Node2Vec algorithm computes embeddings for nodes based on random walks.

gds.node2vec.mutate.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

gds.node2vec.stream(G: Graph, **config: Any) → DataFrame: The Node2Vec algorithm computes embeddings for nodes based on random walks.

gds.node2vec.stream.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.

gds.node2vec.write(G: Graph, **config: Any) → Series[Any]: The Node2Vec algorithm computes embeddings for nodes based on random walks.

gds.node2vec.write.estimate(G: Graph, **config: Any) → Series[Any]: Returns an estimation of the memory consumption for that procedure.
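The mutate and write modes differ only in where the embeddings end up: mutate adds them to the in-memory graph, while write persists them to the database. A thin wrapper can make that choice explicit, as sketched below. The helper name node2vec_store is hypothetical, and the mutateProperty and writeProperty keys are assumed from the GDS server documentation.

```python
from typing import Any


def node2vec_store(
    gds: Any, G: Any, property_name: str, persist: bool = False, **config: Any
) -> Any:
    """Run Node2Vec on G and store the embeddings under property_name."""
    if persist:
        # write mode: store the embeddings as a node property in the database.
        return gds.node2vec.write(G, writeProperty=property_name, **config)
    # mutate mode: store the embeddings only in the projected in-memory graph.
    return gds.node2vec.mutate(G, mutateProperty=property_name, **config)
```

Any further algorithm configuration is forwarded unchanged through **config, mirroring the signatures in this listing.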