Memory Requirements Estimation

The Graph Data Science Library (GDSL) operates entirely on heap memory. To avoid running out of memory when working with large graphs, you can estimate the required memory before executing an algorithm. The GDSL supports estimating the memory required to load a named graph as well as to execute a graph algorithm, in both cases by using the estimate mode.

The general syntax is as follows:

CALL gds.<ALGO>.<MODE>.estimate()
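As a minimal sketch of this syntax, the following call estimates PageRank in stream mode on a named graph. The graph name 'my-graph' and the configuration values are assumptions chosen for illustration, and the call expects a graph with that name to already exist in the Graph Catalog.

```cypher
// Estimate the memory needed to run PageRank in stream mode.
// 'my-graph' is a hypothetical named graph; replace it with your own.
// The configuration values are illustrative, not recommendations.
CALL gds.pageRank.stream.estimate('my-graph', {maxIterations: 20, dampingFactor: 0.85})
YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
RETURN nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
```

Here bytesMin and bytesMax give the lower and upper bounds of the estimate in bytes, and requiredMemory gives the same range in human-readable form. The estimate call does not run the algorithm itself.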

All algorithm procedures in the GDSL, including graph creation, perform an estimation check at the beginning of their execution. If the check determines that the current amount of free memory is insufficient to complete the operation, the operation is aborted and an error is reported. This heap control logic is conservative: it only blocks executions that are certain not to fit into memory. It does not guarantee that an execution that passes the check will succeed without exhausting memory. It is therefore still useful to run the estimate mode yourself before running an algorithm or graph creation on a large data set.
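Graph creation can be estimated in the same way before the graph is loaded. The sketch below assumes a GDS 1.x release, where graph creation procedures live under gds.graph.create (later releases renamed this procedure to gds.graph.project.estimate). The label 'Person' and the relationship type 'KNOWS' are hypothetical.

```cypher
// Estimate the memory needed to create a named graph, without creating it.
// 'Person' and 'KNOWS' are hypothetical; use a label and relationship
// type that exist in your database.
CALL gds.graph.create.estimate('Person', 'KNOWS')
YIELD nodeCount, relationshipCount, requiredMemory
RETURN nodeCount, relationshipCount, requiredMemory
```

If the reported requiredMemory is close to or above the free heap, the graph creation is likely to be blocked by the estimation check or to fail later.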

As of this writing, not all algorithms in the GDSL can be estimated. All algorithms in the production-quality tier can be estimated, but only some algorithms in the alpha and beta tiers can.

The GDSL documentation lists the algorithms that have an estimate procedure.
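One way to check this against your own installation is to filter the procedure listing. This is a sketch that assumes the gds.list procedure is available in your GDSL version; the result reflects whatever version is installed, not necessarily the published list.

```cypher
// List the installed GDS procedures whose name ends with '.estimate'.
CALL gds.list()
YIELD name, description
WHERE name ENDS WITH '.estimate'
RETURN name, description
ORDER BY name
```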

The amount of free heap memory can be increased by either dropping unused named graphs from the Graph Catalog, or increasing the maximum heap size before starting the Neo4j instance.
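The first option can be sketched as follows: list the named graphs in the Graph Catalog, then drop one that is no longer needed. The graph name 'my-graph' is hypothetical. Run the two statements separately.

```cypher
// List the named graphs currently held in the Graph Catalog,
// largest first, to see which ones occupy the most heap.
CALL gds.graph.list()
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
ORDER BY nodeCount DESC;

// Drop a named graph that is no longer needed.
// 'my-graph' is a hypothetical name; replace it with one from the list.
CALL gds.graph.drop('my-graph')
YIELD graphName
RETURN graphName;
```

For the second option, the maximum heap size is a server setting rather than a Cypher call. In Neo4j 4.x it is dbms.memory.heap.max_size in neo4j.conf, and a change takes effect only after a restart.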

Exercise: Memory requirements estimation

In Neo4j Browser, run :play 4.0-intro-graph-algos-exercises and follow the instructions for Memory requirements.

Estimated time to complete: 5 minutes

Check your understanding

Question 1

What algorithms can you estimate heap memory for?

Select the correct answers.

  • Betweenness Centrality

  • Link Prediction

  • PageRank

  • Label Propagation

Question 2

When calling estimate() to provide the heap memory estimation for running the algorithm, what information must you provide?

Select the correct answers.

  • name of the algorithm

  • mode of the algorithm (stream, write)

  • amount of heap configured for the instance

  • size of the graph

Question 3

What factors impact the amount of memory available for an algorithm?

Select the correct answers.

  • the size of the projected graph

  • the amount of heap available to the instance

  • the number of previously executed algorithms

  • the particular algorithm that will run

Summary

In this lesson, you learned how to estimate the memory requirements for your graph algorithm analysis.