Graph Data Science integration

Neo4j Graph Data Science (GDS) algorithms can help you find new insights in your data, both about the nodes themselves and about how they are connected. See The Neo4j Graph Data Science Library Manual v2.1 for more information on graph algorithms.

Bloom currently allows you to run four different algorithms on the data in the Scene:

  • Degree Centrality

  • Betweenness Centrality

  • Louvain

  • Weakly Connected Components

Running Graph Data Science algorithms on elements in your Scene does not alter the underlying data. The scores only exist temporarily in Bloom.

The algorithms are described briefly below, but please refer to The Neo4j Graph Data Science Library Manual v2.1 for their full descriptions.

Available GDS algorithms in Bloom

Degree Centrality

The Degree Centrality algorithm counts the relationships connected to a node, either incoming, outgoing, or both, to find the most connected nodes in a graph. It can be used to determine popularity. For example, in the Northwind dataset, it can find the most-ordered products or the customers who place the most orders.
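As a rough illustration of what the score means, the sketch below counts relationships per node in plain Python. It is not how Bloom or GDS computes the result. The orientation names follow the GDS terms NATURAL (outgoing), REVERSE (incoming), and UNDIRECTED (both); the function name and the sample order data are made up for the example.

```python
from collections import Counter


def degree_centrality(relationships, orientation="NATURAL"):
    """Count relationships per node.

    relationships: iterable of (source, target) pairs.
    orientation: "NATURAL" counts outgoing relationships,
    "REVERSE" counts incoming ones, "UNDIRECTED" counts both.
    """
    if orientation not in {"NATURAL", "REVERSE", "UNDIRECTED"}:
        raise ValueError(f"unknown orientation: {orientation}")
    scores = Counter()
    for source, target in relationships:
        # Register both endpoints so nodes with a score of 0 still appear.
        scores[source] += 0
        scores[target] += 0
        if orientation in ("NATURAL", "UNDIRECTED"):
            scores[source] += 1
        if orientation in ("REVERSE", "UNDIRECTED"):
            scores[target] += 1
    return dict(scores)


# Hypothetical sample data: (order, product) pairs, one per ordered product.
orders = [
    ("order1", "Chai"),
    ("order2", "Chai"),
    ("order2", "Tofu"),
    ("order3", "Chai"),
]

incoming = degree_centrality(orders, "REVERSE")
print(max(incoming, key=incoming.get))  # Chai
```

Counting incoming relationships answers "which product is ordered most", while counting outgoing ones on the same data answers "which order contains the most products".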

Betweenness Centrality

The Betweenness Centrality algorithm finds influential nodes, that is, nodes that lie on the largest number of shortest paths between other nodes in the Scene. Nodes with high betweenness centrality connect different sub-parts of a graph. In the Northwind example, it can be used to find suppliers that supply the most products to the most customers, and thus foresee possible bottlenecks should something happen to these suppliers.
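To make "lies on the most shortest paths" concrete, here is a minimal sketch of Brandes' algorithm for a small, unweighted, undirected graph. It is an illustration only, not the GDS implementation; the function name and the sample supplier data are hypothetical, and the sketch assumes there are no duplicate edges or self-loops.

```python
from collections import deque


def betweenness_centrality(nodes, edges):
    """Brandes' algorithm for an unweighted, undirected graph."""
    adjacency = {n: [] for n in nodes}
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)

    scores = {n: 0.0 for n in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths from source.
        visit_order = []
        predecessors = {n: [] for n in nodes}
        path_count = {n: 0 for n in nodes}
        distance = {n: -1 for n in nodes}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v is on a shortest path to w
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)

        # Walk back from the farthest nodes, accumulating each node's share.
        dependency = {n: 0.0 for n in nodes}
        for w in reversed(visit_order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]

    # In an undirected graph every pair was counted from both ends.
    return {n: score / 2 for n, score in scores.items()}


# Hypothetical sample: one supplier sits between three customers.
sample_nodes = ["supplier", "customerA", "customerB", "customerC"]
sample_edges = [
    ("customerA", "supplier"),
    ("customerB", "supplier"),
    ("customerC", "supplier"),
]
print(betweenness_centrality(sample_nodes, sample_edges)["supplier"])  # 3.0
```

The supplier scores 3.0 because all three customer pairs can only reach each other through it, which is the bottleneck pattern described above.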

Louvain

The Louvain algorithm aims to find clusters of highly connected nodes within a larger network, a task also known as community detection. It can be useful for product recommendations. In the Northwind example, if you know a customer has bought something from a group of products found through the Louvain algorithm, they may be likely to also want to purchase other products from that group.
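Louvain searches for a grouping that maximizes a score called modularity, which is high when relationships fall mostly inside groups rather than between them. The sketch below only computes that score for a grouping you supply; it does not perform the Louvain search itself and is not the GDS implementation. The function name and the sample graph are hypothetical, and the graph is treated as unweighted and undirected.

```python
def modularity(edges, community_of):
    """Modularity of a given grouping of an unweighted, undirected graph.

    edges: list of (node, node) pairs.
    community_of: dict mapping each node to its community id.
    """
    edge_count = len(edges)
    if edge_count == 0:
        raise ValueError("modularity is undefined for a graph without edges")

    internal_edges = {}  # edges with both ends in the same community
    degree_sum = {}      # total degree of the nodes in each community
    for a, b in edges:
        community_a, community_b = community_of[a], community_of[b]
        degree_sum[community_a] = degree_sum.get(community_a, 0) + 1
        degree_sum[community_b] = degree_sum.get(community_b, 0) + 1
        if community_a == community_b:
            internal_edges[community_a] = internal_edges.get(community_a, 0) + 1

    return sum(
        internal_edges.get(community, 0) / edge_count
        - (degrees / (2 * edge_count)) ** 2
        for community, degrees in degree_sum.items()
    )


# Hypothetical sample: two triangles of products joined by a single edge.
sample_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

print(modularity(sample_edges, two_groups) > modularity(sample_edges, one_group))  # True
```

Splitting the sample into its two triangles scores higher than lumping everything together, which is the kind of grouping a community detection algorithm is looking for.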

Weakly Connected Components

The Weakly Connected Components (WCC) algorithm finds sets of nodes that are connected to each other, regardless of relationship direction, but are unreachable from the rest of the graph. It can be used to determine whether your network is fully connected or not. In the Northwind example, the WCC algorithm can be useful for analyzing supply chain disruptions, because it shows which elements are affected and which are not.
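One common way to compute such components is a union-find structure, sketched below in plain Python. This illustrates the idea only and makes no claim about how GDS implements it; the function name and sample data are hypothetical.

```python
def weakly_connected_components(nodes, relationships):
    """Group nodes into components, ignoring relationship direction."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, shortening the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in relationships:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    components = {}
    for n in nodes:
        components.setdefault(find(n), set()).add(n)
    return list(components.values())


# Hypothetical sample: two supply chains that share no elements.
sample_nodes = ["supplierA", "product1", "customer1", "supplierB", "product2"]
sample_relationships = [
    ("supplierA", "product1"),
    ("customer1", "product1"),
    ("supplierB", "product2"),
]
components = weakly_connected_components(sample_nodes, sample_relationships)
print(len(components))  # 2
```

If supplierA is disrupted in this sample, only the nodes in its own component can be affected; supplierB and product2 sit in a separate component.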

Using GDS algorithms in Bloom

Prerequisites

To use GDS algorithms in Bloom, there are two things you need to do before you start Bloom:

  • Install the Graph Data Science Library plugin. The easiest way to do this is in Neo4j Desktop. See the Install a plugin section in the Neo4j Desktop manual for more information.

  • Allow GDS in the neo4j.conf file. This can be done manually or via Neo4j Desktop. The dbms.security.procedures.unrestricted setting needs to include both Bloom and GDS (along with any entries already specified), for example:

    dbms.security.procedures.unrestricted=jwt.security.*,bloom.*,gds.*,apoc.*

    The dbms.security.procedures.allowlist setting needs to be uncommented and must also include both Bloom and GDS (again, along with any existing entries), for example:

    dbms.security.procedures.allowlist=apoc.coll.*,apoc.load.*,gds.*,bloom.*,apoc.*

With these in place, you can start Bloom and search to bring some data into your Scene to run the algorithms on.
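If you edit neo4j.conf by hand, a small script can confirm that both settings list the Bloom and GDS patterns before you restart. The following is a minimal sketch with hypothetical helper names. It assumes simple key=value lines, reads only the first uncommented occurrence of a setting, and compares patterns literally, so it would not recognize a broader wildcard that happens to cover gds.* or bloom.*.

```python
def setting_patterns(conf_text, key):
    """Return the comma-separated values of the first uncommented `key` line."""
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and commented-out lines
        name, separator, value = line.partition("=")
        if separator and name.strip() == key:
            return [p.strip() for p in value.split(",") if p.strip()]
    return []


def missing_patterns(conf_text, key, required=("bloom.*", "gds.*")):
    """List the required patterns that `key` does not include."""
    present = set(setting_patterns(conf_text, key))
    return [p for p in required if p not in present]


# Sample text standing in for the contents of a neo4j.conf file.
sample_conf = """
dbms.security.procedures.unrestricted=jwt.security.*,bloom.*,gds.*,apoc.*
#dbms.security.procedures.allowlist=apoc.coll.*,apoc.load.*,gds.*,bloom.*,apoc.*
"""

for key in (
    "dbms.security.procedures.unrestricted",
    "dbms.security.procedures.allowlist",
):
    print(key, "is missing:", missing_patterns(sample_conf, key))
```

In the sample, the allowlist line is still commented out, so the check reports both patterns as missing for that setting.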

Running the algorithms

The GDS algorithms are accessed via the GDS button in the upper-left corner of the Scene. When you have selected an appropriate algorithm, you have the option to run it on all elements in the Scene or only on specific node categories and/or relationship types. You can also select the orientation of the relationships to be traversed. The options are accessed via the Settings button in the GDS drawer.

Applying your selected algorithm does not immediately change anything in the Scene. You can inspect each node to see its score, but to make the results easily visible, apply rule-based styling. This is done directly in the GDS drawer. The Degree Centrality and Betweenness Centrality algorithms produce scores across a range of values, so their results can be styled with either size scaling or a color gradient. The Louvain and WCC algorithms assign each node a discrete value, so their results are styled with a unique color per value.
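The difference between the two styling modes can be sketched as follows: range-based scores are mapped onto a continuous scale, while discrete group values each get their own color. This is an illustration of the idea only, not Bloom's actual styling logic; the function names, size bounds, and palette are all hypothetical.

```python
def scale_sizes(scores, min_size=1.0, max_size=4.0):
    """Map each score linearly onto the range [min_size, max_size]."""
    low, high = min(scores.values()), max(scores.values())
    span = high - low
    if span == 0:
        # All scores are equal, so every node gets the same size.
        return {node: min_size for node in scores}
    return {
        node: min_size + (score - low) / span * (max_size - min_size)
        for node, score in scores.items()
    }


def unique_colors(group_of, palette=("#FFB000", "#648FFF", "#DC267F", "#785EF0")):
    """Give each distinct group value its own color, reusing colors if needed."""
    color_for_group = {}
    for group in group_of.values():
        if group not in color_for_group:
            color_for_group[group] = palette[len(color_for_group) % len(palette)]
    return {node: color_for_group[group] for node, group in group_of.items()}


# Hypothetical centrality scores and community ids.
print(scale_sizes({"a": 0, "b": 5, "c": 10}))
print(unique_colors({"a": 7, "b": 7, "c": 9}))
```

A size scale or gradient preserves the ordering of centrality scores, whereas unique colors only need to tell groups apart, since community and component ids have no meaningful order.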
