Greedy
This section describes the Greedy influence maximization algorithm in the Neo4j Graph Data Science library.
This algorithm is in the alpha tier. For more information on algorithm tiers, see Algorithms.
This topic includes:
1. Introduction
The Greedy algorithm for influence maximization aims to find k
nodes that maximize the expected spread of influence in a network.
It simulates the influence spread using the Independent Cascade model, which calculates the expected spread by taking the average spread over the mc
MonteCarlo simulations.
In the propagation process, a node is influenced in case that a uniform random draw is less than the probability p
.
Kempe et al. 2003 introduced the Greedy algorithm in their study Maximizing the Spread of Influence through a Social Network to deal with the NPhard problem of influence maximization. The Greedy algorithm successively selecting the node within the maximum marginal gain approximation in polynomial time. For large networks CELF algorithm should be used.
2. Syntax
This algorithm is in the alpha tier. For more information on algorithm tiers, see Algorithms.
CALL gds.alpha.influenceMaximization.greedy.stream(
graphName: String,
configuration: Map
)
YIELD
nodeId: Integer,
spread : Float
Name  Type  Default  Optional  Description 

graphName 
String 

no 
The name of a graph stored in the catalog. 
configuration 
Map 

yes 
Configuration for algorithmspecifics and/or graph filtering. 
Name  Type  Default  Optional  Description 

List of String 

yes 
Filter the named graph using the given node labels. 

List of String 

yes 
Filter the named graph using the given relationship types. 

Integer 

yes 
The number of concurrent threads used for running the algorithm. 
Name  Type  Default  Optional  Description 

seedSetSize 
Integer 
n/a 
no 
The number of nodes that maximize the expected spread in the network. 
monteCarloSimulations 
Integer 
1000 
yes 
The number of MonteCarlo simulations. 
propagationProbability 
Float 
0.1 
yes 
The probability of a node being activated by an active neighbour node. 
Name  Type  Description 

nodeId 
Integer 
Node ID. 
spread 
Float 
The spread gained by selecting the node. 
CALL gds.alpha.influenceMaximization.greedy.stats(
graphName: String,
configuration: Map
)
YIELD
nodes: Integer,
computeMillis: Integer,
Name  Type  Default  Optional  Description 

graphName 
String 

no 
The name of a graph stored in the catalog. 
configuration 
Map 

yes 
Configuration for algorithmspecifics and/or graph filtering. 
Name  Type  Default  Optional  Description 

List of String 

yes 
Filter the named graph using the given node labels. 

List of String 

yes 
Filter the named graph using the given relationship types. 

Integer 

yes 
The number of concurrent threads used for running the algorithm. 
Name  Type  Default  Optional  Description 

seedSetSize 
Integer 
n/a 
no 
The number of nodes that maximize the expected spread in the network. 
monteCarloSimulations 
Integer 
1000 
yes 
The number of MonteCarlo simulations. 
propagationProbability 
Float 
0.1 
yes 
The probability of a node being activated by an active neighbour node. 
Name  Type  Description 

nodes 
Integer 
The number of nodes in the graph. 
computeMillis 
Integer 
Milliseconds for running the algorithm. 
3. Examples
In this section we will show examples of running the Greedy algorithm on a concrete graph. The intention is to illustrate what the results look like and to provide a guide in how to make use of the algorithm in a real setting. We will do this on a small social network graph of a handful nodes connected in a particular pattern. The example graph looks like this:
CREATE
(a:Person {name: 'Jimmy'}),
(b:Person {name: 'Jack'}),
(c:Person {name: 'Alice'}),
(d:Person {name: 'Ceri'}),
(e:Person {name: 'Mohammed'}),
(f:Person {name: 'Michael'}),
(g:Person {name: 'Ethan'}),
(h:Person {name: 'Lara'}),
(i:Person {name: 'Amir'}),
(j:Person {name: 'Willie'}),
(b)[:FRIEND_OF]>(c),
(c)[:FRIEND_OF]>(a),
(c)[:FRIEND_OF]>(g),
(c)[:FRIEND_OF]>(h),
(c)[:FRIEND_OF]>(i),
(c)[:FRIEND_OF]>(j),
(d)[:FRIEND_OF]>(g),
(f)[:FRIEND_OF]>(e),
(f)[:FRIEND_OF]>(g),
(g)[:FRIEND_OF]>(a),
(g)[:FRIEND_OF]>(b),
(g)[:FRIEND_OF]>(h),
(g)[:FRIEND_OF]>(e),
(h)[:FRIEND_OF]>(i);
In the example, we will use the Greedy algorithm to find k
nodes subset.
CALL gds.graph.create(
'myGraph',
'Person',
'FRIEND_OF'
);
In the following examples we will demonstrate using the Greedy algorithm on this graph.
3.1. Stream
In the stream
execution mode, the algorithm returns the spread for each node.
This allows us to inspect the results directly or postprocess them in Cypher without any side effects.
For more details on the stream
mode in general, see Stream.
CALL gds.alpha.influenceMaximization.greedy.stream('myGraph', {seedSetSize: 3, concurrency: 4})
YIELD nodeId, spread
RETURN gds.util.asNode(nodeId).name AS Name, spread
ORDER BY spread ASC
Name  spread 

"Alice" 
1.519 
"Ethan" 
2.701 
"Michael" 
3.8 
Was this page helpful?