GraphGist: Preview

The goal is to calculate the similarity between matched Persons based on how many times they have eaten the same Foods. At most 1000 Persons would be matched, with at most 100 different Foods, so up to 1000 * 100 = 100k relationships have to be traversed and scored. A Cypher dump of a larger data set with 1500 Persons, 100 Foods and 150k relationships can be downloaded here (zip, 503 KB).

Initial sample data setup

20 Persons and 10 eaten Foods (200 ATE relationships). The script creates 20 Food nodes, but only _10 to _19 are ever eaten.

CREATE
(_0:Food {name:"corrupti"}),
(_1:Food {name:"voluptas"}),
(_2:Food {name:"sapiente"}),
(_3:Food {name:"ducimus"}),
(_4:Food {name:"debitis"}),
(_5:Food {name:"esse"}),
(_6:Food {name:"quasi"}),
(_7:Food {name:"et"}),
(_8:Food {name:"ex"}),
(_9:Food {name:"est"}),
(_10:Food {name:"non"}),
(_11:Food {name:"sit"}),
(_12:Food {name:"sed"}),
(_13:Food {name:"placeat"}),
(_14:Food {name:"necessitatibus"}),
(_15:Food {name:"veniam"}),
(_16:Food {name:"sed"}),
(_17:Food {name:"esse"}),
(_18:Food {name:"pariatur"}),
(_19:Food {name:"sapiente"}),
(_20:Person {name:"Rudolph"}),
(_21:Person {name:"Kailee"}),
(_22:Person {name:"Archibald"}),
(_23:Person {name:"Toby"}),
(_24:Person {name:"Ronaldo"}),
(_25:Person {name:"Albina"}),
(_26:Person {name:"Daniella"}),
(_27:Person {name:"Jesse"}),
(_28:Person {name:"Libbie"}),
(_29:Person {name:"Lance"}),
(_30:Person {name:"Kaylie"}),
(_31:Person {name:"Ava"}),
(_32:Person {name:"Lenna"}),
(_33:Person {name:"Madyson"}),
(_34:Person {name:"Terrance"}),
(_35:Person {name:"Weston"}),
(_36:Person {name:"Eda"}),
(_37:Person {name:"Meaghan"}),
(_38:Person {name:"Oma"}),
(_39:Person {name:"Ernest"}),
_20-[:ATE {times:6}]->_10,
_20-[:ATE {times:1}]->_11,
_20-[:ATE {times:10}]->_12,
_20-[:ATE {times:4}]->_13,
_20-[:ATE {times:7}]->_14,
_20-[:ATE {times:5}]->_15,
_20-[:ATE {times:9}]->_16,
_20-[:ATE {times:1}]->_17,
_20-[:ATE {times:2}]->_18,
_20-[:ATE {times:5}]->_19,
_21-[:ATE {times:6}]->_10,
_21-[:ATE {times:10}]->_11,
_21-[:ATE {times:6}]->_12,
_21-[:ATE {times:2}]->_13,
_21-[:ATE {times:7}]->_14,
_21-[:ATE {times:6}]->_15,
_21-[:ATE {times:10}]->_16,
_21-[:ATE {times:7}]->_17,
_21-[:ATE {times:6}]->_18,
_21-[:ATE {times:2}]->_19,
_22-[:ATE {times:7}]->_10,
_22-[:ATE {times:7}]->_11,
_22-[:ATE {times:7}]->_12,
_22-[:ATE {times:1}]->_13,
_22-[:ATE {times:10}]->_14,
_22-[:ATE {times:7}]->_15,
_22-[:ATE {times:10}]->_16,
_22-[:ATE {times:9}]->_17,
_22-[:ATE {times:10}]->_18,
_22-[:ATE {times:9}]->_19,
_23-[:ATE {times:6}]->_10,
_23-[:ATE {times:4}]->_11,
_23-[:ATE {times:5}]->_12,
_23-[:ATE {times:10}]->_13,
_23-[:ATE {times:9}]->_14,
_23-[:ATE {times:8}]->_15,
_23-[:ATE {times:6}]->_16,
_23-[:ATE {times:2}]->_17,
_23-[:ATE {times:5}]->_18,
_23-[:ATE {times:2}]->_19,
_24-[:ATE {times:8}]->_10,
_24-[:ATE {times:6}]->_11,
_24-[:ATE {times:10}]->_12,
_24-[:ATE {times:6}]->_13,
_24-[:ATE {times:9}]->_14,
_24-[:ATE {times:10}]->_15,
_24-[:ATE {times:7}]->_16,
_24-[:ATE {times:2}]->_17,
_24-[:ATE {times:9}]->_18,
_24-[:ATE {times:8}]->_19,
_25-[:ATE {times:9}]->_10,
_25-[:ATE {times:7}]->_11,
_25-[:ATE {times:7}]->_12,
_25-[:ATE {times:9}]->_13,
_25-[:ATE {times:9}]->_14,
_25-[:ATE {times:3}]->_15,
_25-[:ATE {times:10}]->_16,
_25-[:ATE {times:9}]->_17,
_25-[:ATE {times:7}]->_18,
_25-[:ATE {times:6}]->_19,
_26-[:ATE {times:9}]->_10,
_26-[:ATE {times:1}]->_11,
_26-[:ATE {times:2}]->_12,
_26-[:ATE {times:6}]->_13,
_26-[:ATE {times:3}]->_14,
_26-[:ATE {times:10}]->_15,
_26-[:ATE {times:7}]->_16,
_26-[:ATE {times:1}]->_17,
_26-[:ATE {times:10}]->_18,
_26-[:ATE {times:7}]->_19,
_27-[:ATE {times:5}]->_10,
_27-[:ATE {times:2}]->_11,
_27-[:ATE {times:10}]->_12,
_27-[:ATE {times:4}]->_13,
_27-[:ATE {times:4}]->_14,
_27-[:ATE {times:9}]->_15,
_27-[:ATE {times:4}]->_16,
_27-[:ATE {times:5}]->_17,
_27-[:ATE {times:7}]->_18,
_27-[:ATE {times:7}]->_19,
_28-[:ATE {times:6}]->_10,
_28-[:ATE {times:7}]->_11,
_28-[:ATE {times:7}]->_12,
_28-[:ATE {times:3}]->_13,
_28-[:ATE {times:2}]->_14,
_28-[:ATE {times:6}]->_15,
_28-[:ATE {times:8}]->_16,
_28-[:ATE {times:2}]->_17,
_28-[:ATE {times:1}]->_18,
_28-[:ATE {times:1}]->_19,
_29-[:ATE {times:6}]->_10,
_29-[:ATE {times:8}]->_11,
_29-[:ATE {times:8}]->_12,
_29-[:ATE {times:7}]->_13,
_29-[:ATE {times:8}]->_14,
_29-[:ATE {times:2}]->_15,
_29-[:ATE {times:3}]->_16,
_29-[:ATE {times:10}]->_17,
_29-[:ATE {times:6}]->_18,
_29-[:ATE {times:7}]->_19,
_30-[:ATE {times:5}]->_10,
_30-[:ATE {times:4}]->_11,
_30-[:ATE {times:8}]->_12,
_30-[:ATE {times:9}]->_13,
_30-[:ATE {times:2}]->_14,
_30-[:ATE {times:7}]->_15,
_30-[:ATE {times:3}]->_16,
_30-[:ATE {times:10}]->_17,
_30-[:ATE {times:3}]->_18,
_30-[:ATE {times:5}]->_19,
_31-[:ATE {times:3}]->_10,
_31-[:ATE {times:6}]->_11,
_31-[:ATE {times:9}]->_12,
_31-[:ATE {times:8}]->_13,
_31-[:ATE {times:10}]->_14,
_31-[:ATE {times:4}]->_15,
_31-[:ATE {times:6}]->_16,
_31-[:ATE {times:6}]->_17,
_31-[:ATE {times:8}]->_18,
_31-[:ATE {times:2}]->_19,
_32-[:ATE {times:2}]->_10,
_32-[:ATE {times:6}]->_11,
_32-[:ATE {times:8}]->_12,
_32-[:ATE {times:10}]->_13,
_32-[:ATE {times:1}]->_14,
_32-[:ATE {times:1}]->_15,
_32-[:ATE {times:3}]->_16,
_32-[:ATE {times:5}]->_17,
_32-[:ATE {times:9}]->_18,
_32-[:ATE {times:7}]->_19,
_33-[:ATE {times:8}]->_10,
_33-[:ATE {times:3}]->_11,
_33-[:ATE {times:8}]->_12,
_33-[:ATE {times:7}]->_13,
_33-[:ATE {times:6}]->_14,
_33-[:ATE {times:9}]->_15,
_33-[:ATE {times:1}]->_16,
_33-[:ATE {times:8}]->_17,
_33-[:ATE {times:2}]->_18,
_33-[:ATE {times:9}]->_19,
_34-[:ATE {times:9}]->_10,
_34-[:ATE {times:8}]->_11,
_34-[:ATE {times:10}]->_12,
_34-[:ATE {times:3}]->_13,
_34-[:ATE {times:6}]->_14,
_34-[:ATE {times:7}]->_15,
_34-[:ATE {times:8}]->_16,
_34-[:ATE {times:3}]->_17,
_34-[:ATE {times:6}]->_18,
_34-[:ATE {times:8}]->_19,
_35-[:ATE {times:8}]->_10,
_35-[:ATE {times:9}]->_11,
_35-[:ATE {times:9}]->_12,
_35-[:ATE {times:1}]->_13,
_35-[:ATE {times:8}]->_14,
_35-[:ATE {times:1}]->_15,
_35-[:ATE {times:7}]->_16,
_35-[:ATE {times:1}]->_17,
_35-[:ATE {times:6}]->_18,
_35-[:ATE {times:10}]->_19,
_36-[:ATE {times:3}]->_10,
_36-[:ATE {times:1}]->_11,
_36-[:ATE {times:3}]->_12,
_36-[:ATE {times:3}]->_13,
_36-[:ATE {times:4}]->_14,
_36-[:ATE {times:1}]->_15,
_36-[:ATE {times:6}]->_16,
_36-[:ATE {times:6}]->_17,
_36-[:ATE {times:1}]->_18,
_36-[:ATE {times:10}]->_19,
_37-[:ATE {times:8}]->_10,
_37-[:ATE {times:8}]->_11,
_37-[:ATE {times:8}]->_12,
_37-[:ATE {times:7}]->_13,
_37-[:ATE {times:10}]->_14,
_37-[:ATE {times:2}]->_15,
_37-[:ATE {times:9}]->_16,
_37-[:ATE {times:8}]->_17,
_37-[:ATE {times:6}]->_18,
_37-[:ATE {times:2}]->_19,
_38-[:ATE {times:4}]->_10,
_38-[:ATE {times:7}]->_11,
_38-[:ATE {times:10}]->_12,
_38-[:ATE {times:10}]->_13,
_38-[:ATE {times:10}]->_14,
_38-[:ATE {times:9}]->_15,
_38-[:ATE {times:5}]->_16,
_38-[:ATE {times:6}]->_17,
_38-[:ATE {times:1}]->_18,
_38-[:ATE {times:10}]->_19,
_39-[:ATE {times:3}]->_10,
_39-[:ATE {times:10}]->_11,
_39-[:ATE {times:3}]->_12,
_39-[:ATE {times:7}]->_13,
_39-[:ATE {times:2}]->_14,
_39-[:ATE {times:9}]->_15,
_39-[:ATE {times:6}]->_16,
_39-[:ATE {times:1}]->_17,
_39-[:ATE {times:4}]->_18,
_39-[:ATE {times:4}]->_19;
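
Before running the similarity queries, it can help to confirm that the sample data loaded as described. A minimal sketch, assuming the CREATE statement above ran against an empty database; the column aliases are only illustrative, and each statement is meant to be run on its own:

```cypher
// Count the Person nodes (the script creates _20 to _39).
MATCH (p:Person)
RETURN count(p) AS persons;              // 20

// Count the Foods that at least one Person has eaten.
MATCH (f:Food)<-[:ATE]-(:Person)
RETURN count(DISTINCT f) AS eatenFoods;  // 10

// Count the ATE relationships.
MATCH (:Person)-[r:ATE]->(:Food)
RETURN count(r) AS ateRelationships;     // 200
```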

Slightly modified query from cookbook to use Labels

This query performs very poorly with a LIMIT of 1000 (100k relationships), because the LIMIT is applied only after the whole database has been traversed: ORDER BY needs every similarity score before the top rows can be picked.

MATCH (me:Person { name: "Libbie" })-[r1:ATE]->(food)<-[r2:ATE]-(you:Person)
WITH me, count(DISTINCT r1) AS H1, count(DISTINCT r2) AS H2, you
MATCH (me)-[r1:ATE]->(food)<-[r2:ATE]-(you)
// toFloat() keeps the ratios fractional; integer / integer would truncate.
RETURN me.name, you.name, SUM((1-ABS(toFloat(r1.times)/H1-toFloat(r2.times)/H2))*(r1.times+r2.times)/(H1+H2)) AS similarity
ORDER BY similarity DESC
LIMIT 10;
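
One detail of the formula is worth checking in isolation. The times values in the sample data and the counts H1 and H2 are all integers, and Cypher divides two integers with integer division, so a ratio such as times/H1 only keeps its fractional part if one operand is a float. A small sketch with made-up numbers (7 and 10 are illustrative, and toFloat() is assumed to be available in the Neo4j version in use):

```cypher
RETURN 7 / 10          AS integerRatio,  // 0
       toFloat(7) / 10 AS floatRatio;    // 0.7
```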

First attempt at limiting the Persons to match against

This query also performs poorly with a LIMIT of 1000 Persons (1000 * 100 = 100k relationships): it takes around 16000 ms.

MATCH (me:Person { name: "Libbie" })-[r1:ATE]->(food)
WITH me, food
MATCH (food)<-[r2:ATE]-(you)
WHERE me <> you
WITH me, you
LIMIT 10
MATCH (me)-[r1:ATE]->(food)<-[r2:ATE]-(you)
WITH me, count(DISTINCT r1) AS H1, count(DISTINCT r2) AS H2, you
MATCH (me)-[r1:ATE]->(food)<-[r2:ATE]-(you)
// toFloat() keeps the ratios fractional; integer / integer would truncate.
WITH me, you, SUM((1-ABS(toFloat(r1.times)/H1-toFloat(r2.times)/H2))*(r1.times+r2.times)/(H1+H2)) AS similarity
RETURN me.name, you.name, similarity
ORDER BY similarity DESC;
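
The idea in this attempt is the WITH ... LIMIT in the middle of the query: it caps the number of rows that reach the later, more expensive MATCH clauses. A standalone sketch of that behaviour, using UNWIND over a generated list instead of the Person data (this assumes a Cypher version that supports UNWIND):

```cypher
UNWIND range(1, 100) AS n           // 100 candidate rows
WITH n
LIMIT 10                            // only 10 rows continue past this point
RETURN count(n) AS rowsAfterLimit;  // 10
```

Note that the limit counts rows, not distinct Persons, so a Person who shares several Foods with me can take up more than one of the limited rows.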

Query suggested by Michael

MATCH (me:Person { name: "Libbie"})-[r1:ATE]->(food)<-[r2:ATE]-(you:Person)
WITH me,
     collect([r1,r2]) as rels,
     count(DISTINCT r1) AS H1,
     count(DISTINCT r2) AS H2,
     you

RETURN me, you,
     reduce(a=0, r in rels |
        // toFloat() added here so the ratios are not truncated by integer division.
        a + (1-ABS(toFloat(r[0].times)/H1-toFloat(r[1].times)/H2)) * (r[0].times+r[1].times) / (H1+H2)
     ) as similarity
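
This version matches the pattern only once: collect() keeps each pair of relationships in a list, and reduce() then folds that list into a single score. A tiny sketch of the same collect-then-reduce shape on literal values (the pairs are made up for illustration):

```cypher
WITH [[1, 2], [3, 4], [5, 6]] AS pairs
RETURN reduce(a = 0, p IN pairs | a + p[0] + p[1]) AS total;  // 21
```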