Three months ago, I was talking with my friend about the COVID-19 virus.
COVID-19 has taken the whole world under its influence in a very short time. So how is it that the planet we live on can succumb to a virus that spread from only one person in such a short amount of time?
Well, if I told you that you can reach everyone in the world through a maximum of six people by following the path of your friend’s friend, would you better understand how the virus spreads so quickly?
Everything is connected to each other, including the universe itself.
Welcome to the graph world.
Going back to the issue I mentioned above about reaching everyone through a maximum of six people — have you ever heard of the “Six Degrees of Separation” theory?
According to the theory of six degrees of separation, everybody on the planet is on average six or fewer social connections away from each other.
This means that if we follow a chain of “a friend of a friend,” we can reach anyone in a maximum of six steps.
This theory was put to the test by Stanley Milgram in his small-world experiment in 1967, where the goal was to send a letter from Kansas to Boston with the chain of “a friend of a friend.” As expected, the letter reached its destination after five people. Later, this theory was popularized by the 1993 film Six Degrees of Separation, starring Will Smith and Stockard Channing.
Today, with the advent of social media, this number has been reduced from an average of six people to 2.9 to 4.2 people in the chain. This data is backed by research done by Facebook and other big social media platforms.
Now, if you like, let’s put this story into mathematical terms.
As I’m a big fan of Will Smith, let’s continue with him. Let’s say you want to meet him and you want to find out who you can reach him through.
To do this, of course, you need to look at the accounts Will Smith follows.
Will Smith follows 200 people. If one of those 200 people is also among your followers, bingo! You can reach Will Smith through that person.
Otherwise, what you need to do this time is look at the accounts of those 200 people and look at all the accounts they follow one by one.
Let’s say that each profile you look at follows an average of 200 accounts again, so the number of accounts you need to look at in the second round will be 200 * 200 = 40,000 people.
If one of those 40,000 people is also among your followers, bingo again! You can reach Will Smith through two people!
Otherwise, you need to look at the accounts of those 40,000 people, as well as all of the accounts they follow one by one.
Again, assuming that each profile follows an average of 200 people, the number of accounts you need to look at in the third round will be 40,000 * 200 = 8 million people.
If we continue like this, the number of accounts you need to look at in the fourth round will be 8 million * 200 = 1.6 billion. Oops! You’ve already surpassed the Instagram population!
Let’s say you use Instagram so fast that it takes only one second to look at each profile. It will take you approximately 50 years to look at the 1.6 billion profiles in the fourth round.
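If you want to check the arithmetic yourself, here is a small Python sketch. It only restates the assumptions from the story (200 followed accounts per profile, one second per profile); the function names are mine and made up for illustration.

```python
# Assumptions taken from the story above, not measured data.
AVG_FOLLOWING = 200          # every profile follows 200 accounts on average
SECONDS_PER_PROFILE = 1      # you need one second to look at a profile
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def profiles_in_round(round_number: int, avg_following: int = AVG_FOLLOWING) -> int:
    """Number of accounts to look at in a given round (round 1 = Will Smith's own list)."""
    return avg_following ** round_number


def years_to_inspect(profiles: int, seconds_per_profile: float = SECONDS_PER_PROFILE) -> float:
    """Time needed to look at `profiles` accounts, expressed in years."""
    return profiles * seconds_per_profile / SECONDS_PER_YEAR


if __name__ == "__main__":
    for round_number in range(1, 5):
        count = profiles_in_round(round_number)
        print(f"round {round_number}: {count:,} profiles, "
              f"{years_to_inspect(count):.2f} years at one profile per second")
```

The count is multiplied by 200 in every round, which is why the fourth round alone already costs about half a century.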
That night, I searched for a website or app where I could test the “six degrees of separation” theory. I found a few websites:
The Oracle of Bacon
It basically links any actor to any other according to the common movies they have made together.
The Erdős number describes the “collaborative distance” between mathematician Paul Erdős and another mathematician, as measured by authorship of mathematical papers.
With the exception of those two, unfortunately I could not find any live examples related to this subject. That was the night I started coding Pathica.
Pathica is an app built upon this revolutionary theory that — for the first time ever — allows you to put this amazing theory to the test.
The logic of the application is very simple. You search for the account you want to meet (it does not have to belong to a famous person; you can search for anyone), or you choose one of the ready-made lists on the homepage. When you press the connect button, Pathica finds the people who are between you and that person and shows them to you. For example, when I search for Will Smith, Pathica shows that there are only three people between him and me. I can easily reach Will Smith by following this path of three people.
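To give a feeling for what "finding the people between you and that person" means, here is a minimal breadth-first search sketch in Python. It is not Pathica's actual code: the `find_chain` function, the account names, and the direction in which the "follows" links are walked are all assumptions invented for this toy example.

```python
from collections import deque


def find_chain(following, start, target):
    """Breadth-first search over 'follows' links.

    Returns the shortest list of accounts from start to target,
    or None if no chain exists.
    """
    if start == target:
        return [start]
    previous = {start: None}      # account -> account we reached it from
    queue = deque([start])
    while queue:
        account = queue.popleft()
        for followed in following.get(account, ()):
            if followed in previous:
                continue          # already reached by a chain that is at least as short
            previous[followed] = account
            if followed == target:
                # Walk back from the target to rebuild the chain.
                chain = [followed]
                while previous[chain[-1]] is not None:
                    chain.append(previous[chain[-1]])
                return chain[::-1]
            queue.append(followed)
    return None


# Toy data, invented for illustration only.
FOLLOWING = {
    "me": ["alice", "bob"],
    "alice": ["carol"],
    "bob": ["carol", "dave"],
    "carol": ["erin"],
    "dave": ["erin"],
    "erin": ["celebrity"],
}

if __name__ == "__main__":
    print(find_chain(FOLLOWING, "me", "celebrity"))
```

In this toy graph the chain has three people between "me" and "celebrity". Breadth-first search looks at all accounts one step away before any account two steps away, so the first chain it finds is also a shortest one.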
The world is really small, isn’t it?
Now, if we go back to mathematics again, in the story I told above, it takes 50 years for one person to analyze 1.6 billion accounts, while this process takes about 0.03 seconds in Pathica (yes, roughly 1/30 of a second). It is almost impossible to do these operations in relational or NoSQL databases in such a short time, so I decided to use Neo4j, a graph database.
There are only two basic building blocks in graph databases: one is the “node” and the other is the “relation” (Neo4j calls it a relationship). Nodes are connected to each other through relations. You can create as many node and relation types as you like.
In Pathica’s case, you can think of nodes as “people” and relations as “following.” Currently, the numbers of nodes and relations in the Pathica database are as follows:
MATCH (n:User) RETURN count(n) as total_users
total_users => 373809936
MATCH ()-[r:follows]->() RETURN count(r) as total_relations
total_relations => 2199487555
About 374 million users and around 2.2 billion relations between users. It’s kinda huge, right? When I query the shortest connection path between two people in my Neo4j database, it takes only around 0.03 seconds to analyze the connections in this huge data set. That’s why I love Neo4j so much ❤️
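If the node and relation idea still feels abstract, here is a tiny in-memory stand-in written in Python. It is only a sketch to illustrate the data model and has nothing to do with how Neo4j stores data internally; the `FollowGraph` class and its method names are hypothetical and simply mirror the two counting queries above.

```python
from dataclasses import dataclass, field


@dataclass
class FollowGraph:
    """Toy graph: users are nodes, (follower, followed) pairs are relations."""
    users: set = field(default_factory=set)
    follows: set = field(default_factory=set)

    def add_follow(self, follower: str, followed: str) -> None:
        # Adding a relation implicitly creates both nodes if they are new.
        self.users.update((follower, followed))
        self.follows.add((follower, followed))

    def total_users(self) -> int:
        # Counterpart of counting all User nodes.
        return len(self.users)

    def total_relations(self) -> int:
        # Counterpart of counting all follows relations.
        return len(self.follows)


if __name__ == "__main__":
    graph = FollowGraph()
    graph.add_follow("me", "alice")
    graph.add_follow("alice", "me")      # following back is a separate relation
    graph.add_follow("alice", "carol")
    print(graph.total_users(), graph.total_relations())
```

Relations are directed here, so "me follows alice" and "alice follows me" are counted as two different relations.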
They have great community support in their support forum and in their Discord channel. They also have a startup support program. This is what I received after I applied to it 🙂
Hello isa, We’d like to welcome you to the Neo4j Startup Program. Your application is approved and you now have access to the Enterprise Edition of the world’s leading graph database.
As I finish my article, I would like to thank the Neo4j family for making my dream come true in proving the “six degrees of separation” theory, which has been talked about, experimented with, filmed, and contested on talk show programs for more than 50 years.
In my next article, I plan to address the problems and some technical issues I encountered while dealing with such big data.
About me: I left the Department of Mathematics at Boğaziçi University in my 3rd year and found myself in the world of mobile applications. Since then, I have been actively writing code for 15 years. I was the CTO of Teknasyon, one of the biggest companies producing mobile applications in Turkey, and for the last two years I have been working as a one-man company from my home office, trying to produce something by myself.
First Proof of “Six Degrees Of Separation” was originally published in Neo4j Developer Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.