GraphGists

For people who doesn’t know, Eurovision is a song contest amongs European countries. The particularity is that countries must vote to each other to determine the winner.

giphy

But, it’s always the same impression each year : Sounds like there is countries that works in gang. Can we somehow reveal this feeling with a little help of Graph ? Let’s see :

The model is very simple

Eurovision

Here is the harvest of data that represents all the Eurovision votes in the 2012, 2013, 2014 and 2015 final session.

Comment : a Voting Country can also be a Candidate Country.

For people that notices "Autralia", it’s not a typo with "Austria", Australia did participate to Eurovision as candidate in 2015.

Who participated, which year was it and as candidate or voter ?

Let’s start simple by checking who participated at which final and who was only voting.

MATCH (country)-[v:VOTE]->()
WITH country, count(DISTINCT v.year) AS nb_voting ,collect(DISTINCT v.year) AS year_voting
MATCH ()-[p:VOTE]->(country)
WITH country, nb_voting,year_voting, count(DISTINCT p.year) AS nb_participation ,collect(DISTINCT p.year) AS year_participation
RETURN country.name,nb_voting,year_voting,nb_participation,year_participation
ORDER BY nb_participation DESC

Vote distribution visualization within a year

Quick graph to represent the vote distribution within a year (here 2015).

MATCH path = (voter:Country)-[v:VOTE]->(candidate:Country)
WHERE v.year = 2015 AND v.score >= 8
RETURN path

Best voter

Which voter always votes for the winner ?

MATCH (voter)-[vote:VOTE]->(country)
WHERE vote.score = 12 AND country.winner = "Yes"
WITH voter, COUNT(DISTINCT vote.year) AS nb_year , COLLECT(DISTINCT vote.year) AS years
WHERE nb_year = 4
RETURN voter.name,years
ORDER BY nb_year DESC

If you need to bet, UK looks a very good advisor :-D

Impact of votes ?

If we just gives the 8, 10, 12 points, does it change the rank per year ?

MATCH ()-[up:VOTE]->(country)
WHERE up.score >= 8
WITH country, up.year as year, SUM(up.score) AS up_score
MATCH ()-[all:VOTE]->(country)
WHERE all.year = year
WITH country, year, up_score , SUM(all.score) as all_score
RETURN country.name,year, all_score,up_score
ORDER BY year,all_score DESC

So, sounds like the votes under 8 points doesn’t really influence the rank. Maybe some place are switch in the top 5, but no drastic change.

Diversity in votes

As we saw previously, the most important points given are the famous 8, 10 and 12. How theses votes are distributed per country each year ?

Let’s see how many different candidates a country voted for.

MATCH (a)-[e:VOTE]->(b)
WHERE e.score >= 8
WITH a, count(DISTINCT e.year) AS years, collect(DISTINCT b.name) AS candidate, count(DISTINCT b.name) AS nb_candidate,toFloat(count(DISTINCT b.name))/ toFloat(count(DISTINCT e.year)*3) AS ratio
WHERE ratio <> 1
RETURN a.name,candidate,nb_candidate,years,ratio
ORDER BY ratio ASC

Details

MATCH (voter)-[e:VOTE]->(candidate)
WHERE e.score >= 8
WITH voter, candidate, collect(DISTINCT e.year) AS years, COUNT(DISTINCT e.year) as nb_year
RETURN voter.name,candidate.name,years
ORDER BY nb_year DESC,voter.name

The ratio shows how "diverse" the votes of a country was during the 4 sessions. Most of countries almost voted different countries each time. But a few of them doesn’t have this variety of votes, and they almost votes the same candidate.

Visualization

MATCH w WHERE w.winner = "Yes" REMOVE w:Country SET w:Winner;
MATCH (a)-[e:VOTE]->(b)
WHERE e.score >= 8
WITH a,b, count(DISTINCT e.year) AS years
WHERE years >= 2
CREATE p= (a)-[r:SIMILAR_VOTE {nb:years}]->(b)

RETURN p

12 points go to …​ Foaf:

In this query, we will try to check the mutual votes per year where the score of both votes is up to 8.

MATCH (a)-[e:VOTE]->(b),(b)-[f:VOTE]->(a)
WHERE f.year = e.year AND e.score >= 8 AND f.score >= 8
RETURN a.name,b.name,e.year,e.score,f.score
ORDER BY e.year,a.name

This result is quite detailed, we can simplify by agregating by years.

MATCH (a)-[e:VOTE]->(b), (b)-[f:VOTE]->(a)
WHERE f.year = e.year AND e.score >= 8 AND f.score >= 8
WITH a,b,collect(DISTINCT e.year) as years,count(DISTINCT e.year) as nb_years
RETURN a.name,b.name,years
ORDER BY nb_years DESC

It’s interesting to see that there is actually countries that are mutualing voting to each other, and for some, it appears 2 times on 4 years.

Graph of Friends

Let’s visualize the last result

MATCH w WHERE w.winner = "Yes" REMOVE w:Country SET w:Winner;
MATCH p = (a)-[e:VOTE]->(b), q = (b)-[f:VOTE]->(a)
WHERE f.year = e.year AND e.score >= 8 AND f.score >= 8

RETURN p

Interesting, some winner sounds like part of a cluster of countries. But as they are winner, they automatically behave like 'hub' in the graph. Let’s do the same graph without them.

MATCH p = (a)-[e:VOTE]->(b), q = (b)-[f:VOTE]->(a)
WHERE f.year = e.year AND e.score >= 8 AND f.score >= 8 AND a.winner <> "Yes" AND b.winner <> "Yes"

RETURN p

With or without the winner candidates, the result is same : it almost looks like the World Border Graph. So it sounds like some countries are part of a "geographical gang".

Conclusion

For sure, the investigation can be improved by more data and analysed more in details.

We can’t conclude surely that countries are organised in "gang", but we can observe some interesting singularities and distinguish some cluster of country based on geographical location.