By Neo4j Staff | October 2, 2013
We’re happy to announce the results of the first GraphGist challenge.
We thought we had high expectations, but the contributions still exceeded them by far.
In this sense, everyone is a winner, and we look forward to sending out a cool Neo4j t-shirt and Graph Connect ticket or a copy of the Graph Databases book to all participants. For the same reason, we strongly advise you to have a look at all the submissions.
Here are all the contributions:
- Holiday Resorts by Raju Rama Krishna
- Sports League by @funpluscharity
- Learning Graph by jotomo
- IKEA furniture Graph by @rvanbruggen
- Enterprise Content Management Graph by @PieterJanVA
- US Flights & Airports by @_nicolemargaret
- Chess Games and Positions by @wefreema
- Why JIRA should use Neo4J by @PieterJanVA
- Mystery Science Theater 3000 Actors and Characters by @virtualswede
- Breaking Bad characters are interested in some products, let’s see which are by @fforbeck
- Ditching Grandma – Graphy Accounting by @ShaunDaley1
- MotoGp Graph Gist by @ricshouse
- European Royalty by @frant_hartm
- Product Catalog by @funpluscharity
- A Simple Meta-Data Model by @perival
Third Prize
In third place, we find Chess Games and Positions by Wes Freeman. He makes it all sound very simple:
The goal is to load a bunch of chess games into Neo4j for further analysis. Scores listed are Stockfish’s take on a position after a 25 move horizon (but this number can be deepened as the graph is filled out or as more processing is done). Positions can also be loaded as alternative moves (not connected to a game) based on suggestions from Stockfish. The positions are recorded as FEN, a human-readable/compressed chess board state notation.
The data model is not overly complex at all. Here’s a bit of example data:
We thought GraphGists were already quite interactive, but Wes shows how to get even more interactivity into one. After simply listing the moves of a game, he goes on to show off some cool statistics, which reveal the blunders in a game and even suggest better moves.
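To give a feel for how such blunder statistics might be derived from engine scores, here is a minimal Python sketch. It is not Wes's implementation: the moves, the scores, the `find_blunders` helper, and the 150-centipawn threshold are all assumptions made for illustration, with scores taken from White's point of view after each move.

```python
# Hypothetical data: (move, engine score in centipawns from White's point
# of view AFTER the move was played). The numbers are made up.
game = [
    ("e4", 30),
    ("e5", 25),
    ("Qh5", -40),
    ("Nc6", -35),
    ("Bc4", -30),
    ("Nf6", 900),  # Black overlooks the mate threat on f7
]


def find_blunders(moves, start_score=0, threshold=150):
    """Return (ply, move, loss) for every move that cost the player who
    made it more than `threshold` centipawns."""
    blunders = []
    previous = start_score
    for ply, (move, score) in enumerate(moves, start=1):
        white_moved = ply % 2 == 1
        swing = score - previous
        # A falling score hurts White; a rising score hurts Black.
        loss = -swing if white_moved else swing
        if loss > threshold:
            blunders.append((ply, move, loss))
        previous = score
    return blunders


print(find_blunders(game))
```

In the actual GraphGist the scores live on position nodes in the graph. The flat list here only stands in for walking a game's moves in order.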
Second Prize
Learning Graph by Johannes Mockenhaupt comes in at second place. Here’s his own introduction to it:
This graph is used to visualize the knowledge a person has in a certain area. … The purpose is to document acquired knowledge and to help to further educate oneself in a structured way. This is accomplished by graphing dependencies between technologies as well as resources that can be used to learn a technology and to determine possible learning paths through the graph, which show a way to learn a specific technology, by first learning the technologies, in order, which are prerequisites for the technology to be learned. The graph is meant not to be static, but updated as new connections between technologies are discovered and new knowledge is acquired.
This is how the data model plays out with a tiny set of data:
The data model is easy to grasp, and at the same time, it shows the power of graphs in a prominent way. The queries are surprisingly simple. If you have ever tried to do something similar using an RDBMS, you’ll appreciate the straightforwardness and elegance of the queries presented! It’s also nice to see how the data gets updated along the way. Finally, the explanations of the queries and their results bind everything together to form a pleasant read.
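The idea of a learning path, where prerequisites are learned first and in order, can be sketched outside the database as a depth-first walk over a prerequisite graph. The following Python snippet is only an illustration under assumed data: the technology names, the `prerequisites` mapping, and the `learning_path` helper are hypothetical and not taken from Johannes's GraphGist, which expresses this with Cypher queries.

```python
# Hypothetical prerequisite graph: technology -> technologies to learn first.
prerequisites = {
    "Graph basics": [],
    "Java": [],
    "Cypher": ["Graph basics"],
    "Neo4j": ["Graph basics", "Cypher"],
    "Spring Data Neo4j": ["Java", "Neo4j"],
}


def learning_path(target, prerequisites, known=frozenset()):
    """Return the technologies to learn, prerequisites first, ending with
    `target`. Technologies in `known` are skipped. Assumes the graph has
    no cycles."""
    path = []
    visited = set(known)

    def visit(tech):
        if tech in visited:
            return
        visited.add(tech)
        for dependency in prerequisites.get(tech, []):
            visit(dependency)
        # Appended only after all of its prerequisites.
        path.append(tech)

    visit(target)
    return path


print(learning_path("Spring Data Neo4j", prerequisites, known={"Java"}))
```

Adding a newly acquired skill to `known` shortens the remaining path, which mirrors how the graph is meant to be updated as knowledge is acquired.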
First Prize
The US Flights & Airports contribution from Nicole White finished first in this challenge. Congrats Nicole!
Here’s the background:
For any airline carrier, efficiency is key: delayed or cancelled flights and long taxi times often lead to unhappy customers. Flight planning is one of the most complex optimization and scheduling problems out there, requiring a deep analysis of flight and airport data.
The simple proposed data model allows complex questions to be answered, which is one of the strengths of a graph database. The interesting details were not just in modeling the flights but also the cancellations and delays.
Nicole posed interesting questions about the data model and dataset, which she then answered using Cypher queries:
- What is the average taxi time at each airport for both departures and arrivals?
- What is the leading cause of departure delays at each airport?
- How many outbound flights were cancelled at each airport?
Or more specific questions such as:
- Which flights from Los Angeles (LAX) to Chicago (ORD) were delayed for more than 10 minutes due to late arrivals?
- How does seasonality affect departure taxi times at Chicago’s O’Hare International Airport (ORD)?
- What is the standard deviation of arrival taxi times at Dallas/Fort Worth (DFW)?
To show just one example:
Which flights from Los Angeles (LAX) to Chicago (ORD) were delayed for more than 10 minutes due to late arrivals?
This query results in:
| Flight | Delay Time Due to Late Arrival |
With her scientific approach, listing the variables included and using MathJax to render the mathematical formulas, this submission is really impressive and a worthy winner.
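As a rough illustration of the aggregate questions listed above (average taxi times per airport and the standard deviation of arrival taxi times), here is a small Python sketch. The flight records, field layout, and `taxi_stats` helper are hypothetical. Nicole's submission answers these questions with Cypher against her own dataset, and the choice of the population standard deviation here is an assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records: (origin, destination, taxi-out min, taxi-in min).
flights = [
    ("LAX", "ORD", 18, 9),
    ("LAX", "DFW", 22, 11),
    ("ORD", "DFW", 25, 7),
    ("DFW", "ORD", 15, 12),
]


def taxi_stats(flights):
    """Per airport: average departure taxi time, average arrival taxi
    time, and population standard deviation of arrival taxi times."""
    taxi_out = defaultdict(list)  # keyed by origin airport
    taxi_in = defaultdict(list)   # keyed by destination airport
    for origin, destination, out_minutes, in_minutes in flights:
        taxi_out[origin].append(out_minutes)
        taxi_in[destination].append(in_minutes)

    stats = {}
    for airport in sorted(set(taxi_out) | set(taxi_in)):
        departures = taxi_out.get(airport, [])
        arrivals = taxi_in.get(airport, [])
        stats[airport] = {
            "avg_taxi_out": mean(departures) if departures else None,
            "avg_taxi_in": mean(arrivals) if arrivals else None,
            "std_taxi_in": pstdev(arrivals) if arrivals else None,
        }
    return stats


for airport, values in taxi_stats(flights).items():
    print(airport, values)
```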
Our congratulations go to every participant and the winners. We are really thrilled about the results of this competition.
GraphGists evolving & The next GraphGist Challenge
During the challenge we improved the code behind GraphGists:
- We added support for Math formulas.
- We added Disqus integration, so there are now comments connected to each GraphGist. Please add your comments to the challenge contributions; the authors will be happy to receive feedback and suggestions.
- We removed the annoying headings above result tables and graphs.
- We fixed some issues and added a workaround so Chrome under Windows doesn’t crash.
- We improved the styling a bit. (It’s still very primitive though.)
We already got questions about the next GraphGist challenge. Our plan is to run the next challenge around the time Neo4j 2.0 gets released. Currently we think that will mean a closing date before Christmas. We’ll keep you posted when we know more.
Greetings from the Neo4j GraphGist Challenge gang!
Anders Nawroth, Peter Neubauer, Michael Hunger, Pernilla Lindh, Mark Needham, Kenny Bastani