In politics, people are often advised to “follow the money” to understand the forces influencing decisions. As engineers, we know we can do that and more by following the data.

Inspired by some innovative work by Dave Fauth, a Washington, DC data analyst, we arranged a workshop around FEC campaign finance data that had been imported into Neo4j.

FEC Campaign Finance Data

Every Sunday, the FEC updates its campaign finance data sets for the current two-year election period plus the five most recent two-year election periods. The data sets include:
  • all individuals registered as candidates for President, House, or Senate
  • all registered committees engaged in political fundraising
  • all individual contributions greater than $200
In addition, there are files covering transactions between committees, and a few more that associate records with one another (ooh look, relationships!).
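
To make the coverage concrete, here is a minimal sketch of which election periods a weekly update would span. It assumes each two-year period is labelled by the even year in which it ends, and `covered_cycles` is a hypothetical helper name, not part of the importers.

```python
def covered_cycles(current_cycle_year, previous_cycles=5):
    """List the election periods in one update, newest first.

    Assumption: each two-year period is labelled by its even end year.
    """
    if current_cycle_year % 2 != 0:
        raise ValueError("expected the even year that ends the two-year period")
    # The current period plus the requested number of earlier periods.
    return [current_cycle_year - 2 * i for i in range(previous_cycles + 1)]


print(covered_cycles(2012))  # [2012, 2010, 2008, 2006, 2004, 2002]
```

So an update during the 2012 period would reach back six periods in total.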

After exploring some evolutionary import strategies (starting with the most direct, then iterating), we settled on an approach which structured the data to look like this:
[Figure: Campaign Finance Data in a Graph]

Query Challenge

With the data imported, and a basic understanding of the domain model, we then challenged people to write Cypher queries to answer the following questions:
  1. All presidential candidates for 2012
  2. Most mythical presidential candidate
  3. Top 10 Presidential candidates according to number of campaign committees
  4. Find President Barack Obama
  5. Lookup Obama by his candidate ID
  6. Find Presidential Candidate Mitt Romney
  7. Lookup Romney by his candidate ID
  8. Find the shortest path of funding between Obama and Romney
  9. List the 10 top individual contributions to Obama
  10. List the 10 top individual contributions to Romney
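
If the "shortest path" in question 8 is unfamiliar, here is a rough sketch of the idea outside of Cypher: a breadth-first search over a tiny, entirely made-up graph. The node names and connections are hypothetical placeholders, not FEC data and not the workshop's data model, and `shortest_path` is an illustrative helper rather than anything Neo4j provides.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search over an adjacency mapping.

    Returns the nodes from start to goal, or None if goal is unreachable.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # node -> the node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue  # already reached by a path at least as short
            previous[neighbour] = node
            if neighbour == goal:
                # Walk back through the predecessors to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Made-up toy data: two candidates linked through committees and one donor.
toy = {
    "candidate_a": ["committee_1"],
    "committee_1": ["candidate_a", "donor_x", "committee_2"],
    "donor_x": ["committee_1", "committee_3"],
    "committee_2": ["committee_1", "committee_3"],
    "committee_3": ["donor_x", "committee_2", "candidate_b"],
    "candidate_b": ["committee_3"],
}

print(shortest_path(toy, "candidate_a", "candidate_b"))
```

In the toy graph two routes tie for shortest; the search returns whichever it reaches first. Your Cypher answer should let the database do this work for you.
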
Care to give the challenge a try? Then follow the steps on the GitHub project site to clone the importers. You’ll want to run the RELATED importer like so:

./bin/fec2graph --force --importer=RELATED

Then just start up Neo4j and open a browser to http://localhost:7474 to query away. If you’re new to Cypher, read through the Neo4j Manual section on Cypher to learn the basics of querying a graph.

Submit your queries to me at andreas@neotechnology.com by next Thursday and we’ll pick a winner from the correct entries. The prize? A free pass to GraphConnect, of course! Coming this November 5 & 6 in San Francisco, GraphConnect is a fantastic conference devoted to graph databases.

Want a hint? 

Alrighty. Let’s take a look at #2. After successfully listing all candidates for the first query, you could page through the listing to look for names that seem... just off. Use skip and limit in the return clause to page through the long listing, 100 names at a time:

start candidate=node:candidates('CAND_ID:*') 
where candidate.CAND_OFFICE='{fill this in}' AND candidate.CAND_ELECTION_YR='{this too}'
return candidate.CAND_NAME skip 100 limit 100;
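
If you'd rather not edit the skip value by hand for every page, the arithmetic is easy to script. The sketch below only builds the query text for a given page; `paged_candidate_query` is a hypothetical helper, it does not connect to Neo4j, and the office and year values are still yours to fill in.

```python
def paged_candidate_query(office, year, page, page_size=100):
    """Build the START-style query text for one page of candidate names.

    Hypothetical helper: it only formats text and never contacts Neo4j.
    Values are interpolated as-is, so only use it with trusted input.
    """
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    skip = page * page_size  # rows to pass over before this page begins
    return (
        "start candidate=node:candidates('CAND_ID:*')\n"
        f"where candidate.CAND_OFFICE='{office}' "
        f"AND candidate.CAND_ELECTION_YR='{year}'\n"
        f"return candidate.CAND_NAME skip {skip} limit {page_size};"
    )


# Page 1 (the second page) matches the skip 100 limit 100 query above.
print(paged_candidate_query("{fill this in}", "{this too}", page=1))
```

Page 0 starts at the top of the listing; each later page skips another 100 names.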

Once you spot one of the many candidate names that isn’t real, you can query for it directly:
start candidate=node:candidates('CAND_NAME:"CLAUS, SANTA"')
return candidate;

Cypher Masters

From our recent workshop, the winners are:
  • Matt Tyndal
  • Lou Kosak
  • Pengchao Wang
Congratulations, and thanks to everyone who joined us for the event. With the announcement of next week’s winner we will include solutions to the challenge. Good luck!

Always,
Andreas  

2 Comments

Seems that this would be useful to someone trying to learn Cypher (like me). Can I find the winning Cypher queries anywhere please?

Are the winning queries posted anywhere please?
