Neo4j has long been distributed with a dataset of movies – showing how movies, actors, actresses, directors and more relate to each other.

We’ve also recently added Northwind organizational data for those developers who are more business minded. Nonetheless, these datasets don’t capture the interest of all developers who work with a variety of types of data.

We wanted a dataset that everyone could feel a personal attachment to, so we decided to enable you to analyze your personal Twitter data in Neo4j!

In order to let the masses explore their Twitter data in an isolated environment, we decided to take advantage of the new Neo4j Docker image.

We set up a Neo4j Docker container for each user, running on Amazon’s Elastic Container Service (ECS).

Architecture


The Graph Your Network App Architecture using the Neo4j Docker Image


When a new user visits network.graphdemos.com, they are directed from one of the neo4j-twitter-head instances to login with their Twitter account.

This completes an OAuth 1.0a dance, enabling the Graph Your Network application to access the user’s Twitter data on their behalf. While most of the data being accessed is already public, acting on behalf of the user gives additional Twitter API quota and an ability to authenticate the user.

We then spin up a new instance, which runs the neo4j-twitter docker image, using the official Neo4j Docker image as the base.

This instance starts up Neo4j and then runs a Python script to import the user’s Twitter data into Neo4j. The credentials needed to do the import are passed into the Docker container using environment variables.

After Neo4j is started, the credentials are reset and the URL, username and password are provided to the user on a webpage. We also run some canned queries that are executed by the neo4j-twitter-head instances using py2neo calling your personal Neo4j instance.

Resource Allotment


Each instance is allocated 1/4th of a CPU core and 768MB of memory. While this is only a small amount, it is adequate for the Twitter graphs of most users.

Since the number of EC2 servers needed to host these containers can depend upon the load, we have a cron job running regularly on the head instances which increases the auto-scaling group size appropriately or terminates instances.

Imported Data & Data Model


Learn How We Built the Graph Your Network App Using the Official Neo4j Docker Image


We import the following data:
    • Your followers
    • People you follow
    • Your tweets
    • Your mentions
    • Recent tweets using your top 5 hashtags
    • Recent tweets with #GraphConnect
    • Recent tweets mentioning Neo4j
There are three separate threads running to call the Twitter API. When the threads hit Twitter API quotas, they sleep for 15 minutes. Your new tweets are imported every 30 minutes, and other data is updated every 90 minutes.

Example Queries


We provide a set of example queries on the web app and in the tutorial built in the Neo4j browser, including:
    1. Who’s mentioning you on Twitter?
    2. Who are your most influential followers?
    3. What tags do you use frequently?
    4. How many people you follow also follow you back?
    5. Who are the people tweeting about you, but who you don’t follow?
    6. What are the links from interesting retweets?
    7. Who are other people tweeting with some of your top hashtags?

Browser Guide


Some folks have wondered how we accomplished the built-in browser guide (shown below), invoked by :play twitter. This custom guide was added to a build of the Neo4j browser, which we then replaced in the neo4j-twitter docker image.

cd $NEO4J/community/browser
vim app/content/guides/twitter.jade
mvn package
cp target/neo4j-browser-2.x.x-SNAPSHOT.jar $DOCKER_REPO/neo4j-browser-2.x.x.jar

vim $DOCKER_REPO/Dockerfile
  add: ADD neo4j-browser-2.x.x.jar /var/lib/neo4j/system/lib/neo4j-browser-2.x.x.jar

The Graph Your Network App Browser Guide using the Neo4j Docker Image


What Are Your Favorite Queries?


Let us know if you discover some great queries! Share them with me on Twitter, on Slack or on the issue tracker.

Start Exploring Now!


Visit http://network.graphdemos.com/


Want to build projects like the Graph Your Network app? Click below to get your free copy of the Learning Neo4j ebook and learn to master the world’s leading graph database.

Learn Neo4j Today

 

Keywords:  


About the Author

Ryan Boyd, Developer Relations Neo4j

Ryan Boyd Image

Ryan Boyd is a speaker, author, developer and developer relations guru with a background in coding, leadership and management.


Leave a Reply

Your email address will not be published. Required fields are marked *

Subscribe

Upcoming Event

 


From the CEO

Emil's Blog


Have a Graph Question?

Stackoverflow
Slack
Contact Us

Share your Graph Story?

Email us: content@neotechnology.com


Popular Graph Topics