In case you didn’t notice already, graph databases like Neo4j are hot nowadays. People ask questions and write about them, including in the contexts of NoSQL and RDF. Recently Twitter open sourced their graphdb implementation, targeted at shallow, distributed graphs. And then Facebook revealed their new Graph API using the Open Graph Protocol. Today, we’re going to show you how easy it is to use the Facebook Graph API to mash up data from Facebook with data in a locally hosted graph database!
It’s movie time!
Let’s say you want to see a movie with one of your friends. Wouldn’t it be neat to have a service that uses the Facebook social graph to collect movies your friend liked, and combines this with IMDB data to produce a movie suggestion? It turns out that an app like that is pretty straightforward to build with a graph database. The first step is to connect to Facebook and fetch a list of your friends, so that’s where the app starts out.
Under the Hood
What we need to do is simply let our mashup talk to both the Facebook Graph API and the IMDB API. Uh-oh: IMDB doesn’t have a public API that you can throw requests at. Well, that’s simple enough: we’ll just import the data into a local Neo4j graph database and then access it through a Graph API of our own! So, let’s see how to solve this. Here’s the basic structure of our app:
MovieNight.js is the mashup itself, embedded in the web page. It uses the Facebook Graph API to get information about the visitor’s friends and the movies those friends like. SuggestionEngine.js uses the Graph API to talk to a Neo4j database containing movie information (a small example data set from IMDB). The movie suggestion is based on what movies your friend has liked in the past: it simply tries to find other movies starring some actor from the liked ones.
Using the same Graph API to connect to both Facebook and the Neo4j graph database backend makes for convenience: it means that you can use tools written for Facebook for locally hosted data as well – and that’s what we’re doing here. To download the source, go to the download page.
Facebook data
To get your friends from Facebook, just use the common Facebook Graph API:

    FB.api('/me/friends', function(response) {
      friends = response.data;
      // Load friends into UI
      friend_list.empty();
      for ( var i = 0; i < friends.length; i ++ ) {
        add_friend( friends[i] ); // write to UI
      }
    });

Getting the movies a friend likes is very similar to getting the friends list:

    FB.api("/" + friend.id + "/movies", function(result) {
      /* handle the response here */
    });

For more information, see the Graph API documentation.
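As a rough illustration of what "handle the response here" might involve, here is a minimal sketch. It assumes that list responses arrive as an object with a data array whose items carry a name field, and that failed calls carry an error property; the helper name likedMovieNames is made up for this example and is not part of the mashup.

```javascript
// Hypothetical helper (not from MovieNight.js): pull movie names out of a
// Graph API list response. Assumes the shape { data: [ { id, name }, ... ] }
// and that failed calls come back with an `error` property instead.
function likedMovieNames(result) {
  if (!result || result.error || !Array.isArray(result.data)) {
    return []; // treat errors and unexpected shapes as "no movies"
  }
  return result.data
    .map(function (movie) { return movie.name; })
    .filter(function (name) { return typeof name === "string" && name.length > 0; });
}

// Trying it against canned responses, no network access needed:
console.log(likedMovieNames({ data: [{ id: "1", name: "Wild Things" }, { id: "2", name: "Footloose" }] }));
console.log(likedMovieNames({ error: { message: "Invalid token" } }));
```

Returning an empty list on errors keeps the UI code simple, at the cost of hiding why nothing showed up; a real app would probably want to surface the error as well.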
Neo4j data
To connect to the Neo4j graph server we had to hack the connect-js library slightly, as it’s hard-coded to send requests to facebook.com. What we added is the ability to register prefixes for different data sources. It still defaults to graph.facebook.com etc., but also makes an “fb:” prefix available to make your code easier to read. To hook in a data source, we modify the FB.init() call like this:

    FB.init({
      appId  : '', // NOTE: create an appid and add it here
      status : true,
      cookie : true,
      xfbml  : true,
      // time to add our IMDB backend to the mix
      external_domains : { imdb : 'https://localhost:4567/' }
    });

Now we’re able to send requests to our own server as well, using code similar to the following:

    FB.api("imdb:/path/to/data/in/graph", function(data) {
      // data is available here :)
    });

So now that we can send requests, what can we do with the Neo4j backend? Here’s a list of the available requests (all of them are GET requests to https://localhost:4567):
| Description | Request | Response |
|---|---|---|
| Get Actor (or Movie) by Id | /56 | { "name": "Bacon, Kevin", "id": 56 } |
| Extended information about Actor (/Movie) | /56?metadata=1 | { "name": "Bacon, Kevin", "id": 56, "metadata": { "connections": "https://localhost:4567/56/acted_in" }, "type": "actor" } |
| All the Movies an Actor had a Role in | /56/acted_in | { "data": [ { "id": 57, "title": "Woodsman, The (2004)" }, { "id": 59, "title": "Wild Things (1998)" } // tons of movies here ... ] } |
| Get (Actor or) Movie by Id | /59 | { "title": "Wild Things (1998)", "year": "1998", "id": 59 } |
| Extended information about (Actor/) Movie | /59?metadata=1 | { "title": "Wild Things (1998)", "year": "1998", "id": 59, "metadata": { "connections": "https://localhost:4567/59/actors" }, "type": "movie" } |
| All the Actors that have a Role in this Movie | /59/actors | { "data": [ { "id": 56, "name": "Bacon, Kevin" }, { "id": 528, "name": "Dillon, Matt (I)" } // loads of actors here ... ] } |
| Search for Actors with “bacon” in their name | /search?q=bacon&type=actor | [ { "name": "Bacon, Kevin", "id": 56 }, { "name": "Bacon, Travis", "id": 14242 } // more bacons here ... ] |
| Search for Movies with “wild things” in their title | /search?q=wild%20things&type=movie | [ { "title": "Wild Things (1998)", "year": "1998", "id": 59 }, { "title": "River Wild, The (1994)", "year": "1994", "id": 74 } // more wild movies here ... ] |
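To make the prefix idea from the FB.init() call a bit more concrete, here is a small sketch of how a prefixed path could be turned into a full URL. This is not the actual patch to connect-js; the function name resolveGraphUrl and the DEFAULT_DOMAINS table are assumptions made up for this illustration.

```javascript
// Hypothetical sketch of prefix resolution; NOT the real connect-js change.
// Unprefixed paths fall back to the "fb" data source.
var DEFAULT_DOMAINS = { fb: "https://graph.facebook.com/" };

function resolveGraphUrl(path, externalDomains) {
  var domains = {};
  var key;
  for (key in DEFAULT_DOMAINS) { domains[key] = DEFAULT_DOMAINS[key]; }
  for (key in externalDomains || {}) { domains[key] = externalDomains[key]; }

  // "imdb:/56/acted_in" splits into prefix "imdb" and rest "/56/acted_in".
  var match = /^([a-z]+):(\/.*)$/.exec(path);
  var prefix = match ? match[1] : "fb";
  var rest = match ? match[2] : path;

  if (!Object.prototype.hasOwnProperty.call(domains, prefix)) {
    throw new Error("Unknown data source prefix: " + prefix);
  }
  // Join base and path with exactly one slash between them.
  return domains[prefix].replace(/\/+$/, "") + "/" + rest.replace(/^\/+/, "");
}

var external = { imdb: "https://localhost:4567/" };
console.log(resolveGraphUrl("imdb:/56/acted_in", external));
console.log(resolveGraphUrl("/me/friends", external));
```

Throwing on an unknown prefix is a design choice for the sketch: a typo in a prefix fails loudly instead of quietly sending the request to Facebook.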
Searching for a movie by name goes like this:

    self.movie_info = function( movie_name, callback ) {
      // The search API uses commas for AND-type searches, spaces become OR, so for
      // the movie names, we switch spaces out for commas.
      movie_name = movie_name.replace(/ /g, ",");
      FB.api("imdb:/search", {type:'movie', q:movie_name }, callback );
    };

The request to get the movies an actor has acted in goes like this:

    FB.api("imdb:/" + actor.id + "/acted_in", function( result ) {
      for (var i = 0; i < result.data.length; i++) {
        var movie = result.data[i];
        // do something with the movie here!
      }
    });

To get all actors in a movie, simply use the following request:

    FB.api("imdb:/" + movie.id + "/actors", function(result) {
      for (var i = 0; i < result.data.length; i++) {
        var actor = result.data[i];
        // do something with the actor here!
      }
    });

Actually, these three different requests are all our small suggestion engine needs to fulfill its task. Have a look at SuggestionEngine.js to see the full code.
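To show how the three requests could fit together, here is a simplified sketch of the suggestion flow. It is not the code from SuggestionEngine.js: the function suggestMovies, the fakeApi stand-in for FB.api and the CANNED responses are all made up for this example, and it naively trusts the first search hit.

```javascript
// Hypothetical sketch, not the real SuggestionEngine.js. `api` stands in for
// FB.api(path, params, callback) so the flow can run against canned data.
function suggestMovies(api, likedTitle, done) {
  // 1. Search for the liked movie (spaces become commas for an AND search).
  api("imdb:/search", { type: "movie", q: likedTitle.replace(/ /g, ",") }, function (hits) {
    if (!hits || hits.length === 0) { return done([]); }
    var liked = hits[0]; // naive assumption: the first hit is the right movie

    // 2. Fetch the actors of the liked movie.
    api("imdb:/" + liked.id + "/actors", {}, function (cast) {
      var actors = cast.data;
      var titles = {}; // used as a set of suggested titles
      var pending = actors.length;
      if (pending === 0) { return done([]); }

      actors.forEach(function (actor) {
        // 3. Collect every other movie each actor has appeared in.
        api("imdb:/" + actor.id + "/acted_in", {}, function (films) {
          films.data.forEach(function (movie) {
            if (movie.id !== liked.id) { titles[movie.title] = true; }
          });
          pending -= 1;
          if (pending === 0) { done(Object.keys(titles).sort()); }
        });
      });
    });
  });
}

// Canned responses shaped like the ones in the request table above.
var CANNED = {
  "imdb:/search": [{ id: 59, title: "Wild Things (1998)", year: "1998" }],
  "imdb:/59/actors": { data: [{ id: 56, name: "Bacon, Kevin" }, { id: 528, name: "Dillon, Matt (I)" }] },
  "imdb:/56/acted_in": { data: [{ id: 57, title: "Woodsman, The (2004)" }, { id: 59, title: "Wild Things (1998)" }] },
  "imdb:/528/acted_in": { data: [{ id: 59, title: "Wild Things (1998)" }] }
};
function fakeApi(path, params, callback) { callback(CANNED[path]); }

suggestMovies(fakeApi, "wild things", function (titles) { console.log(titles); });
```

The pending counter is what lets the per-actor requests run in parallel while still calling done exactly once, which matters as soon as fakeApi is swapped for a real asynchronous call.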
How to create a Graph API service on top of Neo4j
Let’s take a closer look at the movie backend now. It’s built using the Neo4j Ruby bindings. In our example data set we have Actors and Movies connected through Roles. Here’s how these look in Ruby code:

    class Movie; end

    class Role
      include Neo4j::RelationshipMixin
      property :title, :character
    end

    class Actor
      include Neo4j::NodeMixin
      property :name
      has_n(:acted_in).to(Movie).relationship(Role)
      index :name, :tokenized => true
    end

    class Movie
      include Neo4j::NodeMixin
      property :title
      property :year
      index :title, :tokenized => true

      # defines a method for traversing incoming acted_in relationships from Actor
      has_n(:actors).from(Actor, :acted_in)
    end

The code above is from the backend/model.rb file. On the Neo4j level, this gives us Actor nodes connected to Movie nodes through acted_in relationships, with the Role properties stored on each relationship.
The index declarations in the model let us use the find method on the classes to perform searches.
Our next step is to expose this model over the Graph API, where we’ll use Sinatra and WEBrick to do the heavy lifting. The application is defined in the backend/neo4j_app.rb file, and we’ll dive into portions of that code right here. To begin with, how do we return data for an Actor or Movie by Id?

    get '/:id' do # show a node
      content_type 'text/javascript'
      node = node_by_id(params[:id])
      props = external_props_for(node)
      props.merge! metadata_for(node) if params[:metadata] == "1"
      json = JSON.pretty_generate(props)
      json = callback_wrapper(json, params[:callback])
      json
    end

The Sinatra route above uses a few small utility functions, so let’s look into them as well. The first one is very simple, but useful if we want to extend the URIs to allow requests such as /{moviename}/actors and not only numeric IDs.

    def node_by_id(id)
      # only numeric ids are looked up; anything else ends in a 404
      node = Neo4j.load_node(id) if id =~ /^(\d+)$/
      halt 404 if node.nil?
      node
    end

The next function returns the properties of a node, while filtering out those that have a name starting with a “_” character. It also adds the node id to the result.

    def external_props_for(node)
      ext_props = node.props.delete_if{|key, value| key =~ /^_/}
      ext_props[:id] = node.neo_id
      ext_props
    end

Then there’s a function that gathers metadata for a node, including a link to the list of connections to other nodes, and the type of the node.

    def metadata_for(node)
      if node.kind_of? Actor
        connections = url_for(node, "acted_in")
      elsif node.kind_of? Movie
        connections = url_for(node, "actors")
      end
      metadata = {
        :metadata => { :connections => connections },
        :type => node.class.name.downcase
      }
    end

There are a couple more utility functions, but we’ll skip them here as they are unrelated to Neo4j. Next up is getting the relationships from an Actor or Movie. The code only cares about valid paths, that is, paths ending in /acted_in or /actors. In other cases, an empty data set is returned. Other than that, it simply delegates the work to the domain classes by calling node.send(relationship) to get the relationships. Using the send method in Ruby is here equivalent to calling node.acted_in or node.actors.

    get '/:id/:relation' do # show a relationship
      content_type 'text/javascript'
      node = node_by_id(params[:id])
      data = []
      [ :acted_in, :actors ].each do |relationship|
        if params[:relation] == relationship.to_s and node.respond_to? relationship
          data = node.send(relationship)
        end
      end
      data = data.map{|node| node_data(node)}
      json = JSON.pretty_generate({:data => data})
      json = callback_wrapper(json, params[:callback])
      json
    end

When viewing the relationships, we only want to show the most basic node info, so there’s a utility function to do that as well:

    def node_data(node)
      data = { :id => node.neo_id }
      [ :name, :title ].each do |property|
        data.merge!({ property => node[property] }) unless node[property].nil?
      end
      data
    end

Searching is basically handled by adding indexes to the model (see the code further above). So what’s left to do in the application is some sanity checks, delegating the search to the model, and finally formatting the output properly. Here goes:

    get '/search' do
      content_type 'text/javascript'
      q = params[:q]
      type = params[:type]
      halt 400 unless q && type
      result = case type
        when 'actor'
          Actor.find(to_lucene(:name, q))
        when 'movie'
          Movie.find(to_lucene(:title, q))
        else
          []
      end
      json = JSON.pretty_generate(result.map{|node| external_props_for(node)})
      json = callback_wrapper(json, params[:callback])
      json
    end
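All three routes pass their JSON through callback_wrapper, which the post doesn’t show. Judging by the text/javascript content type and the callback parameter, it most likely implements JSONP, but that is a guess. The sketch below shows what such a wrapper might do, written in JavaScript to match the client examples rather than the Ruby backend; the name callbackWrapper and the validation rule are assumptions for illustration only.

```javascript
// Hypothetical stand-in for the backend's callback_wrapper helper, which is
// not shown in the post. Assumes JSONP: with a callback name, the JSON is
// wrapped in a function call; without one, it is returned untouched.
function callbackWrapper(json, callback) {
  if (!callback) { return json; }
  // Accept only plain or dotted identifiers, so that arbitrary script
  // cannot be injected through the callback parameter.
  if (!/^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/.test(callback)) {
    throw new Error("Invalid callback name: " + callback);
  }
  return callback + "(" + json + ");";
}

console.log(callbackWrapper('{ "name": "Bacon, Kevin", "id": 56 }', "handleActor"));
console.log(callbackWrapper('{ "name": "Bacon, Kevin", "id": 56 }', undefined));
```

Wrapping the payload this way is what would let a page served from another origin load the data through a script tag, which fits the setup here where the web page and the Neo4j backend live on different hosts.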
Wrap up
Here are some major takeaways from this post:
- Graphs are going mainstream, as evidenced by initiatives like the Facebook Graph API.
- It’s often convenient to look at your data in the form of a graph, and with recent support in graph databases like Neo4j, it’s easy to use different data sources in tandem through the Graph API.
- Exposing data through the Graph API is simple if you have a graphdb backend.