
In case you haven’t noticed, graph databases like Neo4j are hot right now. People are asking questions and writing about them, including in the context of NoSQL and RDF. Recently Twitter open-sourced their graphdb implementation, targeted at shallow, distributed graphs. And then Facebook revealed their new Graph API, which uses the Open Graph Protocol.
Today, we’re going to show you how easy it is to use the Facebook Graph API to mash up data from Facebook with data in a locally hosted graph database!
It’s movie time!
Let’s say you want to see a movie with one of your friends. Wouldn’t it be neat to have a service that uses the Facebook social graph to collect movies your friend liked, and combines this with IMDB data to produce a movie suggestion? It turns out that an app like that is pretty straightforward to build with a graph database.
The first step is to connect to Facebook to fetch a list of your friends, so that’s where the app will start out:
Next a list of your friends will show up:
Now, just click one of your friends and a movie suggestion will be generated:
Under the Hood
What we need to do is simply let our mashup talk to both the Facebook Graph API and the IMDB API. Uh-oh – IMDB doesn’t have a public API that you can throw requests at. Well, that’s simple enough: we’ll just import the data into a local Neo4j graph database and then access it through a Graph API-style interface, the same way we access Facebook!
So, let’s see how to solve this. Here’s the basic structure of our app:
MovieNight.js is the mashup itself, embedded in the web page. It uses the Facebook Graph API to get information about the visitor’s friends and the movies that those friends like.
SuggestionEngine.js uses the Graph API to talk to a Neo4j database containing movie information (a small example data set from IMDB). The movie suggestion is based on what movies your friend has liked in the past: it simply tries to find other movies starring an actor from one of the liked ones.
Using the same Graph API to connect to both Facebook and the Neo4j graph database backend makes for convenience: it means that you can use tools written for Facebook for locally hosted data as well – and that’s what we’re doing here. To download the source, go to the download page.
Facebook data
To get your friends from Facebook, just use the common Facebook graph API:
FB.api('/me/friends', function(response) {
  friends = response.data;
  // Load friends into UI
  friend_list.empty();
  for ( var i = 0; i < friends.length; i ++ ) {
    add_friend( friends[i] ); // write to UI
  }
});
Getting the movies a friend likes is very similar to getting the friends list:
FB.api("/" + friend.id + "/movies", function(result) {
  /* handle the response here */
});
For more information, see the Graph API documentation.
Neo4j data
To connect to the Neo4j graph server we had to hack the connect-js library slightly, as it’s hard-coded to send requests to facebook.com. What we added is the ability to register prefixes for different data sources. It still defaults to graph.facebook.com etc., but makes a “fb:” prefix available to make your code easier to read. To hook in a data source, we modify the FB.init() call like this:
FB.init({
  appId  : '', // NOTE: create an appid and add it here
  status : true,
  cookie : true,
  xfbml  : true,
  // time to add our IMDB backend to the mix
  external_domains : {
    imdb : 'https://localhost:4567/'
  }
});
Now we’re able to send requests to our own server as well, using code similar to the following:
FB.api("imdb:/path/to/data/in/graph", function(data) {
  // data is available here :)
});
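To make the prefix idea more concrete, here is a minimal sketch of how a prefixed path could be turned into a full URL. This is not the actual connect-js patch: the function name `resolveUrl`, the `DEFAULT_DOMAIN` constant and the fallback behaviour for unknown prefixes are all assumptions made for illustration.

```javascript
// Hypothetical sketch: resolve a prefixed path such as "imdb:/56/acted_in"
// to a full URL. Not the actual connect-js modification.
var DEFAULT_DOMAIN = 'https://graph.facebook.com/';

function resolveUrl(path, externalDomains) {
  // Start with the built-in "fb" prefix, then add the configured sources.
  var domains = { fb: DEFAULT_DOMAIN };
  for (var key in externalDomains) {
    if (Object.prototype.hasOwnProperty.call(externalDomains, key)) {
      domains[key] = externalDomains[key];
    }
  }

  // A prefix is a run of word characters followed by a colon.
  var match = /^(\w+):(.*)$/.exec(path);
  var base = DEFAULT_DOMAIN;
  var rest = path;
  if (match && Object.prototype.hasOwnProperty.call(domains, match[1])) {
    base = domains[match[1]];
    rest = match[2];
  }

  // Join base and path with exactly one slash between them.
  return base.replace(/\/+$/, '') + '/' + rest.replace(/^\/+/, '');
}
```

With the configuration shown earlier, `resolveUrl('imdb:/56/acted_in', { imdb: 'https://localhost:4567/' })` would point at the local backend, while an unprefixed path such as `/me/friends` still goes to the default Facebook domain.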
So now that we can send requests, what can we do with the Neo4j backend? Here’s a list of the supported requests and their responses (all requests are GET requests to https://localhost:4567):
Purpose | Request | Response |
---|---|---|
Get Actor (or Movie) by Id | /56 | { "name": "Bacon, Kevin", "id": 56 } |
Extended information about Actor (/Movie) | /56?metadata=1 | { "name": "Bacon, Kevin", "id": 56, "metadata": { "connections": "https://localhost:4567/56/acted_in" }, "type": "actor" } |
All the Movies an Actor had a Role in | /56/acted_in | { "data": [ { "id": 57, "title": "Woodsman, The (2004)" }, { "id": 59, "title": "Wild Things (1998)" } /* tons of movies here ... */ ] } |
Get (Actor or) Movie by Id | /59 | { "title": "Wild Things (1998)", "year": "1998", "id": 59 } |
Extended information about (Actor/) Movie | /59?metadata=1 | { "title": "Wild Things (1998)", "year": "1998", "id": 59, "metadata": { "connections": "https://localhost:4567/59/actors" }, "type": "movie" } |
All the Actors that have a Role in this Movie | /59/actors | { "data": [ { "id": 56, "name": "Bacon, Kevin" }, { "id": 528, "name": "Dillon, Matt (I)" } /* loads of actors here ... */ ] } |
Search for Actors with “bacon” in their name | /search?q=bacon&type=actor | [ { "name": "Bacon, Kevin", "id": 56 }, { "name": "Bacon, Travis", "id": 14242 } /* more bacons here ... */ ] |
Search for Movies with “wild things” in their title | /search?q=wild%20things&type=movie | [ { "title": "Wild Things (1998)", "year": "1998", "id": 59 }, { "title": "River Wild, The (1994)", "year": "1994", "id": 74 } /* more wild movies here ... */ ] |
Ok, but how do we use this stuff then?! Well, that’s what we’re going to look into right away, to see the Facebook Graph API used from JavaScript with a Neo4j/IMDB backend. To get started, here’s how to perform a search:
self.movie_info = function( movie_name, callback ) {
  // The search API uses commas for AND-type searches, spaces become OR, so for
  // the movie names, we switch spaces out for commas.
  movie_name = movie_name.replace(/ /g, ",");
  FB.api("imdb:/search", {type:'movie', q:movie_name }, callback );
};
The request to get the movies an actor has acted in goes like this:
FB.api("imdb:/" + actor.id + "/acted_in", function( result ) {
  for (var i = 0; i < result.data.length; i++) {
    var movie = result.data[i];
    // do something with the movie here!
  }
});
To get all actors in a movie, simply use the following request:
FB.api("imdb:/" + movie.id + "/actors", function(result) {
  for (var i = 0; i < result.data.length; i++) {
    var actor = result.data[i];
    // do something with the actor here!
  }
});
Actually, these three requests are all our small suggestion engine needs to fulfill its task. Have a look at SuggestionEngine.js to see the full code.
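To give a feel for how the actors and acted_in requests can be combined, here is a rough sketch of the suggestion idea. It is not the code from SuggestionEngine.js: the function name `suggestMovies`, the ranking by number of shared actors and the injected `api` parameter are assumptions. The `api` function is expected to behave like FB.api for “imdb:” paths, and the liked movies are assumed to have been resolved to backend ids already (for example through the search request).

```javascript
// Hypothetical sketch of the suggestion idea; not the real SuggestionEngine.js.
// `api(path, callback)` is assumed to behave like FB.api for "imdb:" paths.
function suggestMovies(likedMovieIds, api, done) {
  var liked = {};       // ids of movies the friend already likes
  var seenActors = {};  // actor ids that have already been expanded
  var scores = {};      // candidate movie id -> { id, title, count }
  var pending = 1;      // outstanding requests, plus one guard for the loop below

  likedMovieIds.forEach(function (id) { liked[id] = true; });

  // Called once per finished request; reports the result when nothing is left.
  function release() {
    pending--;
    if (pending === 0) {
      var ranked = Object.keys(scores).map(function (id) { return scores[id]; });
      // Most shared actors first; ties are broken by id to keep the order stable.
      ranked.sort(function (a, b) { return b.count - a.count || a.id - b.id; });
      done(ranked);
    }
  }

  function request(path, handler) {
    pending++;
    api(path, function (result) {
      handler(result && result.data ? result.data : []);
      release();
    });
  }

  likedMovieIds.forEach(function (movieId) {
    request('imdb:/' + movieId + '/actors', function (actors) {
      actors.forEach(function (actor) {
        if (seenActors[actor.id]) { return; }
        seenActors[actor.id] = true;
        request('imdb:/' + actor.id + '/acted_in', function (movies) {
          movies.forEach(function (movie) {
            if (liked[movie.id]) { return; } // skip movies already liked
            var entry = scores[movie.id] ||
              (scores[movie.id] = { id: movie.id, title: movie.title, count: 0 });
            entry.count++;
          });
        });
      });
    });
  });

  release(); // drop the guard; calls `done` right away if all requests have returned
}
```

Passing the API function in as a parameter keeps the sketch independent of the Facebook library, so it can be tried out against a small in-memory stand-in before pointing it at the real backend, for example via `suggestMovies(ids, FB.api, showSuggestions)` where `showSuggestions` is whatever UI callback you have.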
How to create a Graph API service on top of Neo4j
Let’s take a closer look at the movie backend now. It’s built using the Neo4j Ruby bindings. In our example data set we have Actors and Movies connected through Roles. Here’s how these look in Ruby code:
# forward declaration so that Actor can refer to Movie
class Movie; end

class Role
  include Neo4j::RelationshipMixin
  property :title, :character
end

class Actor
  include Neo4j::NodeMixin
  property :name
  has_n(:acted_in).to(Movie).relationship(Role)
  index :name, :tokenized => true
end

class Movie
  include Neo4j::NodeMixin
  property :title
  property :year
  index :title, :tokenized => true
  # defines a method for traversing incoming acted_in relationships from Actor
  has_n(:actors).from(Actor, :acted_in)
end
The code above is from the backend/model.rb file. On the Neo4j level, this is the kind of structure we’ll have:
By defining indexes on Actor and Movie we can later use the find method on the classes to perform searches.
Our next step is to expose this model over the Graph API, where we’ll use Sinatra and WEBrick to do the heavy lifting. The application is defined in the backend/neo4j_app.rb file – we’ll dive into portions of that code right here. To begin with, how do we return data for an Actor or Movie by Id?
get '/:id' do # show a node
  content_type 'text/javascript'
  node = node_by_id(params[:id])
  props = external_props_for(node)
  props.merge! metadata_for(node) if params[:metadata] == "1"
  json = JSON.pretty_generate(props)
  json = callback_wrapper(json, params[:callback])
  json
end
The Sinatra route above uses a few small utility functions, so let’s look into them as well. The first one is very simple, but useful if we want to extend the URIs to allow requesting, for example, /{moviename}/actors and not only numeric IDs.
def node_by_id(id)
  # only purely numeric ids are looked up
  node = Neo4j.load_node(id) if id =~ /^(\d+)$/
  halt 404 if node.nil?
  node
end
The next function returns the properties of a node, while filtering out those that have a name starting with a “_” character. It also adds the node id to the result.
def external_props_for(node)
  ext_props = node.props.delete_if{|key, value| key =~ /^_/}
  ext_props[:id] = node.neo_id
  ext_props
end
Then there’s a function that gathers metadata for a node, including a link to the list of connections to other nodes, and the type of the node.
def metadata_for(node)
  if node.kind_of? Actor
    connections = url_for(node, "acted_in")
  elsif node.kind_of? Movie
    connections = url_for(node, "actors")
  end
  metadata = {
    :metadata => { :connections => connections },
    :type => node.class.name.downcase
  }
end
There are a couple more utility functions, but we’ll skip them here as they are unrelated to Neo4j.
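One of the skipped helpers is callback_wrapper, which the routes call with the JSON body and the callback request parameter. Since the responses are served as text/javascript, it presumably does JSONP-style wrapping so that a browser can load the data across domains. Here is a minimal sketch of that idea, written in JavaScript purely for illustration: the real helper is Ruby and isn’t shown in this post, and the name check is an assumption of this sketch.

```javascript
// Hypothetical sketch of JSONP-style wrapping; the backend's real
// callback_wrapper helper is written in Ruby and is not shown in this post.
function callbackWrapper(json, callback) {
  // No callback requested: return the JSON body unchanged.
  if (!callback) {
    return json;
  }
  // Only accept plain or dotted identifiers such as "handler" or
  // "app.handlers.cb1" (an assumption of this sketch), so that arbitrary
  // script cannot be injected through the callback parameter.
  if (!/^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/.test(callback)) {
    throw new Error('invalid callback name');
  }
  // Wrap the JSON in a function call that the browser executes on load.
  return callback + '(' + json + ');';
}
```

A request like /56?callback=handler would then produce a small script that calls handler with the node data, which is what a JSONP client expects.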
Next up is getting the relationships from an Actor or Movie. The code only cares about valid paths, that is, paths ending in /acted_in or /actors. In other cases, an empty data set is returned. Other than that, it simply delegates the work to the domain classes by calling node.send(relationship) to get the relationships. Using Ruby’s send method here is equivalent to calling node.acted_in or node.actors.
get '/:id/:relation' do # show a relationship
  content_type 'text/javascript'
  node = node_by_id(params[:id])
  data = []
  [ :acted_in, :actors ].each do |relationship|
    if params[:relation] == relationship.to_s and node.respond_to? relationship
      data = node.send(relationship)
    end
  end
  data = data.map{|node| node_data(node)}
  json = JSON.pretty_generate({:data => data})
  json = callback_wrapper(json, params[:callback])
  json
end
When viewing the relationships, we only want to show the most basic node info, so there’s a utility function to do that as well:
def node_data(node)
  data = { :id => node.neo_id }
  [ :name, :title ].each do |property|
    data.merge!({ property => node[property] }) unless node[property].nil?
  end
  data
end
Searching is basically handled by adding indexes to the model (see the code further above). So what’s left to do in the application is to perform some sanity checks, delegate the search to the model and finally format the output properly. Here goes:
get '/search' do
  content_type 'text/javascript'
  q = params[:q]
  type = params[:type]
  halt 400 unless q && type
  result = case type
    when 'actor'
      Actor.find(to_lucene(:name, q))
    when 'movie'
      Movie.find(to_lucene(:title, q))
    else
      []
  end
  json = JSON.pretty_generate(result.map{|node| external_props_for(node)})
  json = callback_wrapper(json, params[:callback])
  json
end
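The to_lucene helper used by the search route isn’t shown in this post. Going by the convention mentioned earlier (commas mean AND, spaces mean OR), a translation into Lucene query syntax could look roughly like the sketch below. It is written in JavaScript only to illustrate the rule; the real helper is Ruby, and the function name `toLucene`, the grouping of OR terms inside AND groups and the character stripping are all assumptions.

```javascript
// Hypothetical sketch; the backend's real to_lucene helper is Ruby and is
// not shown in this post.
function toLucene(field, q) {
  var groups = q.split(',').map(function (group) {
    // Within a comma-separated group, whitespace separates alternatives (OR).
    var terms = group.trim().split(/\s+/).map(function (term) {
      // Lowercase and drop characters that are special in Lucene query syntax.
      return term.toLowerCase().replace(/[+\-&|!(){}\[\]^"~*?:\\\/]/g, '');
    }).filter(function (term) { return term.length > 0; });

    var clauses = terms.map(function (term) { return field + ':' + term; });
    return clauses.length > 1 ? '(' + clauses.join(' OR ') + ')' : clauses.join('');
  }).filter(function (group) { return group.length > 0; });

  // Comma-separated groups must all match (AND).
  return groups.join(' AND ');
}
```

Under these assumptions, the query sent by movie_info for “wild things” (rewritten to “wild,things”) would require both words to appear in the title, while the raw query “wild things” would match titles containing either word.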
Wrap up
Here are some major takeaways from this post:
- Graphs are going mainstream, as evidenced by initiatives like the Facebook Graph API.
- It’s often convenient to look at your data in the form of a graph, and with recent support in graph databases like Neo4j, it’s easy to use different data sources in tandem through the Graph API.
- Exposing data through the Graph API is simple if you have a graphdb backend.
And once you put your data in a graphdb, you can of course do more advanced graphy things too, like finding shortest paths, routing with A*, modeling of complex domains and whatnot. Just get started!
Example source code
To get the source code of the example, go to the download page.
Credits
Here are the people who wrote the example code: