Graphgen: The Story and New Features

From the Neoxygen.io website.

This post is an overview of how Graphgen started, how it has improved, and what is planned for the coming weeks.

For people who have not tried Graphgen yet, it is an online graph generation engine where you describe your schema in a Cypher-like syntax, for example:

(p:Person *35)-[:KNOWS *n..n]->(p)
(p)-[:WORKS_AT *n..1]->(c:Company *7)

This Cypher-like pattern generates 35 Person nodes and 7 Company nodes, then creates the many-to-many KNOWS relationships between persons and the many-to-one WORKS_AT relationships between persons and companies.
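To make the cardinalities concrete, here is a minimal Python sketch of what such a pattern could expand to. It is not Graphgen's implementation: the function name `generate_graph`, the node dictionaries, and the choice of one to three KNOWS targets per person are all assumptions made for illustration. The only things taken from the pattern are the node counts (35 and 7) and the rule that `n..1` gives each person exactly one company.

```python
import random


def generate_graph(seed=42):
    """Hypothetical expansion of the two-line pattern above (illustration only)."""
    rng = random.Random(seed)

    # (p:Person *35) and (c:Company *7): create the requested number of nodes.
    people = [{"id": f"p{i}", "label": "Person"} for i in range(35)]
    companies = [{"id": f"c{i}", "label": "Company"} for i in range(7)]

    relationships = []

    # KNOWS *n..n (many-to-many): each person knows several other persons.
    # Picking 1 to 3 targets is an assumption, not Graphgen's actual rule.
    for person in people:
        others = [p for p in people if p is not person]
        for friend in rng.sample(others, rng.randint(1, 3)):
            relationships.append((person["id"], "KNOWS", friend["id"]))

    # WORKS_AT *n..1 (many-to-one): each person works at exactly one company,
    # while a company may employ many persons.
    for person in people:
        company = rng.choice(companies)
        relationships.append((person["id"], "WORKS_AT", company["id"]))

    return people, companies, relationships


people, companies, relationships = generate_graph()
works_at = [r for r in relationships if r[1] == "WORKS_AT"]
print(len(people), len(companies), len(works_at))
```

Whatever the seed, the sketch yields 35 persons, 7 companies, and one WORKS_AT relationship per person. Only the number of KNOWS relationships varies.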

How It All Started

Some weeks ago, I spent a couple of days browsing the Stack Overflow Neo4j-tagged questions, answering questions and learning a lot while looking at the problems and answers from others.

One recurring problem, not specific to Neo4j, is the difficulty people have in explaining their domain model, which in turn makes it hard for respondents to help them. The result is quite a few “explain more about your…” mini-answers.

Until now, the best way to get a holistic view of someone’s domain model was to open the Neo4j Console or a local database and write all the nodes and relationships by hand.

I personally found this really tedious and, very often, a barrier to helping people.

That’s why I first created Neogen. The first implementation let you describe your model (the nodes and relationships, and how many of each type you wanted) in YAML files. Running a CLI command then parsed the files and loaded the schema into your local database.

Neogen also uses the Faker library to generate fake data for your node and relationship properties. The system works really smoothly and can easily be integrated into test suites.

Graphgen Is Born

After the creation of Neogen, I sent some e-mails to Michael Hunger, Alberto Perdomo and Anders Nawroth to get their input on the idea and see how it could be used to help the community.

Michael took the opportunity to ask about an online version based on the Cypher spec. I loved the idea and started developing a quick beta version for easy feedback. Graphgen was born.

After a bunch of exchanges with Michael, Graphgen started to have a fixed guideline and a stable implementation.

A Week After Launch

One week after its launch, Graphgen has reached almost 1,000 users, and I couldn’t be happier.

I’ve received incredible feedback from people engaged in the Neo4j ecosystem.

Features, Features, Features

Since the launch, a bunch of features have been implemented, including:

Export to GraphJSON and Cypher queries
Import into a publicly accessible database, or even into your local database, directly from the app
Support for multiple labels
Creation of a Neo4j console preloaded with your graph so that you can query it
More Faker types, also available as standalone components
And a new feature explained below

Nodes Types

While the current implementation was already very nice, I wanted a more user-friendly way of generating nodes with fake properties. Since the beginning, I had been thinking of adding a layer on top of the current implementation to describe what a Person is, what a Company is, and so on.

It is now finished, and I’m happy to announce Nodes Types.

Let’s represent the Person/Company schema again:

(person:#Person *35)-[:WORKS_AT *n..1]->(company:#Company *7)
(person)-[:KNOWS *n..n]->(person)

As you can see, I added a hash sign (#) before the label. It activates the model for that label, so the nodes are generated with the properties defined in the model.

With only two lines, you get a graph containing 35 Person nodes with firstname, lastname, and dateOfBirth properties, and 7 Company nodes with name and activity description properties.

All the defined relationships will also be created.
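The idea of a label activating a predefined model can be sketched as a simple lookup from type name to property generator. This is only an illustration under stated assumptions, not how Graphgen is built: `NODE_TYPES`, `fake_person`, `fake_company`, and `generate_nodes` are hypothetical names, the sample value lists stand in for the Faker library, and the Company property keys `name` and `description` are guesses at how "a name and an activity description" might be stored.

```python
import random
from datetime import date, timedelta

# Small hand-written value pools standing in for Faker (assumption).
FIRST_NAMES = ["Alice", "Bob", "Carla", "David", "Emma"]
LAST_NAMES = ["Martin", "Dubois", "Peeters", "Janssens", "Smith"]
ACTIVITIES = ["Software consulting", "Logistics", "Retail banking"]


def fake_person(rng):
    # Property names firstname, lastname, dateOfBirth come from the post.
    birth = date(1950, 1, 1) + timedelta(days=rng.randrange(365 * 50))
    return {
        "firstname": rng.choice(FIRST_NAMES),
        "lastname": rng.choice(LAST_NAMES),
        "dateOfBirth": birth.isoformat(),
    }


def fake_company(rng):
    # Key names "name" and "description" are hypothetical.
    return {
        "name": rng.choice(LAST_NAMES) + " Ltd",
        "description": rng.choice(ACTIVITIES),
    }


# Hypothetical registry: type name -> function producing fake properties.
NODE_TYPES = {"Person": fake_person, "Company": fake_company}


def generate_nodes(label, count, rng):
    """Create `count` nodes; a leading '#' activates the predefined model."""
    if label.startswith("#"):
        name = label[1:]
        factory = NODE_TYPES[name]
        return [{"label": name, "properties": factory(rng)} for _ in range(count)]
    # Plain label: nodes are created without generated properties.
    return [{"label": label, "properties": {}} for _ in range(count)]


rng = random.Random(7)
typed_people = generate_nodes("#Person", 35, rng)
typed_companies = generate_nodes("#Company", 7, rng)
plain_people = generate_nodes("Person", 3, rng)
print(typed_people[0]["properties"].keys())
```

In this sketch, `(person:#Person *35)` corresponds to `generate_nodes("#Person", 35, rng)`, while the same label without the hash sign produces nodes with no properties.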

Currently I have created only five types: Person, User, Tweet, Hashtag, and Company.

I would love to hear from the community about other types people would like to have available. The types should, of course, be common enough to apply to standard domains. You can send me your ideas by e-mail to chris (at) neoxygen (dot) io or via Twitter. You can also comment on the dedicated issue in the Graphgen repository: https://github.com/neoxygen/graphgen/issues/2

What Else Is Planned?

The following features are planned:

Complete documentation
Twitter and GitHub login for saving your graphs and creating your own model types
Creation of a GrapheneDB instance with your account for importing your graph into it
Creating graphs from Gist URLs
Full accessibility through the API for third-party usage

I hope you like the direction Graphgen is taking. If you have any questions, remarks, or feature requests, please ping me.
