Integration Testing with Neo4j using C# [Community Post]


Unlike prototypical unit testing, which focuses on small units of isolated code, integration testing is typically designed to test the interactions between two or more interconnected parts of a software system.

A common area where integration testing yields high return on effort is the interaction point between an application and a database backend. This type of integration testing allows for verification of the expected behavior of queries issued to the database as well as the subsequent transformation of that dataset into domain models or other data structures in code.

While NoSQL C# projects are increasingly common, most enterprise applications still model data using traditional SQL databases. As a result, it can be difficult to find good examples and guidelines for building integration tests [1] using NoSQL solutions.

However, whether you’re using SQL or NoSQL databases, writing repeatable integration tests almost always depends on database transactions: they allow data created during integration testing to be treated as transient data that is automatically cleaned up when the test case completes.

Of course, having transactional capabilities is a must in almost all applications, but it is particularly useful for developers when creating integration tests.

Neo4j and the Neo4jClient


As of version 2.0, Neo4j provides the transactional HTTP endpoint, which allows non-Java language bindings to take advantage of a fully transactional graph database.

For the C# environment, there are usually two main bindings which a developer can use: Neo4jClient, and Cypher.NET. The latter was built with transactions in mind; however, as a matter of personal taste, I prefer the fluent-like capability of Neo4jClient to write Cypher queries.

Also, when I had to decide which library to use on my company’s enterprise scale project back at the start of 2014, Neo4jClient appeared to be the more active project of the two, but there was no transaction implementation for this library.

I had two choices: either choose Cypher.NET or take advantage of Neo4jClient’s open source model and implement the transaction capability on my own. I chose the latter. This let my team and me easily write and understand Cypher queries, knowing that we could also write integration tests for our modules and easily regression test any changes made to queries or data handling.

Starting with Neo4jClient 1.1.0.x, you can use the transaction implementation in your own projects, be it for testing or inside the modeling of your solution [2].

The Neo4jClient transaction implementation introduces new interfaces which reflect the transactional API. These are as follows [3]:

public interface ITransactionalGraphClient : IGraphClient
{
    ITransaction BeginTransaction();
    ITransaction BeginTransaction(TransactionScopeOption scopeOption);
    void EndTransaction();
}

public enum TransactionScopeOption
{
    Join, // default value
    RequiresNew,
    Suppress
}

public interface ITransaction : IDisposable
{
    void Commit();
    void Rollback();
}

Starting and managing transactions is very straightforward and utilizes many of the same conventions in typical SQL client transactions:

public void ExecuteCypher()
{
    ITransactionalGraphClient graphClient = new GraphClient(new Uri("https://path/to/your/db/data"));

    graphClient.Connect();

    using (ITransaction transaction = graphClient.BeginTransaction())
    {
        // create two nodes on different queries
        graphClient.Cypher.Create("(n:Node {id: 1})").ExecuteWithoutResults();

        graphClient.Cypher.Create("(n:Node {id: 2})").ExecuteWithoutResults();

        // commit the created nodes
        transaction.Commit();
    }
}

Instead of creating the nodes right away, Neo4j’s transaction mechanism now waits until the commit has been issued to make such data available to all readers. Besides this facet, there are two details in this code which we will explore.

The first one is that ITransaction implements IDisposable. That is, when the code flow exits the using block, the transaction must have been committed; otherwise it will automatically be rolled back (the same pattern you would encounter for typical SQL client transactions).

For example:

public void RolledbackTransaction()
{
    ITransactionalGraphClient graphClient = new GraphClient(new Uri("https://path/to/your/db/data"));

    graphClient.Connect();

    using (ITransaction transaction = graphClient.BeginTransaction())
    {
        // create two nodes on different queries
        graphClient.Cypher.Create("(n:Node {id: 1})").ExecuteWithoutResults();

        graphClient.Cypher.Create("(n:Node {id: 2})").ExecuteWithoutResults();

        // Because the code does not commit before exiting the using block, the effect is the same
        // as explicitly calling Rollback() on the transaction object: the two nodes are not created.
        // transaction.Rollback(); // behavior is the same as not calling it
    }
}

The second detail is that the graphClient object knows, when ExecuteWithoutResults() is called, that it is inside a transaction. This is accomplished with the same mechanism used by Microsoft’s System.Transactions: an ambient transaction, that is, a transaction object stored at the thread level (using ThreadStaticAttribute). This means that, until there is an explicit or implicit call to ITransactionalGraphClient.EndTransaction() or ITransaction.Close(), all of the Cypher queries executed through graphClient on the same thread will be part of the same transaction.
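To make the ambient transaction idea more concrete, here is a minimal, self-contained sketch of a thread-level transaction slot. It is only an illustration under my own assumptions: AmbientTransaction and FakeStore are hypothetical names, statements are buffered in memory, and none of this is Neo4jClient’s actual implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of the ambient transaction idea; NOT Neo4jClient's actual code.
public sealed class AmbientTransaction : IDisposable
{
    // [ThreadStatic] gives every thread its own copy of this field,
    // so each thread only sees the transaction it started itself.
    [ThreadStatic]
    private static AmbientTransaction _current;

    private readonly List<string> _pending = new List<string>();
    private bool _committed;

    public static AmbientTransaction Current
    {
        get { return _current; }
    }

    public static AmbientTransaction Begin()
    {
        if (_current != null)
        {
            throw new InvalidOperationException("This sketch supports one open transaction per thread.");
        }

        _current = new AmbientTransaction();
        return _current;
    }

    public void Enlist(string statement)
    {
        _pending.Add(statement);
    }

    public void Commit()
    {
        FakeStore.Committed.AddRange(_pending);
        _pending.Clear();
        _committed = true;
    }

    public void Dispose()
    {
        if (!_committed)
        {
            _pending.Clear(); // implicit rollback: buffered statements are discarded
        }

        _current = null; // this thread no longer has an ambient transaction
    }
}

// Hypothetical stand-in for the database client.
public static class FakeStore
{
    public static readonly List<string> Committed = new List<string>();

    // Callers never pass a transaction in: the ambient one, if any, is picked up here.
    public static void Execute(string statement)
    {
        AmbientTransaction transaction = AmbientTransaction.Current;
        if (transaction != null)
        {
            transaction.Enlist(statement);
        }
        else
        {
            Committed.Add(statement);
        }
    }
}
```

With this sketch, calling FakeStore.Execute() inside a using block opened with AmbientTransaction.Begin() leaves FakeStore.Committed untouched unless Commit() is called before the block exits, which mirrors the commit-or-rollback behavior described above.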

There are three ways in which a “transaction block” can be started by using BeginTransaction(TransactionScopeOption):
    • TransactionScopeOption.Join: This is the default value when using BeginTransaction(). It instructs the client to join an existing ambient transaction if there is one; otherwise it starts a new transaction.
    • TransactionScopeOption.RequiresNew: Always starts a new transaction, even if there is already another ambient transaction.
    • TransactionScopeOption.Suppress: It indicates that the following queries will not be part of the ambient transaction (if there is one).
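As a rough sketch of the two non-default options, the following uses only the BeginTransaction(TransactionScopeOption) overload shown earlier. The class and method names are hypothetical, I am assuming the transaction types live in the Neo4jClient.Transactions namespace, and the outcome noted in the comments is what I would expect from the descriptions above rather than a verified result.

```csharp
using System;
using Neo4jClient;
using Neo4jClient.Transactions; // assumed namespace for the transaction types

// Hypothetical helper; requires a connected client and a running Neo4j instance.
public static class ScopeOptionExamples
{
    public static void Run(ITransactionalGraphClient graphClient)
    {
        using (ITransaction outer = graphClient.BeginTransaction())
        {
            // part of the outer (ambient) transaction
            graphClient.Cypher.Create("(n:Node {id: 1})").ExecuteWithoutResults();

            // RequiresNew: a separate transaction, independent of the outer one
            using (ITransaction independent = graphClient.BeginTransaction(TransactionScopeOption.RequiresNew))
            {
                graphClient.Cypher.Create("(n:Node {id: 2})").ExecuteWithoutResults();
                independent.Commit();
            }

            // Suppress: queries in this block do not take part in the ambient transaction
            using (graphClient.BeginTransaction(TransactionScopeOption.Suppress))
            {
                graphClient.Cypher.Create("(n:Node {id: 3})").ExecuteWithoutResults();
            }

            // The outer transaction is never committed, so node 1 is rolled back.
            // Going by the option descriptions, nodes 2 and 3 should not be affected by that rollback.
        }
    }
}
```

This also shows why RequiresNew gets in the way of the testing technique in this article: whatever is written inside such a block is not covered by the rollback of the surrounding transaction.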

It is important to note that even though TransactionScopeOption.Join instructs the client to join the ambient transaction, it still has to be treated as a transaction block, that is, there must be a call to Commit() before the explicit or implicit call to ITransaction.Close().

For example:

public void NestedTransactions()
{
    ITransactionalGraphClient graphClient = new GraphClient(new Uri("https://path/to/your/db/data"));

    graphClient.Connect();

    using (ITransaction transaction = graphClient.BeginTransaction())
    {
        // create the first node inside the parent transaction
        graphClient.Cypher.Create("(n:Node {id: 1})").ExecuteWithoutResults();

        using (ITransaction nestedTransaction = graphClient.BeginTransaction()) // by default joins the ambient transaction
        {
            graphClient.Cypher.Create("(n:Node {id: 2})").ExecuteWithoutResults();

            // if there is no call to commit, then when exiting this block and calling nestedTransaction.Close() the ambient transaction will be marked as "failed" and will be rolled back,
            // even when the parent transaction does call Commit()
            nestedTransaction.Commit();
        }

        // commit the created nodes
        transaction.Commit();
    }
}

Furthermore, when the call to nestedTransaction.Commit() occurs, there is no actual commit to Neo4j, but it only marks the nested transaction block as “successful.” The commit happens on the parent’s call to Commit().

By making use of the previously described mechanism, writing integration tests is clean and does not interfere with previously written code, even if your program already uses transactions (unless it uses TransactionScopeOption.RequiresNew).

Let’s assume you have Entity and EntityRepository classes:

[DataContract]
public class Entity
{
    [DataMember]
    public Guid Id { get; set; }

    [DataMember]
    public string Name { get; set; }
}

public class EntityRepository
{
    private ITransactionalGraphClient _graphClient;

    public EntityRepository(ITransactionalGraphClient graphClient)
    {
        _graphClient = graphClient;
    }

    private string GetLabel()
    {
        return typeof (Entity).Name;
    }

    public Entity Add(Entity entity)
    {
        if (entity.Id == Guid.Empty)
        {
            entity.Id = Guid.NewGuid();
        }

        using (ITransaction transaction = _graphClient.BeginTransaction())
        {
            Entity createdEntity = _graphClient.Cypher
                .Create(string.Format("(e:{0} {{entity}})", GetLabel()))
                .WithParam("entity", entity)
                .Return(e => e.As<Entity>())
                .Results
                .SingleOrDefault();

            transaction.Commit();

            return createdEntity;
        }
    }
}

The previous code allows an EntityRepository instance to store an Entity object in Neo4j by using transactions [4]. Now imagine that you want to test the Add() method and make sure that the object is successfully stored in Neo4j. One way to do that would be to write the following test:

[Test]
public void StoreEntity()
{
    // create an entity
    Guid entityId = Guid.NewGuid();
    Entity entity = new Entity
    {
        Id = entityId,
        Name = "Test"
    };

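    // connect a client and save the entity through the repository
    ITransactionalGraphClient graphClient = new GraphClient(new Uri("https://path/to/your/db/data"));
    graphClient.Connect();

    EntityRepository repository = new EntityRepository(graphClient);
    Entity createdEntity = repository.Add(entity);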
    Assert.IsNotNull(createdEntity);
    Assert.AreEqual("Test", createdEntity.Name);
    Assert.AreEqual(entityId, createdEntity.Id);
}

This is fine, but after running it multiple times, you will end up with a graph full of test nodes. Is there a way to not create nodes and still use Neo4j for integration tests? This is where transactions come in handy for testing:

private EntityRepository _repository;
private ITransactionalGraphClient _graphClient;

[SetUp]
public void SetupTransactionContext()
{
    _graphClient.BeginTransaction();
}

[TearDown]
public void EndTransactionContext()
{
    // end the transaction as failure
    _graphClient.EndTransaction();
}

[TestFixtureSetUp]
public void SetupTests()
{
    _graphClient = new GraphClient(new Uri("https://path/to/your/db/data"));
    _graphClient.Connect();

    _repository = new EntityRepository(_graphClient);
}

By using a fixture-level repository, we can make sure that the GraphClient instance being used is the one that holds the transaction open, and by using SetUp and TearDown methods, we can start and roll back transactions.

Note that the transaction used inside of the Add() method will, by default, join the ambient transaction already created inside of SetupTransactionContext() and even though it has a call to commit, the “parent” transaction does not. Therefore any created or changed data will be rolled back and the database is returned to the original state.
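For completeness, here is a sketch of what a test running under that setup might look like. The fixture and test names are hypothetical, the Entity and EntityRepository classes are the ones defined earlier in this article, and I am assuming NUnit 2.x attributes and the Neo4jClient.Transactions namespace. It needs a running Neo4j instance, so treat it as an outline rather than a verified test.

```csharp
using System;
using System.Linq;
using Neo4jClient;
using Neo4jClient.Transactions; // assumed namespace for the transaction types
using NUnit.Framework;

[TestFixture]
public class EntityRepositoryTests // hypothetical fixture name
{
    private EntityRepository _repository;
    private ITransactionalGraphClient _graphClient;

    [TestFixtureSetUp]
    public void SetupTests()
    {
        _graphClient = new GraphClient(new Uri("https://path/to/your/db/data"));
        _graphClient.Connect();
        _repository = new EntityRepository(_graphClient);
    }

    [SetUp]
    public void SetupTransactionContext()
    {
        _graphClient.BeginTransaction();
    }

    [TearDown]
    public void EndTransactionContext()
    {
        // never committed, so everything the test wrote is rolled back
        _graphClient.EndTransaction();
    }

    [Test]
    public void AddStoresEntityInsideTheAmbientTransaction()
    {
        Guid entityId = Guid.NewGuid();
        Entity created = _repository.Add(new Entity { Id = entityId, Name = "Test" });

        Assert.IsNotNull(created);
        Assert.AreEqual(entityId, created.Id);

        // Same client, same thread: this query runs in the same ambient transaction,
        // so it should see the uncommitted node.
        Entity found = _graphClient.Cypher
            .Match("(e:Entity)")
            .Where("e.Id = {id}")
            .WithParam("id", entityId)
            .Return(e => e.As<Entity>())
            .Results
            .SingleOrDefault();

        Assert.IsNotNull(found);
        Assert.AreEqual("Test", found.Name);
    }
}
```

Reading the node back through the same client is what makes the check meaningful: the data only exists inside the open transaction, and it disappears again once TearDown ends that transaction without a commit.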

The transactional client was implemented with integration with Microsoft’s System.Transactions in mind. This allows systems to combine SQL transactions and Neo4j transactions, and even use them together for testing:

[SetUp]
public void SetupTransactionContext()
{
    _graphClient.BeginTransaction();
}

[TearDown]
public void EndTransactionContext()
{
    // end the transaction as failure
    _graphClient.EndTransaction();
}

[TestFixtureSetUp]
public void SetupTests()
{
    _graphClient = new GraphClient(new Uri("https://path/to/your/db/data"));
    _graphClient.Connect();

    _repository = new EntityRepository(_graphClient);
}

[Test]
public void CombinedTransaction()
{
    // start a System.Transactions scope
    using (TransactionScope scope = new TransactionScope())
    {
        // neo4j related code
        Guid entityId = Guid.NewGuid();
        Entity entity = new Entity
        {
            Id = entityId,
            Name = "Test"
        };

        Entity createdEntity = _repository.Add(entity);

        // write your SQL code here

        // do not call scope.Complete()
    }
}

Drawbacks


Not everything is perfect and I have found some issues that might be more annoying than showstoppers:

    • You need a “global” GraphClient instance. This is usually not a bad idea (as the Connect() call is quite expensive) and can easily be handled by dependency injection. However, if your code creates instances of this class all around your project, then using the techniques described in this article might involve changing your codebase.

    • When you roll back created nodes or relationships, the IDs are not recycled (or at least not for a while). The files Neo4j uses to keep track of used IDs therefore keep growing even though they are full of unused IDs, and after a while (on the scale of millions of runs) the performance of your graph database degrades and your test runs take longer. The solution is really simple: use a dedicated database for testing, which you can restore (or delete and recreate) to a previous point when the performance becomes really bothersome.
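If you do not use a dependency injection container, one possible way to get a single shared client is a lazily initialized holder like the sketch below. GraphClientProvider is a hypothetical name of my own, and I am assuming the Neo4jClient.Transactions namespace; a container registration with singleton lifetime would serve the same purpose.

```csharp
using System;
using Neo4jClient;
using Neo4jClient.Transactions; // assumed namespace for the transaction types

// Hypothetical holder for one shared, connected client.
public static class GraphClientProvider
{
    // Lazy<T> is thread-safe by default, so the expensive Connect() call
    // runs only once, on first access.
    private static readonly Lazy<ITransactionalGraphClient> Instance =
        new Lazy<ITransactionalGraphClient>(() =>
        {
            GraphClient client = new GraphClient(new Uri("https://path/to/your/db/data"));
            client.Connect();
            return client;
        });

    public static ITransactionalGraphClient Client
    {
        get { return Instance.Value; }
    }
}
```

Repositories and tests would then both receive GraphClientProvider.Client, so the instance that opens the transaction in SetUp is the same one the code under test uses.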

Conclusions


Thanks to all of the great work of Tatham Oddie and his team at Readify, the integration of transactions into the Neo4jClient library now allows developers to keep using the simplicity of the fluent API for Cypher while also using the new capabilities of Neo4j in the .NET community. This article showed one example, empowering integration tests. The support that comes from the open source community should also reassure companies that might invest in a product that uses Neo4j.

Find the code used for the integration tests examples here.

References


[1] In contrast to unit tests, integration tests usually include the storage mechanism used in your program or by your model.

[2] The most recent version in NuGet already includes the transaction implementation.

[3] There are other methods and properties in the interface declarations. However, for the sake of simplicity I am only showing the ones I explain in this article.

[4] A transaction is used here to exemplify the usage, although it is obviously not required for such a simple scenario.

