Enterprise-Level PHP and Neo4j


Photo by Previn Samuel on Unsplash

Access a host of improvements with the recently released, next-generation major version of the PHP client and driver. In this guide, you will learn about its new features and what makes version three the perfect candidate for a business of any size, including the enterprise level.

Skim the new features if you are in a hurry… Or join me in a deep dive into the inner workings of the driver and client to unlock the full potential of Neo4j.

Out-of-the-Box Performance Improvements

While many of these improvements have been silently released in older versions, these are now thoroughly tested and working features. They are free improvements if you update to the latest version!

Connection reuse

Let’s start with the most challenging topic.

A driver creates a connection pool when it connects to a Neo4j server. Previously, this connection pool was just a factory in disguise, continuously making new connections.

These connections and their pool have been redesigned to understand the server’s state, making it possible to pass connections along and reuse them across multiple sessions.

Note: This means you should create as few drivers as possible. Otherwise, you can easily overload your system with multiple connection pools.

$driver = Driver::create($_ENV['neo4j_dsn']);

$session1 = $driver->createSession();
$session1->run('RETURN 1 AS one');

$session2 = $driver->createSession();
$session2->run('RETURN 2 AS two');

unset($session1);

$session3 = $driver->createSession();
$session3->run('RETURN 3 AS three');

This example creates only two connections in version three but would create three in previous versions, as they could not reuse connections.
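The difference between a factory in disguise and a real pool can be sketched in plain PHP. The ToyConnection and ToyPool classes below are hypothetical stand-ins, not the driver's actual classes. They only count how many connections get opened when a released connection may be handed out again.

```php
<?php

// Hypothetical stand-in for a Bolt connection; the real driver's classes differ.
final class ToyConnection
{
    public function __construct(public int $id) {}
}

// A pool that hands out an idle connection when one exists and only opens
// a new one when every existing connection is still in use.
final class ToyPool
{
    /** @var list<ToyConnection> */
    private array $idle = [];
    private int $opened = 0;

    public function acquire(): ToyConnection
    {
        // Reuse an idle connection if possible, otherwise open a new one.
        return array_pop($this->idle) ?? new ToyConnection(++$this->opened);
    }

    public function release(ToyConnection $connection): void
    {
        $this->idle[] = $connection;
    }

    public function openedCount(): int
    {
        return $this->opened;
    }
}

$pool = new ToyPool();

$first = $pool->acquire();   // opens connection 1
$second = $pool->acquire();  // connection 1 is busy, so this opens connection 2
$pool->release($first);      // mirrors unset($session1) in the example above
$third = $pool->acquire();   // reuses connection 1 instead of opening a third

echo $pool->openedCount(), PHP_EOL; // prints 2
```

A pool that never kept released connections would have opened three connections here, which is the behaviour described for the previous versions.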

A lot of time gets lost opening sockets and exchanging Bolt protocol handshakes, so the more we can reuse connections, the better. This real-world example of a script reaching a server far away shows it: about 45 percent of the time is spent just creating connections.

45 percent of the time is spent creating connections in this example.

A standard PHP web server setup makes this extremely difficult to mitigate. The simplicity PHP brings with separate sessions is terrific, but not so for connection reuse. Persistent connections on a lower level are possible, but maintaining the Bolt protocol metadata across PHP sessions through one medium or another is still required.

A lot of work has already been done in this area, and you can follow the progress downstream here: https://github.com/neo4j-php/Bolt/pull/115. You can bet we will publish a new blog post once this is done!

Session Reuse

A small but overlooked detail is that the client previously created a new session for every run statement. This causes problems in high availability setups, as you lose your bookmarks.

I’m thrilled to announce sessions are now being reused when using the client; another headache is gone!

Note: Since the client creates and manages drivers, it should be used with the same care as a driver and have as few instances as possible. The client is the perfect candidate for a singleton pattern, as you can configure as many drivers as your heart desires.

Features for Bigger Projects

You can now configure the drivers and clients so that they work efficiently in any cloud or complex setup.

Limit the Number of Connections Throughout Your System

Connections can now be limited throughout your entire application, not just within a single PHP session. If the SysV semaphore extension (sysvsem) is installed, the driver will automatically use semaphores to restrict the number of connections per user agent on a host machine.

$driver = Driver::create(
    $_ENV['neo4j_dsn'],
    configuration: DriverConfiguration::default()
        ->withUserAgent('my-amazing-organisation/my-amazing-app:1.0.0')
        ->withMaxPoolSize(1)
);

$session1 = $driver->createSession();

$session1->run('RETURN 1 AS one');

$session2 = $driver->createSession();

// This will time out as it cannot acquire more than one connection.
$session2->run('RETURN 1 AS one');

Leverage DNS for High-Availability

Many High Availability setups use DNS to attach multiple IPs to the same hostname. Previously, the driver only understood the first IP and tried to connect to it. If the server did not respond, it would throw an exception!

The driver knows better now. It understands it may need to try the other IPs if the first one does not respond, solving a significant headache for complicated setups.
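The fallback loop can be sketched as follows. The connectToFirstReachable() helper and its injectable $resolve and $connect callables are hypothetical and only illustrate the idea, not the driver's internals. The demo uses a fake resolver and connector so that it runs without a network.

```php
<?php

/**
 * Tries every address behind a hostname until one accepts a connection.
 *
 * @param callable(string): (array|false) $resolve returns all addresses for a host
 * @param callable(string): mixed $connect returns a connection or throws on failure
 */
function connectToFirstReachable(string $host, callable $resolve, callable $connect): mixed
{
    $addresses = $resolve($host);
    if ($addresses === false || $addresses === []) {
        throw new RuntimeException("Could not resolve $host");
    }

    $lastError = null;
    foreach ($addresses as $address) {
        try {
            return $connect($address);
        } catch (RuntimeException $e) {
            $lastError = $e; // remember the failure and move on to the next address
        }
    }

    throw new RuntimeException("No address behind $host accepted a connection", 0, $lastError);
}

// Fake resolver: pretends the hostname has two IPs attached to it.
$resolve = fn (string $host) => ['10.0.0.1', '10.0.0.2'];

// Fake connector: pretends the first server is down.
$connect = function (string $address): string {
    if ($address === '10.0.0.1') {
        throw new RuntimeException("$address did not respond");
    }
    return "connected to $address";
};

$result = connectToFirstReachable('neo4j.example.com', $resolve, $connect);
echo $result, PHP_EOL; // prints "connected to 10.0.0.2"
```

In real code, $resolve could be PHP's built-in gethostbynamel(), which returns every IPv4 address registered for a hostname, and $connect could wrap stream_socket_client() and throw when the socket cannot be opened.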

Leverage Driver Prioritization for High-Availability

Similarly, you can now attach multiple drivers to the same alias on the client, a concept dubbed “graceful driver fallback.” Just like in the DNS example, the client will try the drivers in order of priority until one of them connects.

You can change the priority by using the newly added priority parameter when you add a driver to a client.

$client = ClientBuilder::create()
    ->withDriver('default', 'neo4j://core1.myapp.com', priority: 255)
    ->withDriver('default', 'neo4j://core2.myapp.com', priority: 0)
    ->build();

$client->run('RETURN 1 AS one');
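The selection logic behind graceful driver fallback can be sketched in plain PHP. The pickDriver() function and its $canConnect check are hypothetical, and the sketch assumes that a higher priority number means the driver is tried earlier, as the values 255 and 0 in the example above suggest.

```php
<?php

/**
 * Picks the first driver that passes a connectivity check, highest priority first.
 *
 * @param list<array{uri: string, priority: int}> $drivers
 * @param callable(string): bool $canConnect hypothetical connectivity check
 */
function pickDriver(array $drivers, callable $canConnect): string
{
    // Assumption: a higher number means the driver is tried earlier.
    usort($drivers, fn (array $a, array $b) => $b['priority'] <=> $a['priority']);

    foreach ($drivers as $driver) {
        if ($canConnect($driver['uri'])) {
            return $driver['uri'];
        }
    }

    throw new RuntimeException('None of the configured drivers could connect');
}

$drivers = [
    ['uri' => 'neo4j://core2.myapp.com', 'priority' => 0],
    ['uri' => 'neo4j://core1.myapp.com', 'priority' => 255],
];

// Pretend core1 is down so the fallback kicks in.
$picked = pickDriver($drivers, fn (string $uri) => $uri !== 'neo4j://core1.myapp.com');
echo $picked, PHP_EOL; // prints "neo4j://core2.myapp.com"
```

When every server is reachable, the same call returns core1, so the fallback only costs something when the preferred server is actually down.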

Never Change Your Graph Only to Not Find the Updates

Riddle me this.

Previously, if you were working in a high availability setup, this could happen:

$session->run('CREATE (x:Test {name: "test"})');
// This may throw an exception as it may not find a node with name "test"
$session->run('MATCH (x:Test {name: "test"}) RETURN x')->first();

🤯 How can this happen?

The answer is: not using bookmarks.

An overlooked feature in the previous version was bookmarking, which is now automatically managed by the session object.

The short version is that you can write to one server and read from another when using the auto-routed driver. This is the default and recommended driver. Without proper precautions, the updates on one server may not have been propagated to the other before you read from it.

Bookmarks are now automatically managed in a session to prevent this problem. A bookmark tells the server you are communicating with that it has to wait until the changes have propagated to it.

Of course, this is not always the desired behaviour, which you can solve by creating multiple sessions for different workloads.


$session->run('CREATE (x:Test {name: "test"})');
// This query may introduce latency as it will wait for the creation of an irrelevant node to propagate to another server in the cluster.
$nodes = $session->run('MATCH (x:Othernode) RETURN x');

// It can be solved by creating another session to read from:
$newSession = $driver->createSession();
$nodes = $newSession->run('MATCH (x:Othernode) RETURN x');
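
The waiting behaviour a bookmark causes can be modelled with a toy replica. The ToyReplica class is hypothetical and heavily simplified: real bookmarks are opaque values handled by the driver and server, whereas this sketch uses a plain transaction counter.

```php
<?php

// Toy model of a cluster member that has applied every transaction up to $appliedTx.
final class ToyReplica
{
    public int $appliedTx = 0;

    // Hypothetical replication step: catch up by one transaction.
    public function replicateOne(): void
    {
        $this->appliedTx++;
    }

    // A read that carries a bookmark waits until the replica has caught up to it.
    // Returns how many replication steps the read had to wait for.
    public function readAfter(int $bookmark): int
    {
        $waited = 0;
        while ($this->appliedTx < $bookmark) {
            $this->replicateOne();
            $waited++;
        }
        return $waited;
    }
}

$replica = new ToyReplica();

// A fresh session has no bookmark (modelled as 0), so its read returns
// immediately, even though the replica may not have seen recent writes yet.
$waitedWithoutBookmark = $replica->readAfter(0);

// A write on another server produced transaction 3; the writing session keeps
// 3 as its bookmark, so its next read waits until the replica has caught up.
$waitedWithBookmark = $replica->readAfter(3);

echo "$waitedWithoutBookmark $waitedWithBookmark", PHP_EOL; // prints "0 3"
```

This is the trade-off described above: the session that wrote gets consistent reads at the cost of waiting, while a separate session reads without delay but may see older data.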

Understanding the Vision for the Client’s Object

By now, you should understand the significant parts of the client, driver, session, and transaction and how they interact.

A part I haven’t touched on before is that the client object is essentially an extra feature on top of the official drivers’ API. This object was introduced in the previous, now deprecated and archived, almost decade-old project: the GraphAware PHP Client.

In truth, the client is now an opinionated class that takes creative liberties on how things should be accessed and done. It gracefully falls back on drivers; it only allows a single default configuration for all its drivers, sessions and transactions; and, in version three, it dynamically binds transactions to itself.

Dynamically Bound Transactions (Experimental)

You can now bind multiple transactions to an alias to control whether a query runs on a session or on a transaction.

// All queries on the default alias will run over a transaction
$client->bindTransaction();
$client->run('CREATE (x:Node)');

// Open a new transaction on the default alias
$client->bindTransaction();
$client->run('CREATE (x:OtherNode)');

// Commits the latest transaction on the default alias
$client->commitBoundTransaction();

// Rollback all other transactions on the default alias
$client->rollbackBoundTransaction(depth: -1);

This is a very opinionated design pattern in the spirit of the client’s original vision. We are currently debating whether or not the client should be separated from the driver in a different project, and we’d love to hear anybody’s input.

Wrapping Things Up

Version three is a significant step up from its predecessor, and I urge everyone to migrate as soon as possible.

It has significant performance improvements and better functionality. It uses actively supported PHP versions and downstream libraries.

We’d love to hear your feedback! The best way to get our attention is to either create an issue or feature request on GitHub or email me at ghlen@nagels.tech, where you can ask specific questions about your setup or get professional consulting advice.


Enterprise-Level PHP and Neo4j was originally published in Neo4j Developer Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.