Handling Authentication and Identity with Neo4j and Auth0


We’re heavy users of Auth0 at Neo4j, with many of our products and services using Auth0 for authentication. For example, when logging into Neo4j Aura or Neo4j Sandbox, you are greeted with a customized Auth0 login screen.

The customized Auth0 login form used by Neo4j Aura

So when it came to redeveloping Neo4j GraphAcademy, it made sense for me to use the same services. The GraphAcademy site itself is built in TypeScript, using Express.js and backed by a Neo4j Aura database for saving course information and user enrollment information.

Adding Authentication is Easy

The Auth0 service itself is pretty straightforward. I added authentication to the site within minutes using the express-openid-connect library by following the quickstart example in the Auth0 management console.

const { auth } = require('express-openid-connect');

app.use(
  auth({
    issuerBaseURL: 'https://YOUR_DOMAIN',
    baseURL: 'https://YOUR_APPLICATION_ROOT_URL',
    clientID: 'YOUR_CLIENT_ID',
    secret: 'LONG_RANDOM_STRING'
  })
);

Because the authentication is handled by Auth0 and JWT tokens, I didn’t necessarily need to store any information in my Neo4j instance until the user becomes active by enrolling. At the point where a user enrolls in a course, I could just decode the JWT token provided by Auth0 to get the sub (subject in JWT terms, or the unique ID of the user).
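To make the sub claim concrete, here is a minimal sketch of what decoding a JWT payload involves. It is an illustration only: it does not verify the signature, and in the application itself express-openid-connect validates the token and exposes the claims on req.oidc.user. The function name decodeJwtPayload and the IdTokenClaims interface are made up for this example.

```typescript
// A subset of the claims found in an ID token; real tokens carry more.
interface IdTokenClaims {
  sub: string;
  email?: string;
  email_verified?: boolean;
}

// Decode the payload (middle segment) of a JWT WITHOUT verifying its signature.
// Hypothetical helper for illustration; never trust unverified claims in production.
function decodeJwtPayload(token: string): IdTokenClaims {
  const segments = token.split('.');
  if (segments.length !== 3) {
    throw new Error('Expected a JWT with three dot-separated segments');
  }

  // JWTs use base64url: swap the URL-safe characters back and restore padding
  let base64 = segments[1].replace(/-/g, '+').replace(/_/g, '/');
  while (base64.length % 4 !== 0) {
    base64 += '=';
  }

  // atob returns one character per byte; percent-encode each byte so that
  // decodeURIComponent can interpret the bytes as UTF-8
  const percentEncoded = atob(base64)
    .split('')
    .map((ch) => '%' + ('00' + ch.charCodeAt(0).toString(16)).slice(-2))
    .join('');

  return JSON.parse(decodeURIComponent(percentEncoded)) as IdTokenClaims;
}
```

The sub value returned here is the same one that appears as user.sub in the route handler.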

Here is an abridged version of the enrollment route handler:

import { requiresAuth } from 'express-openid-connect'

// An oidc object containing the user is added to `req` by express-openid-connect
router.post('/:course/enrol',
  requiresAuth(),
  async (req, res, next) => {
    try {
      const user = req.oidc.user

      // Create the (:User)-[:HAS_ENROLMENT]->(:Enrolment)-[:FOR_COURSE]->(c) pattern in Neo4j
      const enrolment = await enrolInCourse(req.params.course, user)

      // Redirect the user to the first lesson
      res.redirect(enrolment.next.link)
    }
    catch (e) {
      next(e)
    }
  }
)

So far so good!

Multiple Identities, Oops.

As the site got more popular, some users complained that they couldn’t see their enrollments. Looking at the data, I spotted something curious: a number of User nodes shared the same email address.

neo4j$ MATCH (u:User)
       WHERE u.email = 'adam@neo4j.com'
       RETURN u.email, u.sub;

│ u.email        │ u.sub                               │
│ adam@neo4j.com │ google-oauth2|113046196349780988147 │
│ adam@neo4j.com │ auth0|625587d086c092006f19966d      │

It turns out that it is possible to register twice with the same email. If you take a look at the sub, it is formed of two parts separated by a pipe (|): the authentication method (e.g. google-oauth2 for Continue with Google) and the unique user ID provided by that service.
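As a small illustration of that structure, the sketch below splits a sub into its two parts. The names parseSub and ParsedSub are hypothetical and not part of any Auth0 library. It splits on the first pipe only, on the assumption that the ID portion could itself contain a pipe for some connection types.

```typescript
interface ParsedSub {
  provider: string;       // e.g. "google-oauth2" or "auth0"
  providerUserId: string; // the ID issued by that provider
}

// Hypothetical helper: split a sub of the form "<provider>|<id>" on the first pipe.
function parseSub(sub: string): ParsedSub {
  const separator = sub.indexOf('|');
  if (separator <= 0 || separator === sub.length - 1) {
    throw new Error('Unexpected sub format: ' + sub);
  }
  return {
    provider: sub.slice(0, separator),
    providerUserId: sub.slice(separator + 1),
  };
}
```

Applied to the two rows above, this yields the providers google-oauth2 and auth0, which is exactly why the two accounts are treated as different identities.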

Rightly or wrongly, when using a different authentication method, Auth0 treats these as separate entities with a different sub and makes no attempt to reconcile the profiles. That’s fair enough, but something worth noting.

So it falls upon us to handle these cases. Should we reconcile these on our end? Or should the same email with different authentication methods be treated as different people? After all, anyone could sign up with my email address and their own password. I spoke to a few people within the company on how I should manage these cases and got as many opinions back.

Identity and Access Management (IAM) in Neo4j

As a former member of the Neo4j Professional Services Team, I’ve worked on a few Identity and Access Management (IAM) and Entity Resolution (ER) projects.

A structure of Nodes and Relationships is a great way to handle complex authentication and authorisation data. In the past, I have imported complex Active Directory data into Neo4j for analysis and created complex trees of Users belonging to Groups that can have individual privileges granted or denied, and even inherited from the groups they belong to.

Identity & Access Management

There must be a simple solution to this problem, I thought.

Handling Multiple Identities

There’s a certain level of serendipity involved when working with graphs, and that always makes me happy I chose to build applications on top of a graph.

The data model used on GraphAcademy is pretty simple. Here is an abstraction of the User management part of the Graph. When a user enrolls in a Course, a node is created with an :Enrolment label, which provides a connection between the User and the Course they have enrolled in.

(:User)-[:HAS_ENROLMENT]->(:Enrolment)-[:FOR_COURSE]->(:Course)

The User node has a unique ID generated by Cypher’s randomUUID() function, along with the sub and email provided by Auth0. The id and sub properties both have unique constraints applied to them to ensure an identity is not duplicated, and also to enable faster lookups at query time.

MATCH (u:User {sub: $sub})-[:HAS_ENROLMENT]->(e:Enrolment)-[:FOR_COURSE]->(c:Course)
RETURN c {
  .*,
  completed: e:CompletedEnrolment,
  modules: [ (c)-[:HAS_MODULE]->(m) | m {
    .*,
    completed: exists( (e)-[:COMPLETED_MODULE]->(m) ),
    lessons: [ (m)-[:HAS_LESSON]->(l) | l {
      .*,
      completed: exists( (e)-[:COMPLETED_LESSON]->(l) )
    } ]
  } ]
}

The above query gets all of the user’s enrollments and, for each one, returns a map projection including all properties (.*) of the Course node. It checks for a :CompletedEnrolment label on the Enrolment node to signify that the user has completed the course, and then uses pattern comprehensions to get information about the modules and lessons within the course.

If you want to learn more about the techniques used in the query above, you can enroll in the Cypher Intermediate Queries course on GraphAcademy.

Alias Accounts

There are a few ways to handle this problem. You could store an array of subs on the original User node, or maybe create an (:Email) node and link both users to that. But I’m a fan of keeping it simple.

Instead, I went for the approach of creating a :HAS_ALIAS relationship between the two nodes with the same email.

Links between Google and Auth Profiles

This takes a few lines of Cypher to generate:

// Get all users ordered by their creation date
MATCH (u:User)
WITH u
ORDER BY u.createdAt ASC

// Generate a list of users grouped by their email
WITH u.email AS email, collect(u) AS users

// Where duplicate users have this email address
WHERE size(users) > 1

// Get the first created user and a list of all others
WITH head(users) AS head, tail(users) AS tail

// Merge a HAS_ALIAS relationship between the two
FOREACH (u in tail | MERGE (head)-[:HAS_ALIAS]-(u))
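For readers less familiar with Cypher, the same grouping logic can be sketched in plain TypeScript over an in-memory list. This only illustrates what the query does and is not code from the GraphAcademy site; the UserRecord shape, the numeric createdAt, and the findAliasPairs name are assumptions made for the example.

```typescript
// Assumed in-memory shape of a User node, for illustration only
interface UserRecord {
  sub: string;
  email: string;
  createdAt: number; // assumed to be an epoch timestamp
}

// Returns [original, alias] pairs: each later account with a given email
// is paired with the earliest account created with that email.
function findAliasPairs(users: UserRecord[]): Array<[UserRecord, UserRecord]> {
  // Order by creation date, oldest first (ORDER BY u.createdAt ASC)
  const sorted = users.slice().sort((a, b) => a.createdAt - b.createdAt);

  // Group by email (WITH u.email AS email, collect(u) AS users)
  const byEmail = new Map<string, UserRecord[]>();
  sorted.forEach((user) => {
    const group = byEmail.get(user.email) || [];
    group.push(user);
    byEmail.set(user.email, group);
  });

  const pairs: Array<[UserRecord, UserRecord]> = [];
  byEmail.forEach((group) => {
    // Only emails with duplicates (WHERE size(users) > 1)
    if (group.length > 1) {
      const original = group[0];
      // One pair per remaining account (FOREACH ... MERGE)
      group.slice(1).forEach((alias) => pairs.push([original, alias]));
    }
  });
  return pairs;
}
```

Note that every alias is linked to the oldest account rather than to the other aliases, so the result is a star shape around the original user.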

Utilizing the Good Old *0..1 Technique

The moment of serendipity here came when I noticed that I could just use what I like to call the *0..1 technique to unify the user’s enrollments at query time.

The Star-zero-dot-dot-one technique? I don’t say it aloud too often.

Variable-length paths are what graphs are designed to deal with. In Cypher, you can represent a variable-length path using an asterisk (*) and lower and upper limits. For example, if I use the pattern (start)-[:NEXT*2..5]->(end), Neo4j will expand the :NEXT relationships in an outgoing direction from 2 to 5 hops and provide one row for each end node at the end of the path.

By setting the lower bound to 0, you are including the start node in the result set. The query below starts with the user with the supplied sub property (think the currently logged-in user). Then, from that node or any node 1 degree away through the :HAS_ALIAS relationship, it follows the :HAS_ENROLMENT relationship to find their enrollments.

So all we have to do is to adjust our pattern for course information a bit at the beginning and call it a day.

MATCH (:User {sub: $sub})-[:HAS_ALIAS*0..1]-(u)-[:HAS_ENROLMENT]->(e:Enrolment)-[:FOR_COURSE]->(c:Course)

This way, if I enroll in one course with my Twitter account and another with my GitHub account, they will both appear within the same result set as long as there is a :HAS_ALIAS relationship between the two users.

No drastic changes to the data model and only approximately 20 more characters were added to my existing query.
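To show what the 0..1 bounds mean in practice, here is a rough in-memory sketch of the same expansion. It is not how Neo4j evaluates the pattern internally; the data shapes, function names, and course slugs in the usage note are invented for the example.

```typescript
// An undirected HAS_ALIAS relationship between two user subs (assumed shape)
type AliasEdge = [string, string];

// The users matched by (:User {sub})-[:HAS_ALIAS*0..1]-(u)
function usersWithinOneAlias(sub: string, aliases: AliasEdge[]): string[] {
  const reachable = [sub]; // 0 hops: the start node itself
  for (const [a, b] of aliases) {
    // 1 hop, ignoring the relationship direction
    if (a === sub && reachable.indexOf(b) === -1) reachable.push(b);
    if (b === sub && reachable.indexOf(a) === -1) reachable.push(a);
  }
  return reachable;
}

// Follow HAS_ENROLMENT from every matched user; duplicates are kept,
// much as the Cypher query returns one row per matched path.
function enrolledCourses(
  sub: string,
  aliases: AliasEdge[],
  enrolments: Record<string, string[]>
): string[] {
  const courses: string[] = [];
  for (const user of usersWithinOneAlias(sub, aliases)) {
    for (const course of enrolments[user] || []) {
      courses.push(course);
    }
  }
  return courses;
}
```

One consequence worth keeping in mind: if three accounts are linked in a star around the oldest one, logging in as the oldest account reaches all three, but logging in as one alias reaches only itself and the original, not the sibling alias. If that case matters, widening the upper bound to 2 would be one way to cover it.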

Email Verification

I mentioned earlier that anybody could sign up with an email address and password. When you create an account with Auth0, you can force email verification.

Verify Emails using Auth0

Once the email is verified, it is safe to create this relationship between the two users. All enrollments related to both user nodes will be immediately available to the user currently logged into the site.
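One way to express that rule in code is a small guard that runs before the relationship is created. This is a sketch under assumptions rather than the site's actual logic: it relies on the standard OpenID Connect email_verified claim being present in the user's profile, and the function name canLinkAsAlias is made up for this example.

```typescript
// The subset of profile claims this check needs
interface ProfileClaims {
  sub: string;
  email?: string;
  email_verified?: boolean;
}

// Hypothetical guard: only link two identities when both have verified
// the same email address. Uses an exact match, like the Cypher grouping.
function canLinkAsAlias(a: ProfileClaims, b: ProfileClaims): boolean {
  if (a.sub === b.sub) return false;       // same identity, nothing to link
  if (!a.email || !b.email) return false;  // no email to compare
  if (a.email_verified !== true || b.email_verified !== true) return false;
  return a.email === b.email;
}
```

With a check like this, someone who signs up with another person's email address and their own password never gets linked, because they cannot complete the verification step.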

Free, Hands-on Courses With Neo4j GraphAcademy

If you are interested in learning more about Neo4j, check out the Beginners learning path to learn everything you need to know to be successful with Neo4j. We also offer App Development courses for Developers and Data Scientists.

Free, Self-Paced, Hands-on Online Training


Handling Authentication and Identity with Neo4j and Auth0 was originally published in Neo4j Developer Blog on Medium.