Limitations
This section lists the known limitations and implications of Neo4js role-based access control security.
Security and Indexes
As described in Indexes for search performance, Neo4j 5 supports the creation and use of indexes to improve the performance of Cypher® queries.
Note that the Neo4j security model impacts the results of queries, regardless if the indexes are used or not. When using non full-text Neo4j indexes, a Cypher query will always return the same results it would have if no index existed. This means that, if the security model causes fewer results to be returned due to restricted read access in Graph and sub-graph access control, the index will also return the same fewer results.
However, this rule is not fully obeyed by Indexes for full-text search. These specific indexes are backed by Lucene internally. It is therefore not possible to know for certain whether a security violation has affected each specific entry returned from the index. In face of this, Neo4j will return zero results from full-text indexes in case it is determined that any result might be violating the security privileges active for that query.
Since full-text indexes are not automatically used by Cypher, they do not lead to the case where the same Cypher query would return different results simply because such an index was created. Users need to explicitly call procedures to use these indexes. The problem is only that, if this behavior is not known by the user, they might expect the full-text index to return the same results that a different, but semantically similar, Cypher query does.
Example with denied properties
Consider the following example.
The database has nodes with labels :User
and :Person
, and they have properties name
and surname
.
There are indexes on both properties:
CREATE INDEX singleProp FOR (n:User) ON (n.name);
CREATE INDEX composite FOR (n:User) ON (n.name, n.surname);
CREATE FULLTEXT INDEX userNames FOR (n:User|Person) ON EACH [n.name, n.surname];
Full-text indexes support multiple labels. See Indexes for full-text search for more details on creating and using full-text indexes. |
After creating these indexes, it would appear that the latter two indexes accomplish the same thing. However, this is not completely accurate. The composite and full-text indexes behave in different ways and are focused on different use cases. A key difference is that full-text indexes are backed by Lucene, and will use the Lucene syntax for querying.
This has consequences for users restricted on the labels or properties involved in the indexes. Ideally, if the labels and properties in the index are denied, they can correctly return zero results from both native indexes and full-text indexes. However, there are borderline cases where this is not as simple.
Imagine the following nodes were added to the database:
CREATE (:User {name: 'Sandy'});
CREATE (:User {name: 'Mark', surname: 'Andy'});
CREATE (:User {name: 'Andy', surname: 'Anderson'});
CREATE (:User:Person {name: 'Mandy', surname: 'Smith'});
CREATE (:User:Person {name: 'Joe', surname: 'Andy'});
Consider denying the label :Person
:
DENY TRAVERSE ON GRAPH * NODES Person TO users
If the user runs a query that uses the native single property index on name
:
MATCH (n:User) WHERE n.name CONTAINS 'ndy' RETURN n.name
This query performs several checks:
-
Scans the index to create a stream of results of nodes with the
name
property, which leads to five results. -
Filters the results to include only nodes where
n.name CONTAINS 'ndy'
, filtering outMark
andJoe
, which leads to three results. -
Filters the results to exclude nodes that also have the denied label
:Person
, filtering outMandy
, which leads to two results.
Two results will be returned from this dataset and only one of them has the surname
property.
In order to use the native composite index on name
and surname
, the query needs to include a predicate on the surname
property as well:
MATCH (n:User)
WHERE n.name CONTAINS 'ndy' AND n.surname IS NOT NULL
RETURN n.name
This query performs several checks, which are almost identical to the single property index query:
-
Scans the index to create a stream of results of nodes with the
name
andsurname
property, which leads to four results. -
Filters the results to include only nodes where
n.name CONTAINS 'ndy'
, filtering outMark
andJoe
, which leads to two results. -
Filters the results to exclude nodes that also have the denied label
:Person
, filtering outMandy
, which leads to only one result.
Only one result was returned from the above dataset. What if this query with the full-text index was used instead:
CALL db.index.fulltext.queryNodes("userNames", "ndy") YIELD node, score
RETURN node.name
The problem now is that it is not certain whether the results provided by the index were achieved due to a match to the name
or the surname
property.
The steps taken by the query engine would be:
-
Run a Lucene query on the full-text index to produce results containing
ndy
in either property, leading to five results. -
Filter the results to exclude nodes that also have the label
:Person
, filtering outMandy
andJoe
, leading to three results.
This difference in results is caused by the OR
relationship between the two properties in the index creation.
Denying properties
Now consider denying access on properties, like the surname
property:
DENY READ {surname} ON GRAPH * TO users
For that, run the same queries again:
MATCH (n:User)
WHERE n.name CONTAINS 'ndy'
RETURN n.name
This query operates exactly as before, returning the same two results, because nothing in it relates to the denied property.
However, this is not the same for the query targeting the composite index:
MATCH (n:User)
WHERE n.name CONTAINS 'ndy' AND n.surname IS NOT NULL
RETURN n.name
Since the surname
property is denied, it will appear to always be null
and the composite index empty. Therefore, the query returns no result.
Now consider the full-text index query:
CALL db.index.fulltext.queryNodes("userNames", "ndy") YIELD node, score
RETURN node.name
The problem remains, since it is not certain whether the results provided by the index were returned due to a match on the name
or the surname
property.
Results from the surname
property now need to be excluded by the security rules, because they require that the user is unable to see any surname
properties.
However, the security model is not able to introspect the Lucene query in order to know what it will actually do, whether it works only on the allowed name
property, or also on the disallowed surname
property.
What is known is that the earlier query returned a match for Joe Andy
which should now be filtered out.
Therefore, in order to never return results the user should not be able to see, all results need to be blocked.
The steps taken by the query engine would be:
-
Determine if the full-text index includes denied properties.
-
If yes, return an empty results stream. Otherwise, it will process as described before.
In this case, the query will return zero results rather than simply returning the results Andy
and Sandy
, which might have been expected.
Security and labels
Traversing the graph with multi-labeled nodes
The general influence of access control privileges on graph traversal is described in detail in Graph and sub-graph access control. The following section will only focus on nodes due to their ability to have multiple labels. Relationships can only have one type of label and thus they do not exhibit the behavior this section aims to clarify. While this section will not mention relationships further, the general function of the traverse privilege also applies to them.
For any node that is traversable, due to GRANT TRAVERSE
or GRANT MATCH
,
the user can get information about the attached labels by calling the built-in labels()
function.
In the case of nodes with multiple labels, they can be returned to users that weren’t directly granted access to.
To give an illustrative example, imagine a graph with three nodes: one labeled :A
, another labeled :B
and one with the labels :A
and :B
.
In this case, there is a user with the role custom
defined by:
GRANT TRAVERSE ON GRAPH * NODES A TO custom
If that user were to execute
MATCH (n:A)
RETURN n, labels(n)
They would get a result with two nodes: the node that was labeled with :A
and the node with labels :A :B
.
In contrast, executing
MATCH (n:B)
RETURN n, labels(n)
This will return only the one node that has both labels: :A
and :B
.
Even though :B
did not have access to traversals, there is one node with that label accessible in the dataset due to the allow-listed label :A
that is attached to the same node.
If a user is denied to traverse on a label they will never get results from any node that has this label attached to it. Thus, the label name will never show up for them. As an example, this can be done by executing:
DENY TRAVERSE ON GRAPH * NODES B TO custom
The query
MATCH (n:A)
RETURN n, labels(n)
will now return the node only labeled with :A
, while the query
MATCH (n:B)
RETURN n, labels(n)
will now return no nodes.
The db.labels() procedure
In contrast to the normal graph traversal described in the previous section, the built-in db.labels()
procedure is not processing the data graph itself, but the security rules defined on the system graph.
That means:
-
If a label is explicitly whitelisted (granted), it will be returned by this procedure.
-
If a label is denied or isn’t explicitly allowed, it will not be returned by this procedure.
Reusing the previous example, imagine a graph with three nodes: one labeled :A
, another labeled :B
and one with the labels :A
and :B
.
In this case, there is a user with the role custom
defined by:
GRANT TRAVERSE ON GRAPH * NODES A TO custom
This means that only label :A
is explicitly allow-listed.
Thus, executing
CALL db.labels()
will only return label :A
, because that is the only label for which traversal was granted.
Security and count store operations
The rules of a security model may impact some of the database operations. This means extra security checks are necessary to incur additional data accesses, especially in the case of count store operations. These are, however, usually very fast lookups and the difference might be noticeable.
See the following security rules that use a restricted
and a free
role as an example:
GRANT TRAVERSE ON GRAPH * NODES Person TO restricted;
DENY TRAVERSE ON GRAPH * NODES Customer TO restricted;
GRANT TRAVERSE ON GRAPH * ELEMENTS * TO free;
Now, let’s look at what the database needs to do in order to execute the following query:
MATCH (n:Person)
RETURN count(n)
For both roles the execution plan will look like this:
+--------------------------+ | Operator | +--------------------------+ | +ProduceResults | | | + | +NodeCountFromCountStore | +--------------------------+
Internally, however, very different operations need to be executed. The following table illustrates the difference:
User with free role |
User with restricted role |
---|---|
The database can access the count store and retrieve the total number of nodes with the label This is a very quick operation. |
The database cannot access the count store because it must make sure that only traversable nodes with the desired label Due to the additional data accesses that the security checks need to do, this operation will be slower compared to executing the query as an unrestricted user. |
Was this page helpful?