How deletes work in Neo4j

Neo4j uses logical deletes to achieve maximum performance and scalability. To understand how this appears to an operator of the database, let's take a simple case of loading data into Neo4j. When you start loading data, you can see that the nodes are stored in a file called neostore.nodestore.db. As you keep loading, the file keeps growing.

However, once you start deleting nodes, you can verify that neostore.nodestore.db does not shrink. In fact, not only does its size remain the same, but you will also see a companion ID file start to grow, and keep growing with every record deleted.

This happens because of ID reuse. Deletes in Neo4j do not physically remove the records; they just flip a bit that marks the record as no longer in use. The deleted (but available to reuse) IDs are kept in a separate ID file, which acts somewhat like a "recycle bin" that stores all the deleted IDs.

Now that you’ve deleted the data, neostore.nodestore.db is the same size as before the delete, and the ID file is larger than before the delete operation. How do you reclaim this space?

When you start loading new data after the deletes, Neo4j starts reusing the IDs recorded in the ID file. As a result, neostore.nodestore.db does not grow, and the ID file shrinks until it is completely empty.
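
The delete-then-reuse cycle described above can be sketched with a small toy model. This is only an illustration of the idea, not Neo4j's actual storage code: the class ToyNodeStore and its records and free_ids fields are hypothetical names, where records stands in for the node store file and free_ids stands in for the list of deleted, reusable IDs.

```python
class ToyNodeStore:
    """Toy model of a record store with logical deletes and ID reuse.

    Illustration only; all names here are hypothetical and this is
    not how Neo4j is implemented internally.
    """

    def __init__(self):
        self.records = []   # stands in for the node store file
        self.free_ids = []  # stands in for the list of reusable IDs

    def create(self, payload):
        if self.free_ids:
            # Reuse the slot of a previously deleted record: the store does not grow.
            node_id = self.free_ids.pop()
            self.records[node_id] = {"in_use": True, "payload": payload}
        else:
            # No reusable ID available: append a new record, so the store grows.
            node_id = len(self.records)
            self.records.append({"in_use": True, "payload": payload})
        return node_id

    def delete(self, node_id):
        record = self.records[node_id]
        if not record["in_use"]:
            raise ValueError(f"record {node_id} is already deleted")
        # Logical delete: flip the flag and remember the ID; the slot stays in place.
        record["in_use"] = False
        self.free_ids.append(node_id)

    def sizes(self):
        """Return (number of record slots, number of reusable IDs)."""
        return len(self.records), len(self.free_ids)


store = ToyNodeStore()
ids = [store.create(f"node-{i}") for i in range(5)]
print(store.sizes())  # (5, 0)

for node_id in ids[:3]:
    store.delete(node_id)
print(store.sizes())  # (5, 3): the store is the same size, the ID list grew

for i in range(3):
    store.create(f"new-{i}")
print(store.sizes())  # (5, 0): the store did not grow, the ID list is empty again
```

In this model the number of record slots never goes down; only the list of reusable IDs grows and shrinks, which mirrors the file sizes an operator observes.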

If you do not plan to add more nodes but still want to shrink the size of the database on disk, you can use the copy store util. This utility will read an offline database, copy it to a new one, and leave out data that is no longer in use (and also the list of eligible ids to re-use).
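
Conceptually, this kind of copy can be sketched as follows. This is a toy model under stated assumptions, not the utility's real algorithm or interface: the function copy_store and the in_use flag are hypothetical names used only to show why the copy ends up smaller.

```python
def copy_store(records):
    """Return a compacted copy that keeps only records still in use.

    Toy illustration only; copy_store and the record layout are
    hypothetical and do not reflect the real utility's interface.
    """
    return [dict(record) for record in records if record["in_use"]]


# An "offline" store in which two of four records were logically deleted.
old_store = [
    {"in_use": True, "payload": "a"},
    {"in_use": False, "payload": "b"},
    {"in_use": True, "payload": "c"},
    {"in_use": False, "payload": "d"},
]

new_store = copy_store(old_store)
new_free_ids = []  # the copy starts with no list of reusable IDs

print(len(old_store), len(new_store))  # 4 2
```

Note that in this toy model the surviving records take new positions in the copy, and the original store is left untouched.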

Large deletes can generate a lot of transaction logs. Be aware of this when doing mass delete operations; otherwise, ironically, your filesystem can fill up.