Tracing the World’s Food Supply from Farm to Fork (with Neo4j)

Editor’s Note: Last October at GraphConnect San Francisco, Chris Morrison and Julien Mazerolle of Trace One delivered this presentation on how Trace One uses Neo4j to increase food supply chain transparency.


Chris Morrison: Trace One is an enterprise software company that produces applications for the retail and consumer packaged goods industry. We use Neo4j inside applications to power our work with large customers.

Food Safety and Fraud

Consumers get bombarded daily with news regarding the food supply chain, which has become extremely complex in recent years. Today, there are far more incidents of food recalls and fraud because of increasingly lengthy supply chains, which can run anywhere from five to 12 levels deep. We often think about primary manufacturers when making food choices, but most companies only have visibility into one or two layers of their supply chain.

There are a number of health safety issues related to the food that we eat:

Growing Issues in Food Safety & Supply Chain Transparency

Salmonella was recently found in tomatoes and peppers throughout the U.S. and Canada, which impacted 1,500 people and cost $250 million to recall. The E. coli found in cookie dough hospitalized 25 people, made 70 people sick and resulted in the recall of 86 million cookies at a cost of $300 million.

With the growth of the middle class and an increase in protein demand, we frequently see horsemeat being passed off as beef. The CEO of a peanut manufacturer was sentenced to 28 years in prison for knowingly shipping salmonella-tainted peanuts, and a BBC study found that 25 percent of all shipped oregano actually contained other ingredients.

These issues are impacting the finances and health of consumers, and at Trace One, we want to go after these problems and solve them.

A Lack of Consumer Trust

Trace One has been working with the largest retailers and brands in the food industry for the last 15 years. Each year, 20,000 companies in 100 countries spend around $300 billion on our platform. Historically we’ve focused on the interaction between the retailer and the manufacturer. Moving forward with our new Transparency-One application, we’ll map the food supply chain completely from farm to fork and use that information to create a healthier and safer supply chain.

How do consumers feel about all of the health and fraud incidents taking place in the food market? Trace One recently polled 3,000 global consumers in the U.S., Canada, Western Europe and Brazil, and the results surprised us.

Only 12 percent of consumers wholeheartedly trust the safety of their food, and only 10 percent wholeheartedly trust its quality. This means that roughly nine out of 10 of us are nervous about what we’re eating.

Consumer Trust Is Low for the Food Supply

Consumers want more information. We want to know where food is produced, how it’s produced and what its ingredients are. So while 91 percent of consumers said it was important to know where their food comes from, nearly two-thirds said they’re not provided with enough information about the food supply chain and 27 percent don’t believe food labels.

Consumer Trust Is Important from Farm to Fork

But here’s the good news: More information helps drive consumer trust. More than one-third of the consumers we polled said that they would be willing to pay more for better information about the food they consume, whether they’re located in France, the U.S., Brazil or Spain. Whole Foods and Trader Joe’s go a long way to provide information about their food, and their businesses are thriving as a result.

The following video explains how Trace One addresses the global issue of food safety:

Transparency-One: Finding Transparency in Food Supply Chains

We recently launched Transparency-One, which is the first B2B social network for supply chain transparency. To ensure this solution is adopted industry-wide, we are working with industry influencers and large companies such as SGS, which is the world’s leading testing and auditing company for products.

Below is an image of a standard supply chain. With any end product, there is typically a first level of manufacturers that does final processing and packaging. The supply chain can then become extremely complex.

The Complexity of the Food Supply Chain

Unfortunately, at each of these steps in the supply chain, most companies do not have the visibility to identify suppliers, ingredients, countries of origin or the facilities where the products were made, let alone whether those facilities were audited and certified against global food safety standards. Many companies only have visibility into that first or second level.

After a few attempts at mapping supply chains in a relational database, we realized that the data actually looked more like a graph, so we decided to take a different approach from a technology perspective. On one side, we provide the visibility to look down the supply chain, map the connections and identify the nodes. What is even more powerful is the ability to look back up the supply chain to perform a search.

One of our earliest conversations on the topic was with a company in a potential fish recall situation. They had to track a raw ingredient from a particular country during a particular timeframe and figure out if and where the product showed up in all their finished packaged goods. This potential recall situation took three weeks to analyze and required digging through 17 different systems, which is a long time when consumer safety is at stake.

But with Transparency-One, this analysis can happen in a matter of seconds. It has the potential to dramatically increase the safety of the industry and completely transform the way we look at food supply chains.

The 3V Challenge

Julien Mazerolle: When first looking at the supply chain transparency challenge, it may seem fairly straightforward. Managing a two-tier, farm-to-fork supply chain is easy and can be done in any SQL database. Running a query that limits the search to two levels allows you to easily find the products that contain mozzarella, for example.

A Two-Tier Supply Chain Is Easy for a SQL Database
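To make the two-tier case concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `product` and `ingredient` tables, their columns and the sample rows are assumptions made up for illustration; they are not Trace One's schema.

```python
import sqlite3

# Hypothetical two-table schema; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ingredient (product_id INTEGER, name TEXT);
    INSERT INTO product VALUES
        (1, 'pizza'), (2, 'tomato salad'), (3, 'caprese sandwich');
    INSERT INTO ingredient VALUES
        (1, 'dough'), (1, 'tomato sauce'), (1, 'mozzarella'),
        (2, 'tomato'),
        (3, 'bread'), (3, 'mozzarella'), (3, 'tomato');
""")

def products_containing(ingredient_name):
    """Return products that list the ingredient directly (one JOIN, fixed depth)."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM product AS p
        JOIN ingredient AS i ON i.product_id = p.id
        WHERE i.name = ?
        ORDER BY p.name
        """,
        (ingredient_name,),
    ).fetchall()
    return [name for (name,) in rows]

print(products_containing("mozzarella"))  # ['caprese sandwich', 'pizza']
print(products_containing("tomato"))      # ['caprese sandwich', 'tomato salad']
```

Note that the search for tomato misses the pizza, because in this sample data the tomato is hidden inside the tomato sauce. A fixed-depth query only sees the levels it was written for.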

However, real supply chains are much more complex. When you explore the pizza example further, you see that it’s made of dough, tomato sauce and mozzarella. The dough has different suppliers and ingredients, which then may each have their own suppliers and ingredients. This also applies to the tomato sauce and mozzarella cheese.

A Variable-Tier Supply Chain Requires a Graph Database

The challenge we face when we look at the food supply chain is a “V to the power of three” challenge, because you have to combine three variable numbers that can change significantly from one supply chain to another.

First, you can have a variable number of ingredients at each level. This example is for a simple pizza recipe with three ingredients, but you could have a pizza with as many as eight or nine ingredients. The second variable number is the number of levels per ingredient. Some ingredients are raw materials (for example, a tomato), and in that case you’ve reached the end of the supply chain. But an ingredient such as tomato sauce, which is itself made from a variety of ingredients, is much more complex. The third variable number is the number of suppliers for each ingredient.

When you combine these three variables, you end up with billions of nodes that will be mapped for thousands of products.
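A rough back-of-the-envelope sketch shows how quickly the three variables multiply. It assumes a deliberately simplified model that is not from the talk: every item has the same number of ingredients, every ingredient has the same number of suppliers, and nothing is shared between branches. Real supply chains share suppliers and vary from level to level, so treat the result as an order-of-magnitude illustration only.

```python
def supply_chain_nodes(ingredients, suppliers, levels):
    """Count nodes in a uniform supply chain tree for one product.

    Assumed model: each node fans out into `ingredients * suppliers`
    children at every level, and the root product sits at level 0.
    """
    branching = ingredients * suppliers
    return sum(branching ** depth for depth in range(levels + 1))

# Simple pizza: 3 ingredients, 2 suppliers each, 5 levels deep.
print(supply_chain_nodes(3, 2, 5))  # 9331

# Richer recipe: 8 ingredients, 3 suppliers each, 6 levels deep.
print(supply_chain_nodes(8, 3, 6))  # 199411801
```

Under these assumptions a single complex product already approaches 200 million nodes, so a catalog of thousands of products reaching billions of nodes is plausible.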

Why SQL Doesn’t Cut It: Using Neo4j to Address a Food Recall Crisis

In this sample case, there is a crisis with tomatoes and our company needs to find all our manufactured products that incorporated these tomatoes.

Running this search in SQL is complex because we don’t know at which level the tomato is located. If it’s a tomato salad, it will be at the first level, while if it’s the sauce in a pizza it could be at the fifth level. The tomato could even be part of an intermediate product and not be explicitly listed, which adds to the complexity of the query. To capture the fact that tomatoes could appear at multiple levels, a search in SQL would require a large number of JOINs.

Because this search also points to a graph, you would also need to simulate a graph in the SQL database. And finally, food supply chains are not stable, meaning that the recipes are continually changing from day to day and month to month.
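The JOIN-per-level problem described above can be sketched with sqlite3. The single `contains` table and its rows are hypothetical, and the helper builds one query per depth to show that every extra level costs one more self-join. Some SQL engines also offer recursive queries as an alternative; this sketch sticks to the fixed-JOIN approach the talk describes.

```python
import sqlite3

# Hypothetical edge table: `parent` is made with `child`.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contains (parent TEXT, child TEXT);
    INSERT INTO contains VALUES
        ('tomato salad', 'tomato'),
        ('pizza', 'tomato sauce'),
        ('tomato sauce', 'tomato puree'),
        ('tomato puree', 'tomato');
""")

def query_for_depth(depth):
    """Build a SELECT for items exactly `depth` levels above an ingredient."""
    # One additional self-join is needed for every level beyond the first.
    joins = "".join(
        f" JOIN contains AS c{i} ON c{i}.child = c{i + 1}.parent"
        for i in range(depth - 1, 0, -1)
    )
    return (
        f"SELECT c1.parent FROM contains AS c{depth}{joins} "
        f"WHERE c{depth}.child = ?"
    )

def ancestors_up_to(ingredient, max_depth):
    """Run one query per level, because the depth is not known in advance."""
    found = set()
    for depth in range(1, max_depth + 1):
        rows = conn.execute(query_for_depth(depth), (ingredient,))
        found.update(row[0] for row in rows)
    return sorted(found)

print(ancestors_up_to("tomato", 1))  # ['tomato puree', 'tomato salad']
print(ancestors_up_to("tomato", 3))  # ['pizza', 'tomato puree', 'tomato salad', 'tomato sauce']
```

The pizza only appears once the search is allowed to go three levels up, and the caller has to guess that maximum depth ahead of time.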

If a tomato crisis emerges, suddenly thousands of users will be connecting to the SQL database to make the exact same query, which will place a huge workload on your database. Another concern is the execution time. In the food industry it’s standard to have billions of nodes; consider a large supermarket, which may have more than 200,000 products in its stores, each of which corresponds to a node. So with billions of nodes and thousands of users connecting at the same time, the execution time will be extremely slow.

This is where Neo4j comes in.

The following query can be written in Cypher. First you search for the ingredient “tomato,” and then any product that contains that ingredient and then the brands that contain those products:

A Neo4j Query Tracing Tomatoes from Farm to Fork
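As a language-neutral illustration of those three steps (ingredient, then products, then brands), here is a plain-Python sketch of the traversal. It is not the Cypher query from the talk. The `USED_IN` and `BRAND_OF` mappings and the brand names are invented sample data.

```python
from collections import deque

# Hypothetical edges: each key is used in every item of its list.
USED_IN = {
    "tomato": ["tomato salad", "tomato puree"],
    "tomato puree": ["tomato sauce"],
    "tomato sauce": ["pizza"],
}
# Hypothetical brand ownership of finished products.
BRAND_OF = {"tomato salad": "Brand A", "pizza": "Brand B"}

def affected_brands(ingredient):
    """Walk upward from an ingredient at any depth; return {product: brand}."""
    seen = {ingredient}
    queue = deque([ingredient])
    result = {}
    while queue:
        item = queue.popleft()
        for parent in USED_IN.get(item, []):
            if parent in seen:
                continue  # already visited through another path
            seen.add(parent)
            queue.append(parent)
            if parent in BRAND_OF:
                result[parent] = BRAND_OF[parent]
    return result

print(affected_brands("tomato"))
```

Unlike the fixed-depth SQL search, the traversal never needs to know how many levels separate the tomato from the finished product. It simply follows relationships until there are none left.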

Cypher is a language that is perfectly suited for this type of query, and graph databases are particularly well-suited to the food supply chain. And last but not least, the database scales extremely well for more than one billion nodes. For these reasons, Trace One chose Neo4j to power the Transparency-One solution.

Private Workspaces

On top of the Neo4j database and Cypher, each member of the supply chain has a private workspace that allows a company to publish information for their customers and invite their suppliers. However, the information can also be kept private. It’s this notion of sharing information versus keeping it private that is the difference between a transparent and non-transparent supplier.

The Transparency-One Application Built with Neo4j

This module is standard regardless of where you sit in the supply chain; whether you are at the top or the bottom, you have the ability to publish information to customers and invite suppliers.

In a real supply chain, everyone from the suppliers to the brand owners is connected in the same way:

The Global Supply Chain Organized as a Network with Trace One

We can also see that the notion of levels in a supply chain is in fact non-existent. For example, mayonnaise is a level one ingredient for supplier C but a level two ingredient for supplier E, which makes ham sandwiches. With this type of data modeling, you completely erase the notion of levels, and the power of the graph database is that it is able to handle all this.
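The point that a level is relative to where you stand can be sketched in a few lines of Python. Only mayonnaise and ham sandwiches come from the talk; the "sandwich spread" product (standing in for whatever supplier C makes) and the other ingredients are assumptions added for illustration.

```python
from collections import deque

# Hypothetical "is made from" edges; "sandwich spread" is an invented product.
MADE_FROM = {
    "ham sandwich": ["bread", "ham", "sandwich spread"],
    "sandwich spread": ["mayonnaise", "mustard"],
    "mayonnaise": ["egg", "oil"],
}

def levels_from(product):
    """Return {ingredient: level} as seen from `product` (direct = level 1)."""
    levels = {}
    queue = deque([(product, 0)])
    while queue:
        item, depth = queue.popleft()
        for child in MADE_FROM.get(item, []):
            if child not in levels:  # keep the shortest distance
                levels[child] = depth + 1
                queue.append((child, depth + 1))
    return levels

print(levels_from("sandwich spread")["mayonnaise"])  # 1
print(levels_from("ham sandwich")["mayonnaise"])     # 2
```

The same mayonnaise node gets a different level depending on the starting point, so the level is computed during traversal rather than stored on the node.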

It’s important to understand that each company is at the center of its own network, just like in a social network. You are the center of your own transparency. Consider the following example in which the company is a supplier of tomato sauce:

A Company Network Outlined in Transparency-One

The big difference between the classical notion of a tier one supplier and a tier one supplier in a graph is that, in a graph, a tier is local. We may be a tier one supplier for the tomato sauce supplier but a tier three supplier for another product. It’s like a solar system in which you are the sun at the center of your own network, and you can manage your relationships with the planets. Transparency-One allows you to build connections and exchange information with the “planets” you have a relationship with.

Another thing to keep in mind about food transparency is that it is extremely dynamic, largely because of continually changing recipes. To address this, Trace One added a standard industry API, defined with CGI and SGS, on top of the graph database. This enables real-time information updates with standard ERP systems like Oracle or SAP.

The Standard API Offered by Transparency-One for Food Supply Chain Management

Now each time you change your recipe in your supply chain, it will also be updated in Transparency-One, which is crucial if you want to be able to effectively respond to any food crisis.

Transparency-One also continually checks for data freshness, which is as important as food freshness when it comes to farm-to-fork transparency. Our data freshness dashboard checks how current the information is across the entire graph. The results are reported and used to identify areas where we need to ask suppliers to update their information.

Data Freshness Is Critical for Food Supply Chain Management
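A freshness check of this kind could look roughly like the sketch below. The 90-day threshold, the supplier names and the shape of the `last_updated` mapping are all assumptions for illustration; the talk does not describe how the dashboard is implemented.

```python
from datetime import date, timedelta

def stale_suppliers(last_updated, today, max_age_days=90):
    """Return suppliers whose data is older than `max_age_days` (assumed limit)."""
    limit = timedelta(days=max_age_days)
    return sorted(
        name for name, updated in last_updated.items()
        if today - updated > limit
    )

# Hypothetical last-update dates per supplier.
records = {
    "Supplier A": date(2016, 1, 10),
    "Supplier B": date(2015, 6, 1),
}
print(stale_suppliers(records, today=date(2016, 3, 1)))  # ['Supplier B']
```

The returned list corresponds to the suppliers who would be asked to update their information.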

Ultimately, what we work toward is improving consumer trust and meeting consumer needs, based on six trust drivers:

Six Drivers of Food Consumer Trust

First, we have the Quality Process, which is embedded into the solution by all the information related to the ingredients, the suppliers and the certificates. Then, we have information on ingredients and raw materials specifications; countries of origin, which have different implications because there are different regulations in different countries; information tracking; label accuracy; and standards for certification and social responsibility. All of these work together to increase consumer safety and consumer trust, making the food supply chain safer for everyone.

Inspired by Chris and Julien’s talk? Register for GraphConnect Europe on 26 April 2016 for more industry-leading presentations and workshops on the evolving world of graph database technology.