Using the Neo4j graph database doesn’t have to be at odds with your existing Oracle RDBMS infrastructure. In fact, the two can work together.

One of the many ways Neo4j works alongside Oracle RDBMS is to keep all data fully synchronized between the two database technologies.

In this Neo4j and Oracle blog series, we’ll explore how these two database technologies work in tandem to deliver the best bottom-line results for enterprise architects and business teams alike. In previous weeks, we defined and introduced both Neo4j and Oracle RDBMS, covered three advantages of using Neo4j with Oracle, and explored how to migrate or sync a subset of your data between them.

This week, we’ll discuss the advantages of fully synchronizing the data between your Oracle RDBMS and Neo4j, including an example of how Monsanto (a customer of both Oracle and Neo4j) fully syncs its data between the two.

Why Sync Data between Oracle RDBMS and Neo4j


Applications that integrate data from multiple data sources are a common use case for full synchronization. Another use case arises when you have an existing set of applications writing to an Oracle database and changing those applications is cost prohibitive. For that data to deliver more value, new technologies need to be introduced in the areas where the Oracle RDBMS falls short. This was the case for Monsanto.

Case Study: How Monsanto Synced Neo4j and Oracle Exadata


Monsanto is a multinational leader in agrochemical and agricultural biotechnology. Prior to adopting Neo4j, Monsanto relied on a 96-CPU Oracle Exadata installation to host its core genetic ancestry data with plenty of stored procedures, JOIN tables, recursive queries and dual indexes to optimize performance.

The Monsanto team was well-versed in Oracle tuning and optimization, with more than 30 years of combined experience tuning Oracle RDBMS. However, the Exadata instance regularly failed to process genetic ancestry data in real time, a prerequisite if the team was to use a new genomic testing technique that could take a full year off its time-to-market cycle.

The team’s first attempt to generate real-time results was to build and parse gigantic in-memory graphs. But once a query was complete, the graphs disappeared. The team looked for a way to persist graph data over the long term and found Neo4j.

Within one day of discovering Neo4j, the team built a prototype with a small dataset. A month later, the team had the entire genetic ancestry dataset in the graph database for a beta-release application. However, even with the Neo4j deployment in full production, dozens of applications continued to read data from and write data to the Exadata environment. Instead of turning off these database connections all at once, the team built a custom API layer to keep the stream of information between Exadata and Neo4j in sync.

The team then introduced a valuable new query interface where their data scientists could execute deeply connected queries in a simple, keyword-driven way that wasn’t previously possible with SQL. The architecture uses Apache Kafka as a distributed commit log to feed Neo4j with live transactional data from Oracle. The team also built an Oracle GoldenGate and Kafka connector, which they open sourced and made available on GitHub.
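To make the commit-log idea concrete, here is a minimal sketch of replaying row-level change events into a graph. It is not Monsanto's connector: the `ChangeEvent` shape, its field names and the `InMemoryGraph` class are hypothetical stand-ins chosen for illustration. A real deployment would read events from a Kafka topic and write them to Neo4j through a driver, neither of which is shown here.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeEvent:
    """One row-level change captured from the relational side (hypothetical shape)."""
    offset: int              # position of the event in the commit log
    op: str                  # "upsert" or "delete"
    plant_id: str            # primary key of the changed row
    parent_ids: tuple = ()   # foreign keys pointing at parent plants


@dataclass
class InMemoryGraph:
    """Stand-in for the graph database: maps a child id to its set of parent ids."""
    parents: dict = field(default_factory=dict)
    last_offset: int = -1    # highest log offset applied so far

    def apply(self, event: ChangeEvent) -> bool:
        # Skip events that were already applied, so replaying the log is idempotent.
        if event.offset <= self.last_offset:
            return False
        if event.op == "upsert":
            self.parents[event.plant_id] = set(event.parent_ids)
        elif event.op == "delete":
            self.parents.pop(event.plant_id, None)
            # Also drop edges that pointed at the deleted plant.
            for parent_set in self.parents.values():
                parent_set.discard(event.plant_id)
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
        self.last_offset = event.offset
        return True


def replay(log, graph):
    """Apply events in offset order and return how many were newly applied."""
    return sum(graph.apply(event) for event in sorted(log, key=lambda e: e.offset))
```

Because the graph remembers the last offset it applied, feeding it the same log twice changes nothing the second time. That property is what lets a consumer restart after a failure and resume from the log without corrupting the graph.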

Using Oracle and Neo4j for Ancestry-as-a-Service at Monsanto

Today, the genetic ancestry solution serves approximately 120 different applications and data scientists. It handles over 600 million REST requests across approximately a billion nodes and returns results in just tenths of milliseconds.

Performance comparison between Neo4j and Oracle RDBMS with SQL: query times for finding all of a plant’s ancestors at Monsanto, Oracle Exadata versus Neo4j
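The comparison above is about one kind of query: finding every ancestor of a given plant. As a rough sketch of what that query computes, the snippet below walks parent links breadth-first over a small in-memory pedigree. The `ancestors` function and the sample `pedigree` data are hypothetical and only illustrate the shape of the traversal. In a graph database, the same question is typically expressed as a variable-length path pattern, while in SQL it generally requires a recursive query.

```python
from collections import deque


def ancestors(parents, plant):
    """Return every ancestor of `plant`, following parent links breadth-first."""
    seen = set()
    queue = deque(parents.get(plant, ()))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # an ancestor shared by several lines is visited only once
        seen.add(current)
        queue.extend(parents.get(current, ()))
    return seen


# Hypothetical pedigree, mapping each child to its parents.
pedigree = {
    "hybrid-3": ["line-1", "line-2"],
    "line-1": ["founder-a", "founder-b"],
    "line-2": ["founder-b", "founder-c"],
}

if __name__ == "__main__":
    print(sorted(ancestors(pedigree, "hybrid-3")))
```

Each hop in the traversal is a direct lookup of a node's parents, so the cost grows with the number of ancestors reached, not with the total size of the dataset.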

Conclusion


Next week, we’ll look at the final way to use Neo4j alongside Oracle RDBMS: a full data migration from one to the other, illustrated by several customer case studies covering the specifics of each migration.


Download the white paper, How Neo4j Co-exists with Oracle RDBMS, and discover how Neo4j + Oracle helps you uncover rich data connections, reduce time to market and run more efficient queries.

About the Author

Gabe Stanek & Stefan Kolmar, Neo4j Field Engineering Team

Gabe Stanek is VP of the Global Field Engineering organization on the Neo4j team. He has spent his entire career focused on helping people and companies achieve success with technology and realize value from their technical investments. At Neo4j, he enjoys the opportunity to help guide the team committed to this same success.

Stefan Kolmar is the Director of Field Engineering for the EMEA region. He has been in Technical Pre-sales roles for more than 15 years, working for companies such as Tandem, Compaq and Portal Software with specific emphasis on database technologies. More recently, he took on lead Pre-Sales and Consulting roles in the DACH region for TimesTen (in-memory relational database). After TimesTen was acquired by Oracle (2006), Stefan grew to lead the entire European Oracle ISV/OEM Pre-Sales team as the Director of Sales Consulting.


1 Comment

Barry says:

Only the GoldenGate adapter is on GitHub: https://github.com/MonsantoCo/goldengate-kafka-adapter
Where can I find the Kafka connector that pushes data into Neo4j?
Thank you!
