Load CSV in the Real World
LOAD CSV is an agile and useful tool for getting datasets, large and small, into Neo4j. In this live-coding session, Nicole will demonstrate the process of downloading a raw .csv file from the Internet and importing it into Neo4j. This will include cleaning the .csv file, visualizing a data model, and writing the Cypher query that will import the data. This presentation is meant to make Neo4j users aware of common obstacles when dealing with real-world data in .csv format, along with best practices when using LOAD CSV.
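One obstacle that often comes up with real-world .csv files is inconsistent column headers and stray whitespace around values. The following is a minimal Python sketch of a cleaning pass that could be run on a file before LOAD CSV; it is not the cleaning procedure used in the session. The function names `normalize_header` and `clean_csv`, and the choice to drop rows with the wrong number of fields, are assumptions made for illustration.

```python
import csv
import re


def normalize_header(name: str) -> str:
    """Turn a raw column header into a lowercase, underscore-separated name."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def clean_csv(src, dst) -> int:
    """Copy CSV rows from src to dst with normalized headers and trimmed cells.

    src and dst are open text file objects. Returns the number of data rows
    written (the header row is not counted).
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")

    header = next(reader, None)
    if header is None:
        # Empty input: nothing to write.
        return 0
    writer.writerow([normalize_header(h) for h in header])

    written = 0
    for row in reader:
        # Assumption for this sketch: rows whose field count does not match
        # the header are treated as malformed and skipped.
        if len(row) != len(header):
            continue
        writer.writerow([cell.strip() for cell in row])
        written += 1
    return written
```

To try it on the downloaded file, open `Consumer_Complaints.csv` for reading and a new output file for writing (both with `newline=""`), then pass the two file objects to `clean_csv`. Skipping malformed rows keeps the sketch short; for real data, logging them for review may be the better choice.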
Follow along!
We’ve pulled together everything you will need to do this on your own machine. (Note that Nicole uses a machine with 16 GB of RAM. If you have less RAM, and particularly if you are on Windows, check out the links in the Further Reading section of this post.)
Things you need
- Download Consumer_Complaints.csv here. Note that your .csv file might have more rows than the one in the webinar; the data appears to be updated regularly.
- Find the arrows tool used to conceptually model our data here.
Things you optionally need
The Cypher code for reproducing the import is located in LOAD_CSV.cql. A handful of example queries for asking questions of the data are located in example_queries.cql in the GitHub repo.
Further reading:
Speaker: Nicole White, Data Scientist, Neo Technology
Nicole grew up in Kansas City, Missouri, and then spent four years at LSU in Baton Rouge, Louisiana, where she earned a degree in economics with a minor in mathematics. She then went to the University of Texas at Austin, where she earned her master's degree in analytics. It was during this time that she found Neo4j and began exploring its capabilities. When she’s not graphing all the things, she spends her time playing card games and board games.