Neo4j Online Meetup #27: Analysing Debian packages with Neo4j



We present our work towards representing Debian’s packages, including their history and releases, as well as other components of the Debian environment, as a graph database in Neo4j. The Ultimate Debian Database (UDD) [1] collects a variety of data around Debian and Ubuntu: packages and sources, bugs, and the history of uploads, to name just a few. The database schema [2] reveals a highly denormalized relational database [2,3]. In this ongoing work we extract some of the data from UDD and represent it as a graph database. The presentation will give a short introduction to the life cycle and structure of Debian packages, followed by the graph database schema (nodes and relationships). After going through some of the queries used on the UDD web pages, we show how they can be translated to Cypher. We close with an outlook on our future plans and open problems.

[1] https://wiki.debian.org/UltimateDebianDatabase/
[2] https://udd.debian.org/schema/
[3] https://udd.debian.org/schema/udd.png