Neo4j Life Sciences and Healthcare Network
Neo4j Use Cases in Life Sciences and Healthcare
If you work in biology, biochemistry, pharmaceuticals, healthcare or other life sciences, you know that you work with highly connected information. Unfortunately, many scientists still use relational databases and spreadsheets as their daily tools.
Here we want to present you with an alternative. Managing, storing and querying connected information is natural to a graph database like Neo4j. Learn how your research and practitioner colleagues have used Neo4j to draw new insights or simply to be more efficient in their daily work.
It started back in 2012 with a workshop at the University of Ghent that brought together people from the field and graph database experts.
Following that fruitful exchange we started the Neo4j-Biotech Google Group to encourage sharing and collaboration on that topic. If you are not yet a member, please join today.
Now we want to take it to the next level by providing you with a platform to present your projects and papers both here and on our blog, and giving you the opportunity to connect with other Neo4j users in your field.
If you are taking your first steps towards using a graph database, we offer support to jumpstart your efforts.
Graph Databases in Life and Health Sciences Workshop: Berlin, 21 June 2017
We are very pleased to announce our second workshop for researchers interested in sharing and learning about Graph Databases in Life and Health Sciences.
We are inviting researchers, practitioners and developers to present and attend.
More details, as well as registration information, can be found here.
In our past workshop in Ghent, we had topics covering:
- Neo4j in metaproteomics
- Graph databases in cancer research
- Project collaboration networks and recommendations
- Detailed studies of citation graphs
- Connecting protein databases in a large graph model
- "Reactome" database of human protein interaction pathways
Life Sciences and Healthcare Accelerator Program
The Neo4j Life Sciences and Healthcare Accelerator Program is designed to help researchers and practitioners in life sciences and healthcare-related sciences make sense of their data using Neo4j. Whether you are analyzing genome data, combining protein databases, investigating drug interactions or supporting practitioners with research or clinical information processing, we want to help you find insights in connected (meta-)data.
If you are accepted into the program, you will receive 1-on-1 assistance from Neo4j engineers to help you with data modeling, data import, writing Cypher queries or anything else we can do to make you successful with Neo4j.
To get started just tell us about your project and how you think we might be able to help you.
Featured Projects
The Hetnet Awakens - Understanding Disease Through Data Integration and Open Science
Daniel Himmelstein
Daniel Himmelstein’s Thesis Seminar for his PhD in Biological & Medical Informatics at UCSF.
Here are the slides and an online adaptation of the PhD Exhibit. Daniel was also interviewed on our Graphistania Podcast and created a fun Graph Gist as live documentation.
Proteomics and Graph Databases, the symbiosis of associations
Alejandro Brenes Murillo
The proteome is the entire set of proteins that are produced or modified by an organism. It is an element that varies with time, stress, environmental conditions or distinct requirements that a cell might have. Join this talk by Alejandro Brenes Murillo to see how graph databases can be useful for proteome analysis.
At the Lamond Lab at the University of Dundee, scientists are interested in modelling and understanding protein behaviour under different conditions and dimensions of analysis.
In order to achieve this goal, they use graph databases to integrate and model the proteomics data, and study its effect on a specific proteome. The dimensions of analysis are multiple, yet be it turnover, localisation, cell cycle, protein complexes or biological response to stimuli, discovering the behaviour of proteins is key to understanding how organisms function, and how disease affects them.
Big Data in Genomics: How Neo4j helps to develop new drugs
Martin Preusse
Biomedical research generates vast amounts of data. New experimental technologies like DNA sequencing, metabolomics and proteomics drive the fast growth of available information and lead to a better understanding of the molecular organization of life.
But with big data comes a big question: How do we transform unstructured data into actionable knowledge? In the case of biomedical research, the key problem is to integrate the large pile of highly heterogeneous data and use it for personalized therapies and drug development. Graph databases are an ideal way to represent biomedical knowledge and offer the necessary flexibility to keep up with scientific progress. A well-designed data model and Cypher queries can deliver in seconds what previously took days of manual analysis.
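To make the idea of a graph data model concrete, here is a minimal sketch in plain Python. It is not taken from the talk and does not use Neo4j or Cypher; the node names, relationship types and the `build_graph` and `find_paths` helpers are all hypothetical and chosen only for illustration. It shows how heterogeneous facts (a drug targeting a protein, a gene encoding that protein, a gene associated with a disease) become edges in one graph, and how a question that spans several data sources becomes a short path search.

```python
from collections import defaultdict

# Hypothetical toy data: (source, relationship, target) triples that might
# come from three different sources (drug, protein and disease databases).
EDGES = [
    ("DrugA", "TARGETS", "ProteinX"),
    ("GeneX", "ENCODES", "ProteinX"),
    ("GeneX", "ASSOCIATED_WITH", "DiseaseY"),
    ("DrugB", "TARGETS", "ProteinZ"),
]


def build_graph(edges):
    """Index edges in both directions so paths can be walked either way."""
    graph = defaultdict(list)
    for source, rel, target in edges:
        graph[source].append((rel, target))
        graph[target].append((rel, source))
    return graph


def find_paths(graph, start, goal, max_hops=3):
    """Return all simple paths from start to goal with at most max_hops edges."""
    paths = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # hop budget used up, do not expand further
        for _rel, neighbor in graph[node]:
            if neighbor not in path:  # keep paths simple (no revisits)
                stack.append((neighbor, path + [neighbor]))
    return paths


graph = build_graph(EDGES)
paths_a = find_paths(graph, "DrugA", "DiseaseY")
paths_b = find_paths(graph, "DrugB", "DiseaseY")
print(paths_a)
print(paths_b)
```

In a graph database such as Neo4j, this kind of path search would be expressed as a declarative Cypher pattern rather than hand-written traversal code; the sketch only illustrates why connected questions map naturally onto a graph.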
Building a Repository of Biomedical Ontologies with Neo4j
Simon Jupp
In this lightning talk from GraphConnect Europe 2016, Simon Jupp of the European Bioinformatics Institute discusses the application they built to track ontologies. He also discusses why they chose Neo4j over various RDF and semantic web technologies, and provides some example queries.
Data Management in Systems Biology & Medicine
Irina Balaur, EISBM
An Integrative Framework for Data Management in Systems Biology and Medicine: Strategies for personalised medicine involve integration of large amounts of biomedical data, specific to multiple spatial and temporal scales (including molecular data and patient clinical data). We have been developing a graph-database approach implemented in Neo4j to facilitate management (integration, exploration, visualisation, interpretation) of diverse types of biological and biomedical data.
Graphs Are Feeding The World
Tim Williamson, Data Scientist, Monsanto
Presentation at GraphConnect SF 2015.
Graph Databases in Life Sciences: Bringing Biology Back to Its Nature
Thilo Muth
Today’s life science research is about genes, proteins, metabolites, relationships, interactions and biological networks. Data storage and mining hold huge potential for biologists; however, classical storage approaches such as SQL databases and Excel spreadsheets bring various issues, such as scalability and performance problems as data grows, as well as complexity and accessibility. Finally, most of the storage models are far from biological reality: Graph databases and Neo4j meet the need in life sciences for an appropriate data and database model.
Open Tree Of Life
The tree of life links all biodiversity through a shared evolutionary history. This project will produce the first online, comprehensive first-draft tree of all 1.8 million named species, accessible to both the public and scientific communities.
Assembly of the tree will incorporate previously-published results, with strong collaborations between computational and empirical biologists to develop, test and improve methods of data synthesis.
This initial tree of life will not be static; instead, we will develop tools for scientists to update and revise the tree as new data come in. Early release of the tree and tools will motivate data sharing and facilitate ongoing synthesis of knowledge.
Biological research of all kinds, including studies of ecological health, environmental change, and human disease, increasingly depends on knowing how species are related to each other.
Yet there is no single resource that unites knowledge of the tree of life. Instead, only small parts of the tree are individually available, generally as printed figures in journal articles.
This project will provide the global community of scientists who study the tree of life with a means to share and combine their results, and will enable large-scale studies of Earth’s biodiversity. It will also create a resource where students, educators and citizens can go to explore and learn about life’s evolutionary history.
Read more on the OpenTreeOfLife Blog
Publications
| Title | Year | Authors | Affiliation |
|---|---|---|---|
| The Proteins API: accessing key integrated protein and genome information | 2017 | A. Nightingale, R. Antunes, E. Alpi, B. Bursteinas, L. Gonzales, W. Liu, J. Luo, G. Qi, E. Turner, and M. Martin | EMBL-EBI, Wellcome Genome Campus, UK |
|  | 2016 | R. Bruskiewich, K. Huellas-Bruskiewicz, F. Ahmed, R. Kaliyaperumal, M. Thompson, E. Schultes, K. M. Hettne, A. I. Su, and B. M. Good | Department of Human Genetics, Leiden University Medical Center, The Netherlands |
| Recon2Neo4j: Applying graph database technologies for managing comprehensive genome-scale networks | 2016 | I. Balaur, A. Mazein, M. Saqi, A. Lysenko, C. J. Rawlings, and C. Auffray | European Institute for Systems Biology and Medicine (EISBM), France |
| STON: exploring biological pathways using the SBGN standard and graph databases | 2016 | V. Touré, A. Mazein, D. Waltemath, I. Balaur, M. Saqi, R. Henkel, J. Pellet, and C. Auffray | European Institute for Systems Biology and Medicine (EISBM), France |
|  | 2016 | M. Preusse, F. J. Theis, and N. S. Mueller | Institute of Computational Biology, Helmholtz Zentrum München, Germany |
| An Integrated Data Driven Approach to Drug Repositioning Using Gene-Disease Associations | 2016 | J. Mullen, S. J. Cockell, P. Woollard, and A. Wipat | Newcastle University, United Kingdom |
| HitWalker2: visual analytics for precision medicine and beyond | 2016 | D. Bottomly, S. K. McWeeney, and B. Wilmot | Knight Cancer Institute, Oregon Health and Science University, USA |
|  | 2016 | X. Dai, J. Li, T. Liu, and P. X. Zhao | Plant Biology Division, The Samuel Roberts Noble Foundation, USA |
| Representing and querying disease networks using graph databases | 2016 | A. Lysenko, I. A. Roznovăţ, M. Saqi, A. Mazein, C. J. Rawlings, and C. Auffray | European Institute for Systems Biology and Medicine (EISBM), France |
| PanTools: representation, storage and exploration of pan-genomic data | 2016 | S. Sheikhizadeh, M. E. Schranz, M. Akdel, D. de Ridder, and S. Smit | Bioinformatics Group, Wageningen University, The Netherlands |
|  | 2016 | I. Balaur, M. Saqi, A. Barat, A. Lysenko, A. Mazein, C. J. Rawlings, H. J. Ruskin, and C. Auffray | European Institute for Systems Biology and Medicine (EISBM), France |
|  | 2015 | G. Summer, T. Kelder, K. Ono, M. Radonjic, S. Heymans, and B. Demchak | Center for Heart Failure Research (CARIM), University Hospital Maastricht, The Netherlands |
| Towards Implementing Semantic Literature-Based Discovery with a Graph Database | 2015 | D. Hristovski, A. Kastrin, D. Dinevski, and T. C. Rindflesch | Faculty of Medicine, University of Ljubljana, Slovenia |
|  | 2015 | D. Hoksza and J. Jelinek | Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic |
|  | 2015 | T. Muth, A. Behne, R. Heyer, F. Kohrs, D. Benndorf, M. Hoffmann, M. Lehtevä, U. Reichl, L. Martens, and E. Rapp | Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany |
| SimiRa: A tool to identify coregulation between microRNAs and RNA-binding proteins | 2015 | M. Preusse, C. Marr, S. Saunders, D. Maticzka, H. Lickert, R. Backofen, and F. Theis | Helmholtz Zentrum München, Institute of Computational Biology, Germany |
| Constructing a Graph Database for Semantic Literature-Based Discovery | 2015 | D. Hristovski, A. Kastrin, D. Dinevski, and T. C. Rindflesch | Faculty of Medicine, University of Ljubljana, Slovenia |
| A systems biology approach toward understanding seed composition in soybean | 2015 | L. Li, M. Hur, J. Y. Lee, W. Zhou, Z. Song, N. Ransom, C. Y. Demirkale, D. Nettleton, M. Westgate, Z. Arendsee, V. Iyer, J. Shanks, B. Nikolau, and E. S. Wurtele | Department of Genetics, Development and Cell Biology, Iowa State University, USA |
| Combining computational models, semantic annotations and simulation experiments in a graph database | 2015 | R. Henkel, O. Wolkenhauer, and D. Waltemath | Department of Computer Science, University of Rostock, Germany |
| An alternative database approach for management of SNOMED CT and improved patient data queries | 2015 | W. S. Campbell, J. Pedersen, J. C. McClay, P. Rao, D. Bastola, and J. R. Campbell | University of Nebraska Medical Center, Department of Pathology and Microbiology, US |
|  | 2014 | D. Johnson, A. J. Connor, S. McKeever, Z. Wang, T. S. Deisboeck, T. Quaiser, and E. Shochat | Department of Computing, Imperial College London, London, UK |
| Global biotic interactions: An open infrastructure to share and analyze species-interaction datasets | 2014 | J. H. Poelen, J. D. Simons, and C. J. Mungall | Center for Coastal Studies Natural Resource Center, USA |
|  | 2013 | Christian Theil Have and Lars Juhl Jensen | Department of Metabolic Genetics, University of Copenhagen, Denmark |