Neo4j Life Sciences and Healthcare Network

Neo4j Use Cases in Life Sciences and Healthcare

If you work in biology, biochemistry, pharmaceuticals, healthcare or other life sciences, you know that you work with highly connected information. Unfortunately, many scientists still use relational databases and spreadsheets as their daily tools.

Here we want to present you with an alternative. Managing, storing and querying connected information is natural to a graph database like Neo4j. Learn how your research and practitioner colleagues have used Neo4j to draw new insights or simply to be more efficient in their daily work.

It started a while ago, in 2012, with a workshop at the University of Ghent that brought together people from the field with graph database experts.

Following that fruitful exchange we started the Neo4j-Biotech Google Group to encourage sharing and collaboration on that topic. If you are not yet a member, please join today.

Now we want to take it to the next level by providing you with a platform to present your projects and papers both here and on our blog, and giving you the opportunity to connect with other Neo4j users in your field.

If you are taking your first steps towards using a graph database, we offer support to jumpstart your efforts.

Why use a Graph Database

Graph Databases in Life and Health Sciences Workshop: Berlin, 21 June 2017

We are very pleased to announce our second workshop for researchers interested in sharing and learning about Graph Databases in Life and Health Sciences.

We are inviting researchers, practitioners and developers to present and attend.

More details, as well as registration information, can be found here.

Our past workshop in Ghent covered topics including:

Neo4j in metaproteomics
Graph databases in cancer research
Project collaboration networks and recommendations
Detailed studies of citation graphs
Connecting protein databases in a large graph model
"Reactome" database of human protein interaction pathways

Life Sciences and Healthcare Accelerator Program

The Neo4j Life Sciences and Healthcare Accelerator Program is designed to help researchers and practitioners in life sciences and healthcare-related sciences make sense of their data using Neo4j. Whether you are analyzing genome data, combining protein databases, investigating drug interactions or supporting practitioners with research or clinical information processing, we want to help you find insights in connected (meta-)data.

If you are accepted into the program, you will receive 1-on-1 assistance from Neo4j engineers to help you with data modeling, data import, writing Cypher queries or anything else we can do to make you successful with Neo4j.

To get started just tell us about your project and how you think we might be able to help you.

Apply Here

Featured Projects

The Hetnet Awakens - Understanding Disease Through Data Integration and Open Science

Daniel Himmelstein

Daniel Himmelstein’s Thesis Seminar for his PhD in Biological & Medical Informatics at UCSF.

Here are the slides and an online adaptation of the PhD Exhibit. Daniel was also interviewed on our Graphistania Podcast and created a fun Graph Gist as live documentation.

Proteomics and Graph Databases, the symbiosis of associations

Alejandro Brenes Murillo

The proteome is the entire set of proteins that are produced or modified by an organism. It is an element that varies with time, stress, environmental conditions or distinct requirements that a cell might have. Join this talk by Alejandro Brenes Murillo to see how graph databases can be useful for proteome analysis.

At the Lamond Lab at the University of Dundee, scientists are interested in modelling and understanding protein behaviour under different conditions and dimensions of analysis.

In order to achieve this goal, they use graph databases to integrate and model the proteomics data, and study its effect on a specific proteome. The dimensions of analysis are multiple, yet be it turnover, localisation, cell cycle, protein complexes or biological response to stimuli, discovering the behaviour of proteins is key to understanding how organisms function, and how disease affects them.

Big Data in Genomics: How Neo4j helps to develop new drugs

Martin Preusse

Biomedical research generates vast amounts of data. New experimental technologies like DNA sequencing, metabolomics and proteomics drive the fast growth of available information and lead to a better understanding of the molecular organization of life.

But with big data comes a big question: How do we transform unstructured data into actionable knowledge? In the case of biomedical research, the key problem is to integrate the large pile of highly heterogeneous data and use it for personalized therapies and drug development. Graph databases are an ideal way to represent biomedical knowledge and offer the necessary flexibility to keep up with scientific progress. A well-designed data model and Cypher queries can deliver in seconds what previously took days of manual analysis.
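To make the idea of querying connected biomedical data more concrete, here is a minimal sketch in plain Python. It is not taken from the talk and does not use Neo4j; the entity names (DrugA, ProteinX, Disease1) and relationship types (TARGETS, ASSOCIATED_WITH, INTERACTS_WITH) are hypothetical. It only illustrates the kind of question, "which diseases are linked to a drug through its protein targets?", that a graph database would typically express as a pattern over nodes and relationships.

```python
from collections import defaultdict

# Hypothetical toy data: (source, relationship type, target) triples.
EDGES = [
    ("DrugA", "TARGETS", "ProteinX"),
    ("DrugB", "TARGETS", "ProteinY"),
    ("ProteinX", "ASSOCIATED_WITH", "Disease1"),
    ("ProteinY", "ASSOCIATED_WITH", "Disease2"),
    ("ProteinX", "INTERACTS_WITH", "ProteinY"),
]


def build_graph(edges):
    """Index outgoing relationships by source node."""
    graph = defaultdict(list)
    for source, rel_type, target in edges:
        graph[source].append((rel_type, target))
    return graph


def follow(graph, start_nodes, rel_type):
    """Return the nodes reachable from start_nodes over one hop of rel_type."""
    return {
        target
        for node in start_nodes
        for (rel, target) in graph.get(node, [])
        if rel == rel_type
    }


def diseases_for_drug(graph, drug):
    """Two-hop traversal: drug -> targeted proteins -> associated diseases."""
    proteins = follow(graph, {drug}, "TARGETS")
    return follow(graph, proteins, "ASSOCIATED_WITH")


graph = build_graph(EDGES)
print(sorted(diseases_for_drug(graph, "DrugA")))
```

Each additional hop is one more call to `follow`, whereas in a relational schema it would usually mean one more join. That difference is the practical argument for a graph model made above.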

Building a Repository of Biomedical Ontologies with Neo4j

Simon Jupp

In this lightning talk from GraphConnect Europe 2016, Simon Jupp of the European Bioinformatics Institute discusses the application they built to track ontologies. He also discusses why they chose Neo4j over various RDF and semantic web technologies, and provides some example queries.

Data Management in Systems Biology & Medicine

Irina Balaur, EISBM

An Integrative Framework for Data Management in Systems Biology and Medicine: Strategies for personalised medicine involve integration of large amounts of biomedical data, specific to multiple spatial and temporal scales (including molecular data and patient clinical data). We have been developing a graph-database approach implemented in Neo4j to facilitate management (integration, exploration, visualisation, interpretation) of diverse types of biological and biomedical data.

Graphs Are Feeding The World

Tim Williamson, Data Scientist, Monsanto

Presentation at GraphConnect SF 2015.

Graph Databases in Life Sciences: Bringing Biology Back to Its Nature

Thilo Muth

Today’s life science research is about genes, proteins, metabolites, relationships, interactions and biological networks. Storing and mining data holds huge potential for biologists. However, classical storage formats such as SQL and Excel involve various issues, such as scalability and performance problems with data growth, complexity and accessibility. Finally, most storage models are far from biological reality: graph databases and Neo4j meet the need in life sciences for an appropriate data and database model.

Open Tree Of Life

The tree of life links all biodiversity through a shared evolutionary history. This project will produce the first online, comprehensive first-draft tree of all 1.8 million named species, accessible to both the public and scientific communities.

Assembly of the tree will incorporate previously-published results, with strong collaborations between computational and empirical biologists to develop, test and improve methods of data synthesis.

This initial tree of life will not be static; instead, we will develop tools for scientists to update and revise the tree as new data come in. Early release of the tree and tools will motivate data sharing and facilitate ongoing synthesis of knowledge.

Biological research of all kinds, including studies of ecological health, environmental change, and human disease, increasingly depends on knowing how species are related to each other.

Yet there is no single resource that unites knowledge of the tree of life. Instead, only small parts of the tree are individually available, generally as printed figures in journal articles.

This project will provide the global community of scientists who study the tree of life with a means to share and combine their results, and will enable large-scale studies of Earth’s biodiversity. It will also create a resource where students, educators and citizens can go to explore and learn about life’s evolutionary history.

Publications

Title	Year	Authors	Affiliation
The Proteins API: accessing key integrated protein and genome information	2017	A. Nightingale, R. Antunes, E. Alpi, B. Bursteinas, L. Gonzales, W. Liu, J. Luo, G. Qi, E. Turner, and M. Martin	EMBL-EBI, Wellcome Genome Campus, UK
Knowledge.Bio: A Web application for exploring, building and sharing webs of biomedical relationships mined from PubMed	2016	R. Bruskiewich, K. Huellas-Bruskiewicz, F. Ahmed, R. Kaliyaperumal, M. Thompson, E. Schultes, K. M. Hettne, A. I. Su, and B. M. Good	Department of Human Genetics, Leiden University Medical Center, The Netherlands
Recon2Neo4j: Applying graph database technologies for managing comprehensive genome-scale networks	2016	I. Balaur, A. Mazein, M. Saqi, A. Lysenko, C. J. Rawlings, and C. Auffray	European Institute for Systems Biology and Medicine (EISBM), France
STON: exploring biological pathways using the SBGN standard and graph databases	2016	V. Touré, A. Mazein, D. Waltemath, I. Balaur, M. Saqi, R. Henkel, J. Pellet, and C. Auffray	European Institute for Systems Biology and Medicine (EISBM), France
miTALOS v2: Analyzing Tissue Specific microRNA Function	2016	M. Preusse, F. J. Theis, and N. S. Mueller	Institute of Computational Biology, Helmholtz Zentrum München, Germany
An Integrated Data Driven Approach to Drug Repositioning Using Gene-Disease Associations	2016	J. Mullen, S. J. Cockell, P. Woollard, and A. Wipat	Newcastle University, United Kingdom
HitWalker2: visual analytics for precision medicine and beyond	2016	D. Bottomly, S. K. McWeeney, and B. Wilmot	Knight Cancer Institute, Oregon Health and Science University, USA
HRGRN: A Graph Search-Empowered Integrative Database of Arabidopsis Signaling Transduction, Metabolism and Gene Regulation Networks	2016	X. Dai, J. Li, T. Liu, and P. X. Zhao	Plant Biology Division, The Samuel Roberts Noble Foundation, USA
Representing and querying disease networks using graph databases	2016	A. Lysenko, I. A. Roznovăţ, M. Saqi, A. Mazein, C. J. Rawlings, and C. Auffray	European Institute for Systems Biology and Medicine (EISBM), France
PanTools: representation, storage and exploration of pan-genomic data	2016	S. Sheikhizadeh, M. E. Schranz, M. Akdel, D. de Ridder, and S. Smit	Bioinformatics Group, Wageningen University, The Netherlands
EpiGeNet: A Graph Database of Interdependencies Between Genetic and Epigenetic Events in Colorectal Cancer	2016	I. Balaur, M. Saqi, A. Barat, A. Lysenko, A. Mazein, C. J. Rawlings, H. J. Ruskin, and C. Auffray	European Institute for Systems Biology and Medicine (EISBM), France
cyNeo4j: connecting Neo4j and Cytoscape	2015	G. Summer, T. Kelder, K. Ono, M. Radonjic, S. Heymans, and B. Demchak	Center for Heart Failure Research (CARIM), University Hospital Maastricht, The Netherlands
Towards Implementing Semantic Literature-Based Discovery with a Graph Database	2015	D. Hristovski, A. Kastrin, D. Dinevski, and T. C. Rindflesch	Faculty of Medicine, University of Ljubljana, Slovenia
Using Neo4j for Mining Protein Graphs: A Case Study	2015	D. Hoksza and J. Jelinek	Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic
The MetaProteomeAnalyzer: A Powerful Open-Source Software Suite for Metaproteomics Data Analysis and Interpretation	2015	T. Muth, A. Behne, R. Heyer, F. Kohrs, D. Benndorf, M. Hoffmann, M. Lehtevä, U. Reichl, L. Martens, and E. Rapp	Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany
SimiRa: A tool to identify coregulation between microRNAs and RNA-binding proteins	2015	M. Preusse, C. Marr, S. Saunders, D. Maticzka, H. Lickert, R. Backofen, and F. Theis	Helmholtz Zentrum München, Institute of Computational Biology, Germany
Constructing a Graph Database for Semantic Literature-Based Discovery	2015	D. Hristovski, A. Kastrin, D. Dinevski, and T. C. Rindflesch	Faculty of Medicine, University of Ljubljana, Slovenia
A systems biology approach toward understanding seed composition in soybean	2015	L. Li, M. Hur, J. Y. Lee, W. Zhou, Z. Song, N. Ransom, C. Y. Demirkale, D. Nettleton, M. Westgate, Z. Arendsee, V. Iyer, J. Shanks, B. Nikolau, and E. S. Wurtele	Department of Genetics, Development and Cell Biology, Iowa State University, USA
Combining computational models, semantic annotations and simulation experiments in a graph database	2015	R. Henkel, O. Wolkenhauer, and D. Waltemath	Department of Computer Science, University of Rostock, Germany
An alternative database approach for management of SNOMED CT and improved patient data queries	2015	W. S. Campbell, J. Pedersen, J. C. McClay, P. Rao, D. Bastola, and J. R. Campbell	University of Nebraska Medical Center, Department of Pathology and Microbiology, US
Semantically linking in silico cancer models	2014	D. Johnson, A. J. Connor, S. McKeever, Z. Wang, T. S. Deisboeck, T. Quaiser, and E. Shochat	Department of Computing, Imperial College London, London, UK
Global biotic interactions: An open infrastructure to share and analyze species-interaction datasets	2014	J. H. Poelen, J. D. Simons, and C. J. Mungall	Center for Coastal Studies Natural Resource Center, USA
Are graph databases ready for bioinformatics?	2013	Christian Theil Have and Lars Juhl Jensen	Department of Metabolic Genetics, University of Copenhagen, Denmark
