Inselspital Builds a Clinical Knowledge Graph for Better Data Quality and Decision Support
A data team at University Hospital Bern uses Neo4j to map complex clinical relationships, uncover gaps in its ontology, and lay the foundation for safer, more standardized care across 1 million patient records.
1 million: Patient records spanning 10+ years of clinical data
10+: Documents per patient post-Epic, up from 3–5, with hundreds for complex cases
1 million+: Nodes in the clinical knowledge graph

Modern hospitals do not have a shortage of data. But they often have problems retrieving captured data in a reproducible way.
Every patient encounter creates a growing web of diagnoses, medications, procedures, notes, lab results, device readings, and reports. As documentation grows, it becomes harder to keep data consistent, connect it across systems, and make sure clinicians can rely on it. That affects reporting, research, quality programs, and clinical decision support.
At Inselspital, University Hospital Bern, a data team has spent more than 15 years building an end-to-end approach that runs from data capture through coding, validation, reporting, research, and clinical support. With Neo4j, that team is using a clinical knowledge graph to make those relationships easier to understand, identify missing connections, and improve data quality from the start.
Starting With the Data
For Olga Endrich and Karen Triep, both physicians by training who now work extensively with health data, the work has long centered on one issue: data quality.
“We were really keen on getting the data right,” says Triep, Group Leader, Medical Directorate, Inselspital. “Getting the data right means supporting the clinics in documentation and developing algorithms which support them in facilitating good structured documentation and setting up reports in order to create transparency.”
That focus shapes a broad range of work at the hospital. Their team supports reimbursement and coding, builds data products for clinical operations, and develops research pipelines that use large volumes of health data to predict outcomes and identify patients at risk. Across all of it, the same issue keeps surfacing.
“It always comes back to data quality,” Triep says.
For Inselspital, that issue matters far beyond reporting. Clean, standardized data helps hospitals monitor quality, support patient safety, improve workflows, and build a stronger base for clinical decision support and research.
A Multi-Drug Resistant Organism Outbreak Sparked the Idea
The move toward a graph database started with an urgent problem.
In 2018, Inselspital faced an outbreak of a multi-drug resistant organism. The hospital needed to trace transmission paths and understand how patients, devices, and care providers might be connected. An interdisciplinary team of infection prevention specialists, data scientists, and clinical data experts began exploring graph-based visualization as part of that work.
“It was really the first time I had contact with the graph representation of the real situation of transmission paths,” says Endrich, Head of Product Line Medicine of the Insel Data Science Center.
That project changed how the team looked at connected clinical data. Once they had seen a transmission pathway modeled through relationships, the knowledge graph approach kept coming back in later discussions about ontologies, terminology mapping, and biomedical data.
As the team moved deeper into complex ontologies and biomedical data, Endrich says the limits of a SQL-only approach became clearer. A graph database offered a more natural way to represent connected clinical information.
That early outbreak work opened the door to a much broader use case. If a graph database could help make transmission chains visible, it could also help make the hospital’s wider data landscape easier to understand and improve.
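To make the idea of tracing a transmission chain concrete, here is a minimal sketch in Python. It is not the team's actual method or data model: the contact records, node names, and the choice of a breadth-first search are all assumptions made for illustration. Patients are linked to the devices, rooms, or care providers they shared, and the search returns the shortest chain connecting two patients.

```python
from collections import deque

# Hypothetical contact records: (patient, shared resource).
# A resource can be a device, a room, or a care provider.
CONTACTS = [
    ("patient_A", "ventilator_1"),
    ("patient_B", "ventilator_1"),
    ("patient_B", "nurse_X"),
    ("patient_C", "nurse_X"),
    ("patient_D", "room_12"),
]


def build_graph(contacts):
    """Build an undirected adjacency map from (patient, resource) pairs."""
    graph = {}
    for patient, resource in contacts:
        graph.setdefault(patient, set()).add(resource)
        graph.setdefault(resource, set()).add(patient)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of nodes, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # sorted() keeps the traversal order deterministic
        for neighbor in sorted(graph[node]):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(CONTACTS)
print(shortest_path(graph, "patient_A", "patient_C"))
print(shortest_path(graph, "patient_A", "patient_D"))
```

In this toy data, patient_A and patient_C never shared a resource directly, yet a chain exists through patient_B. patient_D has no connection to the others. Indirect links of this kind are hard to spot in tabular data and easy to follow in a graph.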
Clinical Data Keeps Getting More Complex
Inselspital’s data environment is large, varied, and growing quickly.
The hospital now has about 1 million patient records in its database, with some administrative cases going back to 2002. Those records span structured and unstructured data across diagnoses, medications, discharge letters, protocols, lab results, images, pathology, histology, vitals, anesthesia records, and more.
Documentation volumes have also climbed sharply. Endrich says that before the hospital implemented the Epic electronic medical record (EMR) system two years ago, a patient case might typically involve three to five documents. Since Epic, that number has increased significantly. A more typical case may now include 10 documents, while complex hospitalizations can generate far more.
“For patients who were hospitalized for many months and having ICU treatment for many weeks, there are hundreds of documents,” she says.
At that scale, manual review is not realistic. The challenge is finding the right concepts, connecting them correctly, aligning legacy free-text data to current information systems, and making sure important details are not buried in a growing number of documents and systems.
To help with that work, the team introduced a new coding workplace four years ago that combines NLP with rule-based text mining. Endrich says it helps scan documents, index them, find important text, and translate relevant information into coding systems such as ICD-10 and SNOMED CT. That supports more complete coding for research cohorts or registries, and reduces the risk that important information is missed deep inside a patient record.
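The rule-based part of that idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the hospital's coding workplace: the rules, document names, and sample text are hypothetical, and the patterns are deliberately naive. A real system would also need to handle negation, context, and synonyms, and would leave the final coding decision to a professional.

```python
import re

# Hypothetical rules: regex pattern -> (ICD-10 code, short label).
# Codes are illustrative suggestions only, not validated coding.
RULES = [
    (re.compile(r"\b(arterial )?hypertension\b", re.IGNORECASE),
     ("I10", "Essential (primary) hypertension")),
    (re.compile(r"\btype 2 diabetes\b", re.IGNORECASE),
     ("E11", "Type 2 diabetes mellitus")),
    (re.compile(r"\bpneumonia\b", re.IGNORECASE),
     ("J18.9", "Pneumonia, unspecified")),
]


def suggest_codes(documents):
    """Return {code: [(doc_id, matched_text), ...]} for every rule hit."""
    hits = {}
    for doc_id, text in documents.items():
        for pattern, (code, _label) in RULES:
            match = pattern.search(text)
            if match:
                hits.setdefault(code, []).append((doc_id, match.group(0)))
    return hits


# Hypothetical documents belonging to one patient case.
DOCUMENTS = {
    "discharge_letter_1": "Known arterial hypertension, treated with enalapril.",
    "icu_note_17": "Patient developed pneumonia on day 12.",
    "lab_report_3": "HbA1c elevated.",
}

suggestions = suggest_codes(DOCUMENTS)
for code, evidence in sorted(suggestions.items()):
    print(code, evidence)
```

Each suggestion keeps a pointer to the document and the matched text, so a coder can see where in the record the evidence came from instead of rereading every document.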
Why a Graph Database Made Sense
Inselspital adopted Neo4j to find a better way to model relationships across complex clinical data.
It first modeled its research cohort using the Swiss Personalized Health Network (SPHN) schema and OMOP CDM, a common data model that helps organize different domains with standard terminologies. From there, the team used Neo4j to build what Endrich describes as a “meta graph,” a map of which concepts appear in the cohort and where the underlying data lives in the clinical data warehouse.
That matters because the same concept can show up in many places. Blood pressure alone, Endrich notes, appears in more than 1,000 different objects in the database. By mapping those records to a standard concept, the team can search for blood pressure and quickly see all the places where related data is stored.
“It helps us orient ourselves in this database and use it as a knowledge storage,” she says.
The clinical knowledge graph is helping them understand how the data is structured, how concepts relate to one another, and where those relationships need work.
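The concept lookup described above can be sketched as follows. This is a toy stand-in, assuming hypothetical table and column names and a made-up `MetaGraph` class. In a graph database the same idea would be expressed as concept nodes connected to data-object nodes, and the dictionaries here only imitate that structure.

```python
from collections import defaultdict


class MetaGraph:
    """Toy meta graph: links standard concepts to the warehouse objects that hold them."""

    def __init__(self):
        self._objects_by_concept = defaultdict(set)
        self._concept_by_object = {}

    def map_object(self, obj, concept):
        """Record that a warehouse object (table.column) stores a given concept."""
        self._objects_by_concept[concept].add(obj)
        self._concept_by_object[obj] = concept

    def locations(self, concept):
        """All warehouse objects mapped to the concept, in sorted order."""
        return sorted(self._objects_by_concept.get(concept, ()))

    def unmapped(self, objects):
        """Warehouse objects that have no concept yet, i.e. gaps in the mapping."""
        return [obj for obj in objects if obj not in self._concept_by_object]


# Hypothetical warehouse objects; none of these names come from the hospital.
meta = MetaGraph()
meta.map_object("icu_vitals.sys_bp", "blood_pressure")
meta.map_object("anesthesia_record.art_pressure", "blood_pressure")
meta.map_object("ward_chart.bp_manual", "blood_pressure")
meta.map_object("lab_results.hba1c", "hba1c")

print(meta.locations("blood_pressure"))
print(meta.unmapped(["icu_vitals.sys_bp", "legacy_notes.free_text_vitals"]))
```

One search for the standard concept returns every place the data lives, and the same structure answers the reverse question of which objects still lack a mapping.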
The Hard Part Was Modeling the Domain Well
Inselspital’s move to a graph database was not instant or easy.
The team had to introduce a new way of thinking in an environment that had long been built around SQL. An early attempt to represent an ontology in Neo4j did not work as planned, but Endrich is clear about the cause: the graph exposed problems in how the data had been categorized.
“It was not a problem with Neo4j,” she says. “It was more the problem of the design of the ontology itself.”
The first version duplicated nodes and did not produce a useful representation of the dataset. Once the team rebuilt the model from scratch with a stronger ontology design, the value of the knowledge graph became much clearer. The graph database helped expose weaknesses in how the data had been structured and named in the first place.
Seeing Gaps More Clearly
For Triep, the value of the knowledge graph became most obvious when the team used it to examine what was missing.
“When things work out and you’ve got a clean mapping, you don’t really need a graph representation in order to understand that it works,” she says. “But if you want to analyze gaps and understand why things don’t work, why relations are missing, then it is absolutely valuable to very quickly find the core problems.”
In a hospital setting, the challenge is often missing relationships, inconsistent terminology, local variations, and gaps between clinical reality and the ontology meant to describe it.
Triep says the transparency created by the team’s tools helps them see “where we have problems with the ontology we develop.” That makes it easier to improve the model, improve the input data, and improve the processes that depend on both.
In practice, Neo4j gives the team a way to make structural issues visible and easy to explain across the organization.
Early Use Cases: Procedure Mapping and Medication Mapping
The team has already applied this knowledge graph approach in two especially complex areas: procedure mapping and medication mapping.
The procedure project tackles a challenge seen across healthcare systems. Diagnoses align more easily around standards such as ICD-10, but procedures are much more fragmented. Neighboring countries can use different procedure catalogs with different hierarchies, different complexity, and different levels of granularity, which makes direct comparison difficult.
To address that, the team used a graph database to map its Swiss procedure catalog (CHOP) to SNOMED CT, build semantic representations, and explore how those concepts relate to other terminologies, including the Operationen- und Prozedurenschlüssel (OPS) used in Germany.
“For me, it was impressive to see a visual representation of the procedure mapping,” Endrich says. “You start to understand how complex it is, even for one concept.”
That visibility helps on a practical level. It also helps explain why standardization is hard, where the gaps are, and how the model can improve over time.
Triep says one reason procedure mapping is so difficult is that different catalogs use different attributes. In one part of a catalog, a concept may include attributes such as products or duration of treatment. In another, those attributes may be missing or represented elsewhere. A clinical knowledge graph makes it easier to capture those relationships as a whole and show where gaps still need to be closed.
Medication mapping raised a similar challenge. Inselspital was working with an in-house product list that was not highly standardized and often included inconsistent naming conventions. Some entries were free text. Others reflected vendor names or local naming practices. The team mapped those items to ATC codes and RxNorm identifiers to see how well the list aligned with more interoperable classifications.
Putting that mapping into a graph database made it easier to see where coverage was missing, where in-house products did not map cleanly, and where Swiss market products might not be fully represented in those external systems.
The work is still developing, but it has already given the team a much clearer picture of what needs to change.
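A coverage check of the kind described for the medication list might look like the sketch below. The product names, the lookup table, and the normalization step are hypothetical and chosen for illustration. Only ATC is shown, and a real mapping would rely on curated reference data instead of a hand-written dictionary.

```python
# Hypothetical lookup from normalized product names to ATC codes.
ATC_MAP = {
    "paracetamol 500 mg tbl": "N02BE01",
    "metformin 850 mg": "A10BA02",
    "enalapril 10 mg": "C09AA02",
}


def normalize(name):
    """Lowercase and collapse whitespace so trivial spelling variants match."""
    return " ".join(name.lower().split())


def coverage_report(products, atc_map):
    """Split products into mapped and unmapped, and compute the coverage ratio."""
    mapped, unmapped = {}, []
    for raw in products:
        code = atc_map.get(normalize(raw))
        if code:
            mapped[raw] = code
        else:
            unmapped.append(raw)
    coverage = len(mapped) / len(products) if products else 0.0
    return mapped, unmapped, coverage


# Hypothetical in-house list with inconsistent naming, free text, and a vendor name.
PRODUCTS = [
    "Paracetamol  500 mg Tbl",
    "METFORMIN 850 mg",
    "Pain drops (ward stock)",
    "Enalapril 10 mg",
    "VendorBrand XR",
]

mapped, unmapped, coverage = coverage_report(PRODUCTS, ATC_MAP)
print(f"coverage: {coverage:.0%}")
print("unmapped:", unmapped)
```

The unmapped list is the useful output here. It shows which free-text and vendor-named entries need attention before the catalog can align with an external classification.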
Building a Better Base for Clinical Decision Support
For Inselspital, standardization connects directly to a larger goal: better support for clinicians now and, over time, safer patient care and stronger clinical research.
With Epic in place, the team already has a built-in interaction checker, physician notifications, and patient safety rules. Those safeguards are only as reliable as the medication data behind them. If that data is inconsistent or poorly standardized, it becomes harder to run interaction checks, allergy checks, and other safeguards reliably.
“This is actually why we need it,” Endrich says. “It’s not only to have a logistic view, but also to have interaction checks, allergy checks, to have all this combined in one ontology and system.”
Right now, the work is helping the team define the problem more clearly and quantify the gaps in its current medication catalog. Endrich says that is helping shape the next phase of the project. The expectation is that a stronger catalog could support broader and more reliable data-driven processes in the future.
That makes the knowledge graph part of a larger strategic effort. By helping the hospital identify structural gaps in its ontologies and terminology mappings, the graph database is helping lay the groundwork for stronger clinical decision support.
It also supports a broader goal the team sees across Europe: making data more comparable across countries, research projects, and national initiatives. A clearer semantic model gives hospitals a better way to align with external standards and take part in wider interoperability efforts.
End-to-End Thinking Across the Data Pipeline
Endrich’s view is that this work has to be end to end, from the first point of data capture through visualization, analysis, and downstream use. A knowledge graph is not an isolated tool; it is one part of that pipeline.
That perspective helps explain why this work matters. Neo4j is helping Inselspital do more than visualize connected data. It is helping the team strengthen the chain that links documentation, coding, terminology, quality, research, and future clinical decision support.
For hospitals working to standardize care in a more complex digital environment, that is an important step. Better data supports better decisions. Better decisions depend on understanding how information connects. At Inselspital, a clinical knowledge graph is helping the team do exactly that.