National Mortgage Association Slashes Data Lineage Complexity with Neo4j


5M+ Nodes

12M+ Relationships

Financial institutions face increasing regulatory pressure to understand and document their data pipelines. National Mortgage Association knows this challenge well. As a provider of liquidity to mortgage markets, the association must maintain absolute clarity on data movement and transformation.

The association plays a crucial role in the country’s housing market and generates enormous volumes of data, ranging from loan application details, property appraisals, and borrower financial information to payment histories and market analyses. These data elements flow between internal and external systems, creating complex relationships that must be governed.

The association needed a system that could show exactly how this data moved through their organization and to partners. When a data element changed in their database, they needed to see how this data was used downstream. This visibility was essential for the CFO to confidently sign off on financial reports that directly impact the country’s housing market.

Data lineage failures create substantial regulatory risk. When analysts can’t determine how data transforms through systems, compliance reporting becomes unreliable and financial institutions face potential penalties.

Traditional documentation methods proved inadequate. Static documents and diagrams quickly became outdated, and relational databases couldn’t efficiently model the complex web of interconnections between data elements, systems, and processes.

Selecting Neo4j as the Foundation for Data Lineage

The association’s initial metadata repository system had operated for a number of years on a Neo4j graph database, but remained limited in its impact. Though the existing solution captured basic metadata relationships, it lacked the data quality metrics, operational context, and coverage needed to gain organization-wide trust. The system’s limited scope and value kept it siloed within a small technical team.

The critical turning point came when the CFO mandated enhanced financial reporting requirements and transparency. This executive-level initiative created the momentum and resources needed to transform the existing framework into something truly valuable. 

Rather than replacing the metadata system, the association partnered with a consultancy to upgrade its Neo4j foundation and integrate additional data sources.

National Mortgage Association evaluated two paths forward as part of this upgrade: enhance their existing Neo4j implementation or migrate to a competing graph solution. The evaluation focused on query performance, ease of use, and total cost of ownership. Neo4j ultimately won for three decisive reasons:

  • Cypher query language proved intuitive for the association’s analysts
  • Neo4j Browser provided superior visualization capabilities for query results
  • Competing solutions required multiple additional tools to match Neo4j’s functionality

The implementation runs Neo4j Enterprise Edition on AWS infrastructure. The solution ingests data from multiple sources: static data lineage from AWS RDS and Collibra, and real-time events through AWS SQS, which Talend processes before loading into Neo4j. A three-node Neo4j cluster provides high availability across AWS availability zones, with EBS storage for durability.

Figure: National Mortgage Association’s AWS Cloud Infrastructure
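
To make the real-time ingestion path more concrete, the following is a minimal sketch of how a single lineage event taken from a queue might be turned into an idempotent graph write. It is an illustration only: the case study states that Talend handles this step, and it does not describe the event format or the graph schema. The event fields, the `DataElement` label, the `FLOWS_TO` relationship type, and the function name are all assumptions introduced here.

```python
import json

# Hypothetical Cypher upsert. The label, relationship type, and property
# names are assumptions; the case study does not publish the real schema.
UPSERT_LINEAGE = """
MERGE (src:DataElement {id: $source_id})
MERGE (dst:DataElement {id: $target_id})
MERGE (src)-[f:FLOWS_TO {transformation: $transformation}]->(dst)
SET f.last_seen = $observed_at
"""

# Hypothetical event schema: every lineage event must carry these fields.
REQUIRED_FIELDS = ("source_id", "target_id", "transformation", "observed_at")


def event_to_params(message_body: str) -> dict:
    """Parse one queue message body into Cypher parameters.

    Incomplete events are rejected so they never reach the graph;
    fields outside the assumed schema are dropped.
    """
    event = json.loads(message_body)
    missing = [field for field in REQUIRED_FIELDS if field not in event]
    if missing:
        raise ValueError(f"lineage event missing fields: {missing}")
    return {field: event[field] for field in REQUIRED_FIELDS}


if __name__ == "__main__":
    body = json.dumps({
        "source_id": "origination.loan_amount",
        "target_id": "servicing.unpaid_balance",
        "transformation": "subtract payments",
        "observed_at": "2024-01-01T00:00:00Z",
    })
    print(UPSERT_LINEAGE.strip())
    print(event_to_params(body))
```

Using `MERGE` rather than `CREATE` means that replaying the same event updates the existing relationship instead of duplicating it, which matters when a queue delivers a message more than once.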


Transforming Data Governance with Graph Technology

National Mortgage Association has transformed its approach to data governance. Neo4j now forms the backbone of their data lineage and quality tracking, enabling transparency that was previously impossible.

The solution delivers concrete, measurable benefits:

  • Analysts can trace upstream and downstream impacts of data changes in seconds instead of hours
  • Data quality indicators now accompany lineage information, providing context for decision-making
  • Custom interfaces built on Neo4j empower both technical and business users to explore data relationships
  • Neo4j enables regulatory reporting that satisfies compliance requirements
  • The graph database contains 5.6 million nodes and 12 million relationships, tracking data elements across the enterprise

These technical achievements translate directly to business value. The CFO can now confidently sign financial reports, knowing exactly how data flows through systems. Analysts can instantly assess the impact of proposed changes and prevent unintended consequences by running multi-hop graph queries such as: ‘Find all data elements that originate in System A, pass through any of our mortgage processing systems, and ultimately get published, and show each transformation along the way.’ And when errors occur, teams rapidly identify root causes by following data pathways.
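
The multi-hop question quoted above can be sketched on a toy in-memory graph. The example below is not the association's implementation: the element names, system names, and the Cypher text are hypothetical, and the traversal runs in plain Python rather than against Neo4j. It shows the shape of the query: start at elements owned by one system, follow lineage edges, keep only paths that end at a published element and cross a mortgage processing system, and report each transformation.

```python
from typing import Dict, FrozenSet, Iterator, List, Set, Tuple

# A Cypher query of roughly this shape could express the same question.
# Untested sketch: labels and properties are assumptions.
CYPHER_SKETCH = """
MATCH path = (src:DataElement {system: $origin})-[:FLOWS_TO*1..10]->(dst:DataElement {published: true})
WHERE any(n IN nodes(path) WHERE n.system IN $via_systems)
RETURN [r IN relationships(path) | r.transformation] AS transformations
"""

Hop = Tuple[str, str, str]  # (source element, transformation, target element)

# Toy lineage: element -> list of (downstream element, transformation).
EDGES: Dict[str, List[Tuple[str, str]]] = {
    "a.loan_amount": [("orig.loan_amount_usd", "currency normalization")],
    "orig.loan_amount_usd": [("svc.unpaid_balance", "subtract payments"),
                             ("tmp.audit_copy", "copy")],
    "svc.unpaid_balance": [("report.total_balance", "aggregate by pool")],
    "b.rate": [("report.avg_rate", "average")],
}
# Owning system of each element (hypothetical names).
SYSTEM: Dict[str, str] = {
    "a.loan_amount": "System A",
    "orig.loan_amount_usd": "Origination",
    "svc.unpaid_balance": "Servicing",
    "tmp.audit_copy": "Scratch",
    "report.total_balance": "Reporting",
    "b.rate": "System B",
    "report.avg_rate": "Reporting",
}
PUBLISHED: Set[str] = {"report.total_balance", "report.avg_rate"}


def paths(start: str, visited: FrozenSet[str] = frozenset()) -> Iterator[List[Hop]]:
    """Depth-first search yielding every hop list from start to a published element."""
    visited = visited | {start}
    for target, transformation in EDGES.get(start, []):
        if target in visited:  # guard against cycles in the lineage graph
            continue
        hop = (start, transformation, target)
        if target in PUBLISHED:
            yield [hop]
        for rest in paths(target, visited):
            yield [hop] + rest


def trace(origin_system: str, via_systems: Set[str]) -> List[List[Hop]]:
    """Return published paths that start in origin_system and cross via_systems."""
    results = []
    for element, system in SYSTEM.items():
        if system != origin_system:
            continue
        for path in paths(element):
            # Targets of all hops except the last are the intermediate elements.
            intermediates = [hop[2] for hop in path[:-1]]
            if any(SYSTEM[node] in via_systems for node in intermediates):
                results.append(path)
    return results


if __name__ == "__main__":
    for path in trace("System A", {"Origination", "Servicing"}):
        for source, transformation, target in path:
            print(f"{source} --[{transformation}]--> {target}")
```

In a graph database the traversal is expressed declaratively as a path pattern, whereas a relational model would typically need one self-join per hop or a recursive query to answer the same question.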

National Mortgage Association has constructed a comprehensive view of their data ecosystem with Neo4j. What once took days of manual tracing now happens in seconds, supporting faster decision-making and more responsive operations.

Partners

  • Amazon Web Services (AWS)

Industry

  • Government & Municipality

Products Used

  • Neo4j Enterprise
  • Global