Key metrics tracked: pipeline success rate, reduction in manual reprocessing, and data error rate after validation.
The client’s data environment was complex, with separate teams maintaining their own definitions, calculations, and reporting logic. This resulted in inconsistent metrics, duplicate transformation efforts, and frequent disputes over “which numbers were correct.”
I led a cross-functional project to design and deliver a unified data model that aligned business logic across the analytics, finance, marketing, and operations teams. Built in BigQuery and orchestrated with Airflow and dbt, the model served as the gold layer: the single source of truth for all critical reporting and analytics.
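To make the idea concrete, here is a minimal sketch of what one gold-layer dbt model could look like. The model, table, and column names (gold_revenue_daily, fct_orders, dim_customers, net_amount) are illustrative assumptions, not the client's actual schema.

```sql
-- models/gold/gold_revenue_daily.sql (illustrative dbt model, not the client's code)
-- One agreed definition of daily net revenue, materialized in BigQuery as a
-- gold-layer table that every downstream report reads from.
{{ config(materialized='table') }}

select
    date_trunc(o.order_date, day)   as revenue_date,
    c.business_unit,
    sum(o.net_amount)               as net_revenue,   -- agreed definition: net of refunds
    count(distinct o.order_id)      as order_count
from {{ ref('fct_orders') }}    as o
join {{ ref('dim_customers') }} as c
  on o.customer_id = c.customer_id
group by 1, 2
```

Because every team reads from the same materialized table, a definition such as "net revenue" is changed in exactly one place rather than in each department's own reporting logic.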
To ensure accuracy and trust, we implemented a robust data validation framework that ran automated checks at every stage of the pipeline. Using SQL tests, Datafold, and GCP-native monitoring, the system detected anomalies, schema changes, and logic mismatches before they could reach production.
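One common way to express such checks is as a dbt singular test: a SQL file that returns the offending rows, so the test fails whenever any are found. The sketch below reconciles the gold layer against a staging source; the table names and the tolerance value are illustrative assumptions.

```sql
-- tests/assert_gold_revenue_matches_source.sql (illustrative singular dbt test)
-- Fails if gold-layer daily revenue drifts from the staging source totals;
-- any rows returned here are reported as test failures.
with gold as (
    select revenue_date, sum(net_revenue) as gold_revenue
    from {{ ref('gold_revenue_daily') }}
    group by 1
),
source as (
    select date_trunc(order_date, day) as revenue_date,
           sum(net_amount)             as source_revenue
    from {{ ref('stg_orders') }}
    group by 1
)
select
    g.revenue_date,
    g.gold_revenue,
    s.source_revenue
from gold as g
join source as s using (revenue_date)
where abs(g.gold_revenue - s.source_revenue) > 0.01   -- tolerance is illustrative
```

With checks like this wired into the pipeline, a logic mismatch surfaces as a failed test run rather than as a wrong number in a dashboard.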
A critical part of the work was stakeholder collaboration: we held workshops with each department to validate the unified business logic, fact-check the transformed data against source systems, and secure full buy-in on the shared definitions.