Enabling seamless data migration at scale

Transport for London (TfL) set out to modernise its Road User Charging system, which involved consolidating and migrating vast amounts of historical data from its legacy infrastructure. We designed a custom data migration strategy and system to efficiently transfer petabytes of structured and unstructured data, ensuring a seamless transition with parallel-run support.
Deliverable: data migration solution
Services: data engineering and migration
Technologies and Tools: Databricks, Microsoft Fabric, Azure Data Lake, Python, SQL
TfL’s road user charging system: Consolidating petabytes of data
challenge
Transport for London (TfL) manages London’s public transport network and roads, and oversees congestion charging and mobility initiatives.
To modernise its Road User Charging system and bring operations in-house, TfL needed to consolidate petabytes of structured and unstructured data and migrate it to Azure. We helped make that shift seamless and efficient.
A four-step data migration solution design in a high-complexity environment
approach
Throughout the migration, we protected data integrity, business continuity, system performance, and compliance. To minimise downtime, we paid careful attention to data analysis, preparation, validation, and risk mitigation, all within a structured four-step process:

1. Requirement Analysis and Planning:
to identify systems, establish compliance, and define performance metrics;

2. Data Profiling and Assessment:
to analyse and cleanse source data; 

3. Data Mapping and Transformation Design:
to define mappings, transformation rules, and error-handling mechanisms;

4. Migration Strategy and Execution Plan:
to select the migration approach, design cloud architecture, and create a detailed execution plan and timeline.
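
To make step 3 more concrete, here is a minimal sketch of how field mappings, transformation rules, and error handling could be declared in Python. It is an illustration only, not the system built for TfL: the field names (`VRM`, `CHARGE_PENCE`), the rules, and the helper `transform_record` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class FieldMapping:
    source: str                      # column name in the legacy system
    target: str                      # column name in the target schema
    transform: Callable[[Any], Any]  # rule applied to the source value


# Hypothetical mappings; the real field names are not given in the case study.
MAPPINGS = [
    FieldMapping("VRM", "vehicle_registration", lambda v: v.replace(" ", "").upper()),
    FieldMapping("CHARGE_PENCE", "charge_gbp", lambda v: int(v) / 100),
]


def transform_record(record: dict, mappings=MAPPINGS):
    """Return (target_row, errors). A failing field is reported, not silently dropped."""
    row, errors = {}, []
    for m in mappings:
        try:
            row[m.target] = m.transform(record[m.source])
        except (KeyError, ValueError, TypeError, AttributeError) as exc:
            # Record which source field failed and why, so it can be reviewed later.
            errors.append(f"{m.source}: {type(exc).__name__}")
    return row, errors


good, good_errors = transform_record({"VRM": "ab12 cde", "CHARGE_PENCE": "1500"})
bad, bad_errors = transform_record({"VRM": None, "CHARGE_PENCE": "n/a"})
```

Keeping the mappings as data rather than hard-coded logic means a rule can be reviewed or changed without touching the code that applies it.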

Smart data orchestration: Managing complexity with dynamic pipelines
solution
Our solution design managed the complexity of varied data types and structures and of intricate mapping logic. At the core of our migration system was metadata-based dynamic pipeline orchestration, which captured pipeline runs and data validation issues, produced reconciliation reports, and presented the results on a dashboard.
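
The idea of metadata-based orchestration can be sketched in a few lines of plain Python. This is a simplified illustration under our own assumptions, not the production implementation: `PipelineSpec`, `run_pipelines`, and the sample pipelines are hypothetical names, and an in-memory dictionary stands in for the target store.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class PipelineSpec:
    """One metadata entry: what to read and how to validate it."""
    name: str
    extract: Callable[[], Iterable[dict]]
    validate: Callable[[dict], bool]


def run_pipelines(specs, target):
    """Run every pipeline described by metadata and return one log entry per run."""
    run_log = []
    for spec in specs:
        source_count, issues = 0, []
        for row in spec.extract():
            source_count += 1
            if spec.validate(row):
                target.setdefault(spec.name, []).append(row)
            else:
                issues.append(row)  # kept for the validation report
        loaded = len(target.get(spec.name, []))
        run_log.append({
            "pipeline": spec.name,
            "source_count": source_count,
            "loaded_count": loaded,
            "issue_count": len(issues),
            # Reconciliation: every source row is either loaded or logged as an issue.
            "reconciled": source_count == loaded + len(issues),
        })
    return run_log


# Hypothetical metadata; adding a pipeline means adding an entry, not new code.
SPECS = [
    PipelineSpec("accounts", lambda: [{"id": 1}, {"id": 2}, {"id": None}],
                 lambda r: r["id"] is not None),
    PipelineSpec("payments", lambda: [{"id": 10, "amount": 5}],
                 lambda r: r["amount"] > 0),
]

target = {}
run_log = run_pipelines(SPECS, target)
```

A run log shaped like this is the kind of record a dashboard could read to show run status, validation issues, and reconciliation results per pipeline.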

Our seasoned data engineering team carefully evaluated the project’s needs to select the most suitable tooling. While Microsoft Fabric and Data Factory were the original preferences, our assessment demonstrated that Databricks was the best fit for the job. Its cloud-independent nature provided flexibility, while its ability to handle massive-scale workloads ensured a seamless migration of TfL’s vast data assets. Additionally, Databricks offers superior data governance and cost control, making it the optimal choice for maintaining efficiency, security, and scalability.

Powering TfL’s next-gen road user charging system
impact
By successfully consolidating and migrating petabytes of data, we helped TfL streamline the operation of its modernised Road User Charging system. Our solution design enhanced performance efficiency through optimal infrastructure sizing and precise tooling, while providing a flexible architecture to accommodate future system changes. The work involved mapping and integrating data from seven complex source systems and 26 data bodies, processing over 9.16 billion documents, migrating more than 5 billion records, and securely transferring 1 petabyte of unstructured data into Microsoft Azure. The result is a future-ready foundation for digital transformation at scale.
7 source systems mapped and integrated
5.09 bn records migrated
1 PB of unstructured data transferred securely to Azure