About DFT Operator
Join Our Team at DFTO
DFTO is the government’s public sector rail owning group. Its purpose is to bring all currently privately owned train operators into public ownership in advance of the creation of Great British Railways in 2027 - and to deliver improvements in the here and now by unifying and integrating train operations under common public ownership.
DFTO has over 30,000 employees, runs over 8,500 services a day and delivers over 640 million customer journeys across its networks every year. 7,000 people joined the railway family in the last year.
Major improvements are being delivered by DFTO train operators (TOCs) that are already under public ownership - these are LNER, Northern, TransPennine Express (TPE), Southeastern, South Western Railway (SWR), c2c, Greater Anglia and WM Trains.
We work closely with the DfT but operate independently with our own governance and leadership teams. Our priority is ensuring efficient, dependable rail services for everyone.
Primary Purpose of Job:
The Analytics Engineer is a core member of the DFTO Data function, responsible for the hands-on design and delivery of data products across the Common Data Service portfolio. The portfolio is DFTO's cross-industry data capability: ingesting, standardising, and publishing shared data products for use across the GB rail ecosystem, in preparation for the establishment of Great British Railways.
This is a “full-stack” data delivery role, combining data engineering, analytical modelling, and cross-organisational working. The postholder takes end-to-end ownership of their data products – from problem definition and ingestion pipeline design through to the analytical and presentation layer that makes those products discoverable, interpretable, and usable by functional teams across the GB railway ecosystem. The environment is genuinely multi-organisational from day one: the postholder will work with counterparts across train operating companies (TOCs), Rail Delivery Group (RDG), and Network Rail (NR) as delivery peers, earning credibility through the quality and consistency of their engineering output.
Key Responsibilities:
Cross-industry data product delivery
- Engage with functional teams across DFTO, TOCs, NR, and RDG to translate domain expertise and analytical need into well-scoped, deliverable data product designs, working with the Principal Analytics Engineer to maintain engineering coherence across concurrent initiatives.
- Build and deliver shared data products that are catalogued, governed, discoverable, and engineered to a standard that remains maintainable beyond the initial build effort.
- Design data models that reflect real-world railway concepts (e.g., passenger experience, train service delivery, rolling stock) and which support consistent, reusable analytics across the industry.
- Develop the analytical and presentation layer on top of shared data products (e.g., summaries, visualisations, and contextual documentation) so that the output is usable by functional teams on a self-serve basis.
- Ensure data products are structured, documented, and published in a way that supports machine learning, AI, and workflow automation use cases – including clear schema definitions, quality metadata, and access patterns that can be consumed programmatically.
- Document data processes, schemas, and transformation logic to a standard that allows engineers and analysts outside the central team to understand, validate, and build upon the outputs.
Data integration and modelling
- Build and maintain data ingestion pipelines across a multi-cloud platform environment, drawing in feeds from operational, performance, commercial, and third-party source systems across the railway ecosystem.
- Design and implement layered data transformations from raw ingestion through to cleansed, analytics-ready models, maintaining adherence to agreed architectural patterns.
- Develop reusable, generalisable ingestion and transformation patterns rather than bespoke per-source implementations, so that adding a new data source to the portfolio is a configuration exercise rather than a new engineering project.
- Contribute to shared data standards across the ecosystem – working with counterparts in TOCs, NR, and RDG to align schemas, definitions, and data quality expectations so that data products built at any level can interoperate with and build upon each other.
- Support cross-organisational data sharing at a technical level: governed access patterns, data catalogue publication, metadata standards, and the API or query surface through which data consumers interact with shared products.
Data engineering standards and governance practices
- Apply DataOps disciplines consistently across all delivery: CI/CD pipelines, Git version control, environment lifecycle management (development, test, production separation), role-based access controls, and peer review processes.
- Contribute to the definition and continuous improvement of shared data engineering standards across the cross-industry delivery community, including counterparts in TOCs, NR, and RDG.
- Maintain data quality and data catalogue entries across assigned products, including lineage documentation, quality metrics, and lifecycle status.
- Identify and surface delivery-level friction (e.g., supplier data access gaps, schema conflicts, governance bottlenecks) as structured inputs to the data standards and governance function for escalation and resolution.
Stakeholder and community engagement
- Operate across organisational boundaries as a matter of routine, building credibility with counterparts who have deep domain knowledge of railway source systems and operations, and earning trust through the quality and consistency of engineering output rather than through positional authority.
- Support the wider community of data analysts and engineers across the federated TOC, NR, and RDG ecosystem in understanding and applying shared data engineering standards in practice – through documentation, direct engagement, and leading by example.
Knowledge, Skills, Experience & Technical Qualifications:
- Strong SQL and proficiency in at least one analytics programming language, with Python strongly preferred.
- Hands-on experience building and maintaining data ingestion and transformation pipelines, with a preference for config-driven and reusable patterns.
- Familiarity with layered data modelling approaches (staging, cleansed, and analytics-ready layers, or equivalent medallion-architecture thinking) and transformation frameworks such as dbt.
- Comfort working within cloud-native data platform environments, including columnar storage formats, partitioning, and query-optimised storage on AWS, Microsoft Azure/Fabric, or equivalent.
- DataOps practices: CI/CD pipelines, Git versioning, and environment lifecycle management.
- Familiarity with data visualisation and BI tooling (Power BI, Tableau, or similar) and an instinct for communicating data clearly to non-technical audiences.
- Strong analytical problem-solving ability: able to frame open-ended or poorly defined questions as structured data problems, and to communicate findings clearly to decision-makers.
- Ability to work independently across organisational boundaries, build relationships without formal authority, and translate high-level objectives into actionable delivery plans.
- Clear written and verbal communication, including the ability to document technical work to a standard usable by people outside the immediate team.
Desirable
Experience and delivery capability are more important than formal qualifications. We welcome candidates from non-traditional backgrounds who can demonstrate strong problem-solving capability and technical aptitude.
- A degree in a STEM, quantitative, or related field may be beneficial but is not required.
- Familiarity with data catalogue and data governance tooling (e.g., DataZone or similar metadata and lineage platforms).
- Exposure to semantic modelling, ontology design, or reference data management.
- Experience working in multi-stakeholder environments where influence has to be earned through quality of thinking and delivery.
- Familiarity with railway industry data sources such as TRUST, Darwin, LENNON, timetables, or train diagrams is desirable but not expected.
Organisational Context
The postholder will be part of a new central Data function within DFTO DDaT, working alongside a Principal Analytics Engineer and under the strategic direction of the Group Head of Data. The wider working community spans data professionals across publicly owned TOCs, NR, and RDG, all working toward the shared data capability which Great British Railways will require. The postholder is expected to engage with that community actively, contributing to and drawing from a genuinely collaborative cross-industry environment.
Vacancy Details:
Duration: Permanent
Location: London Waterloo
Salary: up to £72,700
Closing date: 16th June 2026
DFTO Benefits:
Annual Leave: Starting at 25 days and rising by one additional day for each completed year of service over the first 5 years, up to a maximum of 5 additional days (30 days in total)
DC Pension Scheme: 10% Employer contribution, 5% Employee contribution
Opportunities to learn and network across the wider industry
Additional Information…
Disclaimer: Candidates applying for this position on a secondment basis must inform their line manager prior to submitting their application. This is to ensure transparency and facilitate any necessary discussions regarding workload and responsibilities.
About our people and the recruitment process - We're an inclusive employer of choice and we welcome applications from everyone! We encourage our colleagues to work flexibly, as we know traditional working patterns don't always fit. If you want to consider working flexibly, just let us know and we'll do our best to help and to invest in your career with us while you maintain a healthy work-life balance.
Contact: If you have any questions or require reasonable adjustments, please contact Amra.Hurley@dftoperator.co.uk.
Please do not email CVs to us; your application must be made by clicking the 'Apply' button.