Senior Scientific Data Engineer

Chemify
On-site
Glasgow, United Kingdom
Data Engineer

About Chemify

Chemify is revolutionising chemistry. We are creating a future where the synthesis of previously unimaginable molecules, drugs, and materials is instantly accessible. By combining AI, robotics, and the world’s largest continually expanding database of chemical programs, we are accelerating chemical discovery to improve quality of life and extend the reach of humanity.

Job Description:

We are seeking a Senior / Lead Scientific Data Engineer to lead the development of scalable, reliable data systems for scientific and experimental data.

You will architect and maintain pipelines that ingest, clean, and serve chemistry and drug discovery datasets, ensuring high performance and reproducibility. This role requires a strong foundation in Python, PostgreSQL, and modern data engineering practices, and a keen interest in working in a cross-functional environment spanning software, chemistry, operations and program management.

If you enjoy solving complex technical challenges that make a real-world impact, are a natural communicator, and are energised by working closely with scientists using cutting-edge technologies, then we’d love to welcome you to our team.

Key Responsibilities:

  • Develop scalable data models and workflows covering a wide range of use cases in AI-driven chemical synthesis and manufacturing.
  • Lead on the architecture and optimisation of our data warehouse and data access layers, enabling our analytics team to rapidly deliver key operational insights.
  • Design, implement, and maintain robust data pipelines for a wide range of internal, client-specific and literature-based datasets.
  • Develop scalable frameworks for data wrangling, transformation, and validation.
  • Define and champion data engineering best practices (versioning, testing, documentation, governance).
  • Support feature teams with expertise in domain modelling and query optimisation.
  • Mentor junior colleagues, providing guidance on technical challenges.
  • Contribute to team-wide initiatives, including code reviews, design discussions, process improvements and workstream planning.

What you’ll bring:

  • BSc in a scientific discipline.
  • 5+ years of commercial data engineering experience, preferably within a life sciences or AI drug discovery context.
  • Expertise in SQL/PostgreSQL (schema design, query optimisation, indexing, partitioning).
  • Experience enabling analytics teams by building data models for BI tools and dashboards.
  • Strong programming skills in Python, with experience in building and maintaining data-intensive applications.
  • Deep understanding of ETL concepts and building production-grade pipelines.
  • Experience orchestrating workflows and pipelines with Argo Workflows, Prefect or similar.
  • Familiarity with cloud-based data services (AWS/GCP/Azure).
  • Experience using AI-assisted coding and development tools (e.g. Claude Code, Cursor) as part of modern best practices.
  • Strong communication skills and a collaborative approach to mentoring and teamwork.

Beneficial Skills:

  • Interest in chemistry, manufacturing, or robotics.
  • Hands-on experience with scientific or drug discovery data (chemical, biological, or lab data).
  • Interest in semantic technologies (Graph Databases, Ontologies).
  • Exposure to ML engineering or high-performance computing environments.
  • Experience with Infrastructure as Code tools such as Terraform or AWS CDK.