Lead Data Engineer (Remote)

Circana
Full-time
Remote
United Kingdom
Data Engineer

Introduction

We are seeking a skilled and motivated Data Engineer to join a growing global team based in the UK. In this role, you will be responsible for designing, building, and maintaining robust data pipelines and infrastructure on the Azure cloud platform. You will use your expertise in PySpark, Apache Spark, and Apache Airflow to process and orchestrate large-scale data workloads, ensuring data quality, efficiency, and scalability. If you have a passion for data engineering and a desire to make a significant impact, we encourage you to apply!

Job Responsibilities

Data Engineering & Data Pipeline Development

  • Design, develop, and optimize scalable data workflows using Python, PySpark, and Airflow
  • Implement real-time and batch data processing using Spark
  • Enforce best practices for data quality, governance, and security throughout the data lifecycle
  • Ensure data availability, reliability and performance through monitoring and automation.

Cloud Data Engineering

  • Manage cloud infrastructure and cost optimization for data processing workloads
  • Implement CI/CD pipelines for data workflows to ensure smooth and reliable deployments.

Big Data & Analytics:

  • Build and optimize large-scale data processing pipelines using Apache Spark and PySpark
  • Implement data partitioning, caching, and performance tuning for Spark-based workloads.
  • Work with diverse data formats (structured and unstructured) to support advanced analytics and machine learning initiatives.

Workflow Orchestration (Airflow)

  • Design and maintain DAGs (Directed Acyclic Graphs) in Airflow to automate complex data workflows
  • Monitor, troubleshoot, and optimize job execution and dependencies

Team Leadership & Collaboration

  • Lead a team of data engineers, providing technical guidance and mentorship
  • Foster a collaborative environment and promote best practices for coding standards, version control, and documentation.

Desired Experience & Qualifications

  • This is a client-facing role; strong communication and collaboration skills are vital
  • Experience in data engineering with expertise in Azure, PySpark, Spark, and Airflow.
  • Strong programming skills in Python and SQL, with the ability to write efficient and maintainable code
  • Deep understanding of Spark internals (RDDs, DataFrames, DAG execution, partitioning, etc.)
  • Experience with Airflow DAGs, scheduling, and dependency management
  • Knowledge of Git, Docker, Kubernetes, and Terraform, and the ability to apply DevOps best practices to CI/CD workflows
  • Excellent problem-solving skills and ability to optimize large-scale data processing.
  • Experience in leading teams and working in Agile/Scrum environments
  • A proven track record of working effectively with global remote teams

Desirable:

  • Experience with data modelling and data warehousing concepts
  • Familiarity with data visualization tools and techniques
  • Knowledge of machine learning algorithms and frameworks