About this role
Skillset
Hands-on experience with GCP (BigQuery, Composer, Dataproc, Dataflow, Pub/Sub)
Working knowledge of Databricks for data processing and pipelines
Strong programming skills in SQL and PySpark
Experience with Oracle and SQL Server databases
Understanding of data modelling (dimensional / medallion basics)
Experience in batch pipelines and exposure to streaming
Familiarity with data ingestion, ETL/ELT processes
Basic understanding of data quality and validation
Exposure to Airflow/Composer and scheduling
Working knowledge of Git and CI/CD

Detailed Responsibilities
Develop and maintain batch and streaming data pipelines
Build SQL and PySpark transformations for large datasets
Ingest data from multiple sources into BigQuery/cloud platforms
Implement data transformation logic as per requirements
Work with teams to consume curated datasets
Develop workflows using Cloud Composer (Airflow)
Perform data validation and reconciliation
Optimise queries and pipelines for performance
Support migration of legacy ETL jobs to cloud
Troubleshoot pipeline and performance issues
Participate in testing, UAT, and deployment
Document pipelines, transformations, and sources
Collaborate with analytics and reporting teams