About this role
The Specialist – Data Platform Engineering is responsible for designing, building, and maintaining scalable data pipelines and systems that enable analytics, insights, and business intelligence across the organization. This role will own the enterprise Data Lake and Databricks environment, ensuring secure, high-performance data flow across MySQL, Kafka, and Spark. The ideal candidate combines strong technical expertise in distributed data systems with hands-on experience in modern data engineering practices.
Roles & Responsibilities
Data Infrastructure Ownership
Own and manage the enterprise Data Lake infrastructure on AWS and Databricks, ensuring scalability, reliability, and governance.
Design, develop, and optimize data ingestion and transformation pipelines from MySQL to Kafka (CDC pipelines) and from Kafka to Databricks using Spark Structured Streaming.
Build and maintain robust batch and real-time data pipelines to support high-volume, high-velocity data needs.
Data Processing & Optimization
Design and implement efficient MapReduce jobs to process and transform large-scale datasets across distributed systems.
Optimize MapReduce workflows for performance, scalability, and fault tolerance in big data environments.
Develop metadata-driven frameworks for processing consistency, lineage, and traceability.
System Reliability & Automation
Implement observability and monitoring systems using Prometheus, Grafana, or equivalent tools to ensure proactive detection and resolution of issues.
Apply best practices in code quality, CI/CD automation (Jenkins, GitHub Actions), and Infrastructure-as-Code (IaC) for consistent deployments.
Continuously optimize system performance and reliability through monitoring, tuning, and fault-tolerant design.
Collaboration & Compliance
Work cross-functionally with Product, Regulatory, and Security teams to ensure compliance, data privacy, and quality across the data lifecycle.
Collaborate with multiple teams to design and deliver end-to-end lakehouse solutions that integrate diverse data sources.
Stay current with emerging technologies in data engineering, streaming, and distributed systems, and contribute to continuous improvement initiatives.

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
4-6 years of experience in data engineering, data platform management, or related domains.
Strong programming expertise in one or more of the following: Scala, Spark, Java, or Python.
Proven experience building event-driven or CDC-based pipelines using Kafka (Confluent or Apache).
Hands-on experience with distributed data processing frameworks such as Apache Spark, Databricks, or Flink.
Deep understanding of AWS cloud services (S3, Lambda, EMR, Glue, IAM, CloudWatch).
Solid experience deploying and managing workloads on Kubernetes (EKS preferred).
Experience designing and managing data lakehouse architectures and implementing data governance principles.
Familiarity with CI/CD pipelines (Jenkins, GitHub Actions) and monitoring frameworks (Prometheus, Grafana, ELK stack).
Excellent problem-solving, communication, and collaboration skills.
Skills Inventory
Data Lakehouse Architecture (AWS, Databricks)
Real-time Data Streaming (Kafka, Spark Structured Streaming)
Distributed Data Processing (Spark, MapReduce, Flink)
Programming (Scala, Python, Java)
CI/CD Automation & Infrastructure as Code
Kubernetes (EKS)
Monitoring & Observability (Prometheus, Grafana)
Data Governance & Metadata Management
Cloud Infrastructure (AWS)
Cross-Functional Collaboration & Problem Solving

At Freshworks, we have fostered an environment that enables everyone to find their true potential, purpose, and passion, welcoming colleagues of all backgrounds, genders, sexual orientations, religions, and ethnicities. We are committed to providing equal opportunity and believe that diversity in the workplace creates a more vibrant, richer environment that boosts the goals of our employees, communities, and business.
Fresh vision. Real impact. Come build it with us.
