Cloud AIOps Architect @ EPAM Systems

Gurgaon, India | Onsite | Full-time | Posted 447 days ago

About this role

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

We are seeking an experienced Cloud AIOps Architect to lead the design and implementation of advanced AI-driven operational systems across multi-cloud and hybrid cloud environments. This role demands a blend of technical expertise, innovation, and leadership to develop scalable solutions for complex IT systems with a focus on automation, machine learning, and operational efficiency.

Responsibilities

- Architect and design the AIOps solution leveraging AWS, Azure, and cloud-agnostic services, ensuring portability and scalability
- Develop an end-to-end automated machine learning (ML) pipeline covering data ingestion, DataOps, model training, and inference across multi-cloud environments
- Design hybrid architectures leveraging cloud-native services like Amazon SageMaker, Azure Machine Learning, and Kubernetes for development, model deployment, and orchestration
- Design and implement ChatOps integration, allowing users to interface with the platform through Slack, Microsoft Teams, or similar communication platforms
- Leverage Jupyter Notebooks in AWS SageMaker, Azure Machine Learning Studio, or cloud-agnostic environments to create model prototypes and experiment with datasets
- Lead the design of classification models and other ML models using AWS SageMaker training jobs, Azure ML training jobs, or open-source tools in a Kubernetes container
- Implement automated rule management systems using Python in containers deployed to AWS ECS/EKS, Azure AKS, or Kubernetes for cloud-agnostic solutions
- Architect the integration of ChatOps backend services using Python containers running in AWS ECS/EKS, Azure AKS, or Kubernetes for real-time interactions and updates
- Oversee the continuous deployment and retraining of models based on updated data and feedback loops, ensuring models remain efficient and adaptive
- Design platform-agnostic solutions to ensure that the system can be ported across different cloud environments or run in hybrid clouds (on-premises and cloud)

Requirements

- 13+ years of overall experience and 7+ years of experience in AIOps, Cloud Architecture, or DevOps roles
- Hands-on experience with AWS services such as SageMaker, S3, Glue, Kinesis, ECS, EKS
- Strong experience with Azure services such as Azure Machine Learning, Blob Storage, Azure Event Hubs, Azure AKS
- Hands-on experience working on the design, development, and deployment of contact centre solutions at scale
- Proficiency in container orchestration (e.g., Kubernetes) and experience with multi-cloud environments
- Experience with machine learning model training, deployment, and data management across cloud-native and cloud-agnostic environments
- Expertise in implementing ChatOps solutions using platforms like Microsoft Teams and Slack, and in integrating them with AIOps automation
- Familiarity with data lake architectures, data pipelines, and inference pipelines using event-driven architectures
- Strong programming skills in Python for rule management, automation, and integration with cloud services
- Experience in Kafka, Azure DevOps, and AWS DevOps for CI/CD pipelines

We offer

- Opportunity to work on technical challenges that may have an impact across geographies
- Vast opportunities for self-development: online university, knowledge sharing opportunities globally, learning opportunities through external certifications
- Opportunity to share your ideas on international platforms
- Sponsored Tech Talks & Hackathons
- Unlimited access to LinkedIn learning solutions
- Possibility to relocate to any EPAM office for short- and long-term projects
- Focused individual development
- Benefit package: health benefits, retirement benefits, paid time off, flexible benefits
- Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)

Skills

Software Engineering
