Senior Azure Site Reliability Engineer (WFH) - #34701 @ Manila Recruitment

Philippines | Onsite | Full-time | Posted 75 days ago

About this role

You will be responsible for provisioning and managing cloud infrastructure on the Azure public cloud to support organizational needs. The Site Reliability Engineer (SRE) is responsible for ensuring the reliability, availability, and performance of cloud-based infrastructure and applications deployed on Microsoft Azure. This role involves automating operations, monitoring system health, optimizing performance, and troubleshooting complex issues to maintain a highly available and secure cloud environment. The SRE will work closely with development, security, and IT operations teams to enhance cloud solutions, implement best practices, and support scalable and resilient systems.

- Deploy and manage Azure cloud services, including Virtual Machines, Storage, Redis, Azure SQL databases, virtual networks, and Azure Kubernetes Service (AKS) clusters.
- Automate provisioning, configuration, and deployments using PowerShell, Bash, and Ansible.
- Deliver and deploy Azure infrastructure using Infrastructure as Code (IaC), specifically Azure Bicep.
- Review, configure, and implement monitoring functionality to give Level 1 support teams the best possible visibility and transparency.
- Implement and troubleshoot CI/CD pipelines for application deployments in Azure DevOps, TeamCity, and Octopus.
- Maintain system reliability using Azure Monitor, Application Insights, Log Analytics, Prometheus/Grafana, Splunk, Opsgenie, and Slack.
- Optimize the performance and cost efficiency of Azure resources.
- Train junior members of the team to deliver best-of-breed solutions on the Azure public cloud.
- Review, manage, and troubleshoot Azure Kubernetes Service (AKS) clusters.
- Review and manage cloud and on-premises servers, including AKS, covering OS and RabbitMQ (RMQ) upgrades, security patches, and application service support.
- Respond to system alerts, failures, and security incidents.
- Perform root cause analysis (RCA) and implement preventive measures.
- Provide Level 2 on-call support based on a pre-approved schedule (including weekends).
- Review the network and security design for all infrastructure and applications hosted in Azure.
- Continuously promote better ways to deliver infrastructure solutions on the Azure cloud.
- Propose the adoption of new approaches, patterns, techniques, and ideas recommended by industry standards and trends.
- Work closely with software development and network teams to enhance platform reliability and identify better approaches.
- Administer and optimize Linux-based systems used for application hosting, ensuring stability, security, and performance in production and non-production environments.
- Troubleshoot issues in Linux operating systems, services, and middleware components to support application availability.

Requirements

- At least 3 years of proven experience delivering infrastructure solutions on the Azure cloud.
- 5+ years of hands-on experience with infrastructure design and deployment using PaaS, SaaS, and IaaS cloud offerings.
- At least 2 years of experience with Windows Server.
- Experience with either Azure ARM templates or Azure Bicep.
- At least 3 years of experience in Linux administration and managing Linux-based operating systems and applications.
- At least 2 years of hands-on experience designing, building, and deploying containerized runtime environments based on Azure Kubernetes Service (AKS).
- 1+ years of proven experience administering RabbitMQ clusters and Nginx.
- Proven experience with scripting languages such as PowerShell, Python, JavaScript, and Bash.
- Experience using Splunk, Grafana, and Opsgenie is an asset.

Advantageous skills:

Relevant certifications

Mid-Senior level
