ProdOps Engineer 3 @ Black Duck Software, Inc.

Bangalore | Onsite | Full-time | Posted 76 days ago

About this role

Black Duck Software, Inc. helps organizations build secure, high-quality software, minimizing risks while maximizing speed and productivity. Black Duck, a recognized pioneer in application security, provides SAST, SCA, and DAST solutions that enable teams to quickly find and fix vulnerabilities and defects in proprietary code, open source components, and application behavior. With a combination of industry-leading tools, services, and expertise, only Black Duck helps organizations maximize security and quality in DevSecOps and throughout the software development life cycle.

Production Operations Engineer 3 – P3 (ProdOps / SRE)

Location: Bangalore – Hybrid

Experience: 5–8 years

Shift: 24/7 rotational shifts (including night shifts and weekend on-call)

About the Role

The Production Operations Engineer will support and stabilize large-scale production systems with a focus on incident management, monitoring, site reliability, and customer-facing communications. This is a hands-on role requiring ownership of critical production issues in a 24/7 environment.

Key Responsibilities

Own and manage Critical and High production incidents end-to-end.

Participate in SWARM / Tech Bridge calls and lead incidents during assigned shifts.

Improve MTTR, MTTA, alert quality, and operational stability.

Perform root cause analysis (RCA) and drive corrective actions.

Monitor production systems and proactively detect issues.

Automate operational tasks using Go, Python, Shell, or Perl.

Maintain dashboards, alerts, runbooks, and SOPs.

Handle customer-facing communications during incidents.

Coordinate with Engineering, Product, CloudOps, and Support teams.

Guide junior engineers and support shift handovers.

Lead automation initiatives to reduce toil and manual intervention.

Write and review operational automation using Go / Python / Shell / Perl.

Act as a technical reviewer for reliability‑critical changes.

Influence architecture decisions with operability and reliability in mind.

Own and standardize runbooks, SOPs, and disaster recovery processes.

Leadership & Mentorship

Provide technical leadership and mentorship to ProdOps engineers.

Guide shift teams during complex situations.

Support onboarding, training, and upskilling of team members.

Drive operational maturity across the team.

Tech Stack & Expertise

Required Technologies

Containers & Orchestration: Docker, Kubernetes, Helm

Cloud Platforms: AWS / GCP / Azure

Infrastructure as Code: Terraform

CI/CD: Jenkins, Harness, GitHub Actions, ArgoCD, GitLab CI

Monitoring & Observability: Prometheus, Grafana, ELK, Datadog, New Relic, Loki

Version Control: Git, GitHub, GitLab

Scripting: Go, Python, Shell, or Perl

Qualifications

6+ years of experience in Production Operations, SRE, or Cloud Reliability roles.

Proven experience leading major production incidents in customer‑facing systems.

Strong background in distributed systems, Kubernetes, and cloud environments.

Experience mentoring engineers and driving reliability initiatives.

Excellent written and verbal communication skills.

What We Offer

An opportunity to be part of a dynamic and innovative team.

Inclusive and collaborative work environment.

Continuous learning and professional development opportunities.

Exposure to large-scale and customer-critical systems.

Black Duck considers all applicants for employment without regard to race, color, religion, sex, gender preference, national origin, age, disability, or status as a Covered Veteran in accordance with federal law. In addition, Black Duck complies with applicable state and local laws prohibiting discrimination in employment in every jurisdiction in which it maintains facilities. Black Duck also provides reasonable accommodation to individuals with a disability in accordance with applicable laws.

Skills

4500 - Cloud Ops (GQP)
