arbeitnow

Platform Engineer @ Zalion

Munich | Onsite | Full-time | Posted 126 days ago


About this role

Zalion is on a mission to eliminate repetitive procurement work through agentic AI. We’re building autonomous agents that operate deep within enterprise procurement — navigating messy data, legacy systems, and complex workflows to deliver real impact.

Join us early and help define how enterprise AI is done right.

Tasks

You will:

- Own our platform foundations end-to-end, from AWS architecture and IaC to CI/CD, observability, and incident readiness.
- Build and evolve secure, scalable AWS infrastructure (networking, compute, storage, IAM) optimized for reliability and cost.
- Design and maintain CI/CD pipelines on GitHub that are fast, repeatable, and developer-friendly (clear feedback loops, safe deploys, strong defaults).
- Define and operate infrastructure using Terraform, with clean modules, sensible standards, and automated validation.
- Improve developer experience through golden paths: templates, self-service environments, paved roads for deployments, and internal tooling that removes friction.
- Drive availability, scalability, and resilience: deployment strategies, rollbacks, capacity planning, DR thinking, and performance tuning.
- Implement pragmatic security-by-default: least privilege IAM, secrets management, secure supply chain, and guardrails that enable speed without compromising safety.
- Establish and refine observability and reliability practices (SLOs/SLIs, monitoring, alerting, postmortems, runbooks) that scale with the team.
- Partner closely with product engineering to reduce operational load and keep delivery velocity high as Zalion grows.

Requirements

- Strong experience as a Platform / DevOps / Site Reliability Engineer in product teams shipping to production.
- Deep practical knowledge of AWS: networking, IAM, security controls, and designing for failure.
- Hands-on expertise with Terraform: modules, state strategy, DRY patterns, environment separation, and automated reviews.
- Solid CI/CD engineering experience with GitHub: pipeline design, artifact/versioning, deployment safety, and fast feedback loops.
- A strong mindset for reliability and operability: you think in failure modes, automation, and measurable outcomes (SLOs).
- Security awareness and discipline: you build guardrails that make the secure path the easy path.
- A builder mindset: you ship improvements, measure impact (lead time, deploy frequency, MTTR), and iterate.
- Comfort with ambiguity and ownership: you proactively identify platform bottlenecks and fix them without waiting for perfect specs.
- 4+ years of experience in relevant roles (startup/scale-up experience is a plus).

Benefits

- Build the platform behind agentic AI systems that run in real enterprise environments
- Massive autonomy, zero bureaucracy
- Immediate impact: your work accelerates every engineer and every release
- Modern stack, no legacy constraints
- Competitive salary + meaningful equity
- High-end equipment

🛠️ Tech Stack You’ll Work With

- AWS (core services: compute, networking, IAM, logging/monitoring, managed data services)
- Terraform (modules, workspaces, validation, state management)
- GitHub (Actions, CI/CD workflows, checks, release automation)
- Container orchestration (e.g., ECS/Fargate and/or Kubernetes depending on evolution)
- Observability tooling (metrics, logs, tracing; e.g., Grafana/Prometheus/OpenTelemetry and friends)
- Security tooling (SAST/DAST, dependency scanning, secrets scanning, policy as code)

Skills

IT
