This is a fully remote position, open to applicants in Germany.

📋 Description

• Take ownership and enhance our Kubernetes platform across various clusters: manage Helm chart deployments via an OCI registry hosting over 40 charts, enforce policy-as-code using Kyverno, and implement GitOps workflows through Argo CD ApplicationSets with progressive delivery orchestrated by Kargo.

• Lead technical designs for platform projects: define the problem, propose several solutions, evaluate their trade-offs, and advocate for your recommendation. Then see it through to production.

• Strengthen the platform's security framework: implement workload identity through OIDC, enhance runtime security, perform image scanning, and manage secrets.

• Build and maintain custom Kubernetes operators and internal tools in Go and Python that make the team more efficient across clusters. We run the Zalando postgres-operator alongside our own operators.

• Sustain and enhance our observability stack (Prometheus, Grafana, Thanos, OpenSearch): develop dashboards and alerts that provide product teams with clear visibility into their services.

• Keep GitLab CI/CD pipelines fast and reliable; they support around 150 production deployments per month. Manage cross-team changes (rollouts, database migrations, certificate rotations) with care and clear communication.

• Manage and evolve AWS infrastructure (EKS, VPC, IAM, RDS, S3), including dedicated customer environments in their respective AWS accounts; lead cost-efficiency initiatives tracked with OpenCost.

• Handle incidents end to end: alert, resolution, and postmortem analysis.

• Elevate the team's standards through comprehensive code and architecture reviews, mentor junior engineers, and assist in evaluating technical candidates during interviews.

• Serve as the infrastructure liaison for product teams when issues arise or clarity is needed.

• Engage in a shared on-call rotation.

⛳️ Requirements

• Over 5 years of professional engineering experience, with at least 3 years focused on infrastructure, platform, or site reliability engineering.

• Extensive hands-on experience with Kubernetes: cluster management, workload administration, and troubleshooting at scale.

• Proficiency in Helm chart authoring: writing, packaging, and maintaining charts, not just consuming them.

• Familiarity with GitOps methodologies using Argo CD or similar tools (Flux, etc.).

• Knowledge of AWS services (EKS and associated services like IAM, VPC, RDS, S3).

• Experience with Infrastructure as Code (Terraform or equivalent tools).

• Proficiency in Go or Python — our custom operators and internal tools are developed in both languages.

• Proven experience managing production incidents from start to finish — including response, mitigation, and postmortem processes.

• Excellent English communication skills, capable of conveying technical decisions to both engineering and non-technical stakeholders.

🏝️ Benefits

• Flexible working arrangements (remote, office, or hybrid).

• Modern office located in the heart of Hanover for hybrid work.

• Up to 180 days (6 months) of remote work from abroad.

• Competitive compensation, plus benefits including 30 days (6 weeks) of paid vacation.

• Access to modern hardware and software solutions.

Senior Software Engineer, Platform & Infrastructure
