Remotery

Lead DevOps/SRE Engineer

Posted 1 hour ago

📋 Description

• Take charge of and enhance Launch Potato's cloud infrastructure, CI/CD platform, and compliance framework.

• Develop the Site Reliability Engineering (SRE) function from the ground up, enabling product teams to deliver faster while maintaining reliability, security, and cost efficiency.

• Set up the core SRE practices: on-call rotations, PagerDuty configurations, SLA/SLO definitions for essential infrastructure services, a runbook library, and observability dashboards that connect site performance to business metrics.

• Complete the migration to AWS multi-account architecture: transition production workloads to an isolated account with no unplanned downtime.

• Produce an audit-ready infrastructure evidence package for SOC 2 Type I: oversee the end-to-end implementation of technical controls.

• Version and publish the Terraform module library (30+ modules) to a private registry, reducing ad hoc Git usage by product teams.

• Implement automated rollback for ECS and Lambda deployments, ensuring production gates are based on successful integration test results.

• Provide monthly cost reports to leadership, including budget anomaly detection, recommendations for savings plans, and spending breakdown by service/team/environment.


⛳️ Requirements

• Over 5 years of experience with production AWS infrastructure and extensive expertise in Terraform.

• Proven hands-on experience building an SRE function from the ground up with complete ownership.

• Familiarity with multi-site organizations where PaaS or microservices are essential.

• Previous ownership of CI/CD pipelines in one or more roles.

• Experience with PagerDuty and establishing on-call rotations.

• More than 5 years of hands-on experience with AWS, Terraform, CI/CD pipeline ownership, and SRE tools (OpenTelemetry, Grafana, PagerDuty, or similar) in a production setting.


🏝️ Benefits

• Profit-sharing bonus

• Competitive benefits
