Remotery

Senior Site Reliability Engineer

Posted 8 hours ago

This is a fully remote position, open to applicants in North America.

📋 Description

• Act as a seasoned individual contributor accountable for the availability, performance, and reliability of a large federal enterprise cloud platform that operates continuously.

• Assist in achieving scope, schedule, and delivery objectives while influencing the platform's reliability strategy.

• Define and uphold service level objectives (SLOs), service level indicators (SLIs), and error budgets, and steer the platform toward these targets.

• Design and manage observability through metrics, logging, tracing, and alerting systems.

• Lead incident response and on-call procedures, encompassing escalation, mitigation, and enhancements in time-to-recovery.

• Facilitate blameless postmortems and drive systemic reliability enhancements.

• Develop automation to minimize toil and boost operational efficiency.

• Independently design reliable cloud infrastructure using AWS and Kubernetes (Amazon EKS).

• Create reusable modules and guide engineers on best practices for reliability.

• Present design documents and system diagrams to stakeholders effectively.

• Conduct in-depth technical interviews with prospective candidates.


⛳️ Requirements

• Bachelor's degree and a minimum of 7 years of experience; relevant experience may be considered in lieu of formal education.

• Proven experience in owning reliability (SLOs, observability, incident response) for production systems.

• Proficiency with at least one infrastructure-as-code tool, preferably Terraform.

• Extensive understanding of cloud infrastructure, containerization, and networking concepts.

• Ability to obtain and maintain a U.S. Public Trust / suitability determination.


🏝️ Benefits

• Company-subsidized health, dental, and vision insurance.

• Flexible paid time off (PTO).

• 401(k) plan with employer matching.

• Paid parental leave after one year of employment.

• Employee Assistance Program.
