Remotery

Manager, Site Reliability Engineering

Posted Jun 21

This is a fully remote position, open to applicants in the United States.

📋 Description

• Lead and expand the Site Reliability Engineering (SRE) team.

• Enhance reliability, performance, and system availability.

• Focus on operational intelligence and AI-driven operations.

• Improve platform efficiency while fostering stakeholder trust.


⛳️ Requirements

• Over 10 years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or similar production operations roles.

• At least 4 years of direct experience in people management, including hiring, performance evaluation, career development, and managing remote on-call teams.

• Proven accountability for reliability outcomes in customer-facing SaaS at scale, including defining and implementing SLOs/SLIs/error budgets to inform engineering priorities.

• Extensive experience with Azure—3+ years managing production workloads on Azure, with hands-on expertise in AKS, networking, identity, and platform services. Equivalent experience in AWS or GCP will also be considered.

• Proficient in modern observability tools with production-grade experience in Datadog (or similar tools like New Relic, Dynatrace, AppDynamics) across metrics, logs, traces, Real User Monitoring (RUM), and synthetics.

• Experience with AI in operational contexts—hands-on integration of AI/LLM-assisted tools into workflows for incident summarization, runbook creation, log analysis, anomaly triage, and change risk assessment.

• Incident management experience—demonstrated capability to oversee severity-1 incidents from start to finish, conduct blameless post-mortems, and implement systemic improvements based on insights gained.

• Familiarity with regulated environments: treats compliance with HIPAA, PHI handling, SOC 2, or equivalent standards as a fundamental operating principle.

• Exceptional communication skills—ability to convey reliability efforts in terms of business outcomes to executive, product, and customer-facing stakeholders.

• Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related discipline, or a suitable combination of education, training, and experience.


🏝️ Benefits

• Complimentary premium medical, dental, life, and vision insurance.

• Competitive 401(k) matching.

• Additional benefits offered to eligible employees, as required by applicable laws, including reimbursements and discretionary bonuses.

• Paid sick leave in compliance with all relevant federal, state, and local regulations, accrued at one hour of paid sick leave for every 30 hours worked. If any statement here conflicts with local paid sick leave laws, the local law takes precedence.

• Celebratory events! We achieve our goals and reward ourselves.

• Company-sponsored virtual events, happy hours, and team-building activities are always planned, plus you receive a special treat on your birthday!

• Unlimited DTO—because we believe in taking time off!

• Daily virtual yoga, meditation, or boot camp classes available.
