
Senior Manager – RunOps
Posted May 10

This is a fully remote position, open to applicants in Illinois.
• Implement the enterprise SRE strategy, encompassing SLOs, SLIs, error budgets, and reliability roadmaps.
• Set reliability standards and practices across applications, backend services, APIs, data platforms, and AI workloads.
• Foster a culture of reliability-by-design and operational excellence within engineering teams.
• Lead the adoption of AIOps capabilities for proactive issue identification, alert-noise reduction, and failure prediction and prevention.
• Collaborate with the AI Platform team to incorporate LLMs and ML models into operational workflows, such as log summarization, anomaly detection, and remediation.
• Oversee the enterprise observability strategy focusing on metrics, logs, traces, and user experience monitoring.
• Direct the enterprise incident response, escalation processes, and post-incident learning initiatives (blameless postmortems).
• A Bachelor's degree in Computer Science or equivalent work experience is required; one additional year of experience is necessary for each year of college not completed.
• Over 10 years of production support or service delivery experience.
• Experience collaborating with a managed services vendor.
• ITIL-qualified, with expert knowledge of ITIL disciplines.
• Proven experience managing third-party vendors and the services they deliver.
• Experience in Service Management or Support within a large-scale, diverse environment, including incident management and escalation procedures.
• Medical, dental, and vision coverage
• Paid time off
• Retirement savings options
• Wellness programs