
Senior Manager – RunOps
Posted 22 hours ago

This is a fully remote position, open to applicants in Illinois.
• Implement the enterprise SRE strategy, which encompasses SLOs, SLIs, error budgets, and reliability roadmaps.
• Set reliability standards and practices for applications, backend services, APIs, data platforms, and AI workloads.
• Foster a culture of reliability-by-design and operational excellence within engineering teams.
• Champion the adoption of AIOps capabilities for proactive issue identification, alert noise reduction, and predictive failure prevention.
• Collaborate with the AI Platform team to incorporate LLMs and ML models into operational workflows, such as log summarization, anomaly detection, and remediation.
• Manage the enterprise observability strategy, focusing on metrics, logs, traces, and user experience monitoring.
• Oversee enterprise incident response, escalation processes, and post-incident learning through blameless postmortems.
• A Bachelor's degree in Computer Science or equivalent professional experience is required; each year of college not completed must be offset by an additional year of experience.
• A minimum of 10 years of experience in production support or service delivery.
• Proven experience collaborating with a managed services vendor.
• ITIL-qualified, with expert-level knowledge of ITIL disciplines.
• Experience in managing third parties and services delivered by third parties.
• Background in Service Management or Support within a large-scale and diverse environment, including incident management and escalation procedures.
• Medical, dental, and vision coverage.
• Paid time off.
• Retirement savings options.
• Wellness programs.