
Senior Manager, Monitoring & Observability
Posted 4 hours ago

• Define and take ownership of the enterprise observability and event management strategy, roadmap, and success metrics (e.g., MTTR reduction, alert quality, incidents captured by monitoring).
• Lead and provide mentorship to multiple monitoring and alerting teams across infrastructure, application, network, and cloud observability platforms.
• Establish governance, standards, and operational models for monitoring, alerting, and event management throughout the enterprise.
• Act as the senior escalation point for observability-related issues during significant incidents and platform outages.
• Oversee and provide strategic direction for enterprise monitoring and observability platforms.
• Drive the integration and alignment of tools to minimize silos, eliminate duplicate alerts, and facilitate unified visibility.
• Ensure effective lifecycle management of observability tools, maximizing returns on existing investments.
• Lead engineering initiatives to enhance event correlation, enrichment, and service-level context (shifting from alert-based monitoring to outcome-based observability).
• Advocate for automation and intelligence in detection, correlation, and triage, including capabilities driven by AIOps.
• Collaborate with architecture and engineering teams to incorporate observability standards into application, cloud, and platform designs.
• Enhance the quality of telemetry (metrics, logs, traces, events) and ensure data is usable for troubleshooting, trend analysis, and leadership reporting.
• Improve the alert signal-to-noise ratio and mitigate chronic alert fatigue through standards, tuning, and correlation strategies.
• Ensure that monitoring effectively supports incident response, enabling rapid root cause identification and resolution.
• Define and monitor KPIs and operational health metrics for observability platforms and teams.
• Foster continuous improvement through post-incident reviews, trend analysis, and proactive gap identification.
• Serve as a trusted partner to infrastructure, application, cloud, and operations leadership.
• Align observability priorities with business outcomes and service reliability objectives.
• Manage vendor relationships and influence product roadmaps based on enterprise requirements.
• 10+ years of experience in IT operations, monitoring, observability, or reliability engineering, with a significant portion in a leadership or management capacity.
• Strong hands-on engineering background in enterprise monitoring and event management systems.
• Proven experience in transitioning organizations from reactive monitoring to proactive, correlated event management or observability models.
• Experience in leading teams responsible for large-scale, multi-tool monitoring environments.
• Comprehensive understanding of incident management, ITSM integration, and service health models.
• Exceptional communication skills, with the capacity to translate technical data into executive-level insights.
• Health and Welfare Benefits: Our health and welfare benefits can be customized to meet your and your family's needs and commence on your first day of employment.
• Retirement Savings: We support you in saving for your future.
• Employee Discounts: Access a wide array of global, national, and local discounts on merchandise, services, travel, and more.
• Career Growth Opportunities: We are committed to your success, providing opportunities for career advancement within our extensive portfolio of businesses and global reach.
• Paid Training: Earn while you learn and continue to advance with access to award-winning learning platforms throughout your Conduent career.
• Paid Time Off: We offer attractive paid time off packages designed for you to enjoy your life outside of work.
• Great Work Environment: We take pride in our award-winning culture and the recognition we’ve received for our diversity initiatives.