Remotery

Principal AIOps Engineer

CVS Health · Pennsylvania, US · Full-time · Operations · Lead · $144.2k – $288.4k/year

Posted Jun 21

This is a fully remote position, open to applicants in Pennsylvania.

📋 Description

• Spearhead the AIOps strategy, roadmap, and operational model to significantly reduce MTTR and improve alert quality and overall operational efficiency.

• Take ownership of the observability-to-AIOps pipeline and promote the standardization of telemetry, service health models, and actionable alerting.

• Design and execute event intelligence initiatives, including correlation, deduplication, suppression, anomaly detection, incident clustering, and probable-cause analysis.

• Provide guidance to operations, service owners, and leadership stakeholders; facilitate change enablement, adoption, and value assessment for AIOps.

• Develop AIOps integrations centered around ServiceNow, including event ingestion, alert-to-incident policies, enrichment, and assignment/routing.

• Establish governance for operational AI in collaboration with security, compliance, and operations teams.

• Construct and operationalize agentic AI workflows to assist with incident triage and resolution.

• Enable closed-loop automation and self-healing by linking AIOps detections to orchestrated actions.

• Collaborate with NOC/SOC, infrastructure, and application owners to facilitate the onboarding of services into AIOps.

• Produce enablement materials and mentor teams on AIOps methodologies, agentic AI application, and responsible automation practices.


⛳️ Requirements

• Over 10 years of experience in SRE and production operations supporting highly available services.

• Demonstrated technical leadership: ability to set direction, lead cross-team initiatives, and guide stakeholders through architecture assessments.

• Strong programming/scripting skills (Python preferred) and experience building automation, integrations, and APIs.

• Experience in integrating observability platforms and event sources within hybrid environments (cloud/on-prem) while managing production-grade monitoring/event management at scale.

• Strong familiarity with ServiceNow as an ITSM system of record.

• Ability to build and manage integrations at scale (REST, webhooks, event management) to facilitate automation and ensure auditability.

• Expertise in automation and integration engineering: Python (preferred) for automation and data/ML pipelines, plus experience developing integrations, services, and operational tools.

• Knowledge of AIOps, ITSM/ITOM (ServiceNow), and the agentic AI ecosystem, including observability tools such as Prometheus/Grafana, OpenTelemetry, and ELK/Splunk/Datadog (or equivalents).

• Strong fundamentals in Linux and networking (TCP/IP, DNS, TLS, load balancing) with the ability to troubleshoot distributed systems end to end.

• Excellent communication skills.


🏝️ Benefits

• Medical, dental, and vision coverage.

• Paid time off.

• Retirement savings options.

• Wellness programs.

• Additional resources based on eligibility.
