
Site Reliability Engineer II
Posted May 9

This is a fully remote position, open to applicants in Kentucky and one other state.
• Design and develop reliability solutions for data ingestion, processing, and delivery pipelines.
• Establish and uphold SLIs/SLOs for data licensing services while managing error budgets.
• Create automation for deployment, monitoring, and incident response.
• Improve system observability through metrics, logging, and tracing.
• Build and maintain dashboards and alerts to proactively identify and address issues.
• Engage in on-call rotations and spearhead incident response initiatives.
• Perform root cause analysis and facilitate post-incident enhancements.
• Keep runbooks and operational documentation current.
• Collaborate with software and data engineers to integrate reliability into system design.
• Contribute to blameless postmortems and reliability assessments.
• Share expertise and mentor junior team members.
• A minimum of 2 years of experience in SRE, DevOps, or infrastructure engineering.
• In-depth knowledge of cloud platforms (AWS, GCP, or Azure), container orchestration (Kubernetes), and infrastructure-as-code (Terraform, CloudFormation).
• Proficient with observability tools (e.g., Prometheus, Grafana, Splunk) and CI/CD pipelines.
• Familiarity with data platforms, ETL pipelines, and distributed systems.
• Strong problem-solving and communication abilities.
• Experience with Python, PowerShell, or similar programming languages.
• Active use of artificial intelligence (AI) tools and platforms to optimize workflows, improve decision-making, and drive innovation across business functions.
• Demonstrated curiosity and adaptability in exploring new AI technologies, with a commitment to continuous learning and experimentation.
Preferred Qualifications:
• Experience with data licensing, data governance, or data compliance frameworks.
• Exposure to data pipeline tools (e.g., Apache Airflow, Kafka, Spark).
• Understanding of regulatory requirements related to data usage and distribution.
• Competitive total rewards (base salary + bonus, if applicable).
• Customizable benefits package (3 medical plans with Health Savings Account company match).
• Generous paid time off for non-exempt team members, starting with 3 weeks + 13 paid holidays, including 2 personal floating holidays.
• Flexible time off for exempt team members + 13 paid holidays.
• Paid parental leave (including maternity + paternity leave).
• Education assistance opportunities and free LinkedIn Learning access.
• Free mental health and family planning programs, including adoption assistance and fertility support.
• 401(k) program with company match.
• Pet insurance.
• Employee resource groups.