
Senior Site Reliability Engineer
Posted Jun 20

This is a fully remote position, open to applicants in Colombia.
• Define and uphold SLIs/SLOs, and oversee how the error budget is allocated and spent.
• Lead incident responses and conduct postmortems, implementing corrective actions.
• Automate operational tasks through tooling (e.g., auto-remediation, scaling rules).
• Develop, enhance, and sustain CI/CD pipelines, including canary deployments and blue/green strategies.
• Facilitate technical discussions with clients to ensure alignment on reliability, scalability, and performance needs.
• Drive continuous platform improvements across the service lifecycle, including architecture, monitoring, and operational processes.
• Implement and expand observability systems (metrics, tracing, log aggregation).
• Optimize performance and cost by fine-tuning cloud services, autoscaling, and resource rightsizing.
• Design, deploy, and manage containerized workloads using Docker and Kubernetes in production.
• Collaborate with development teams to integrate resilience patterns (circuit breakers, bulkheading).
• Engage in architecture discussions focused on high availability and disaster recovery.
• Mentor mid-level and junior SREs; perform reliability design reviews.
• 5–8 years of experience in a reliability or operations position.
• Cloud-agnostic certification: Terraform Associate, Certified Kubernetes Administrator (CKA), or SRE Foundation.
• Cloud provider certification: Professional-level certification in AWS (Solutions Architect), Azure (Solutions Architect Expert), GCP (Professional Cloud Architect), or Oracle Cloud (Architect Professional).
• Strong coding capabilities (Python, Go, or equivalent).
• Experience with Infrastructure as Code (IaC) and CI/CD pipelines.
• Proficient with monitoring/observability stacks (Prometheus, Grafana, OpenTelemetry, ELK, Jaeger).
• Experience in distributed systems and production-scale services.
• Competitive salary and performance-based bonuses.
• Comprehensive health, dental, and vision insurance.
• Flexible work hours and remote working options.
• Professional development and continuous learning opportunities.
• Collaborative and inclusive work environment.
Innovative Solutions