
Site Reliability Engineer (SRE) – Mid-Level/Senior
Posted May 19

This is a fully remote position, open to applicants in Brazil.
Responsibilities:
• Define and enhance Site Reliability Engineering (SRE) practices.
• Work on improving the availability, performance, and resilience of systems.
• Implement and advance observability strategies (logs, metrics, and tracing).
• Lead initiatives for operational automation (AIOps / Infrastructure as Code).
• Participate in the analysis of critical incidents (P1/P2), including post-mortems and root cause analysis (RCA).
• Define and monitor SLIs, SLOs, and SLAs.
• Reduce manual operational effort (toil) through automation.
• Collaborate with development, cloud, and support teams.
• Help build a culture of reliability and production engineering.
Requirements:
• Solid experience in SRE, DevOps, or Platform Engineering.
• Strong knowledge of cloud environments (GCP, AWS, or Azure).
• Experience with monitoring/observability tools (e.g., Prometheus, Grafana, Datadog).
• Familiarity with automation and Infrastructure as Code (Terraform, Ansible, etc.).
• Understanding of distributed systems and scalable architecture.
• Experience in analyzing and managing critical incidents.
• Knowledge of CI/CD pipelines.
Benefits:
• 100% remote work.
• Opportunities for professional development.
• Inclusive and diverse work environment.