
SRE – AWS/GCP
Posted May 25

This is a fully remote position, open to applicants in Brazil.
Responsibilities:
• Ensure the availability and reliability of production systems;
• Implement and maintain monitoring, observability, and alerting solutions;
• Respond to incidents, conduct root cause analysis (RCA), and define remediation plans;
• Automate operational tasks and repetitive processes (Infrastructure as Code);
• Contribute to CI/CD pipelines for secure, continuous deployment;
• Manage and enhance cloud environments (AWS and GCP);
• Apply best practices for resilience, scalability, and fault tolerance;
• Define and monitor SLIs, SLOs, and SLAs;
• Support development teams in creating more resilient applications;
• Conduct capacity planning and cost optimization (basic FinOps);
• Document processes, architectures, and operational playbooks.
Requirements:
• Experience with cloud environments, particularly AWS and/or GCP;
• Proficiency in Linux/Unix systems;
• Experience with monitoring tools (e.g., Prometheus, Grafana, CloudWatch, Stackdriver);
• Understanding of containers and orchestration (Docker and Kubernetes);
• Experience with Infrastructure as Code (Terraform, CloudFormation, or similar);
• Automation expertise with languages such as Python, Bash, or Go;
• Familiarity with CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, etc.);
• Networking knowledge (VPC, DNS, load balancing);
• Awareness of cloud security (IAM, access policies, best practices);
• Experience with multi-cloud environments;
• Understanding of DevOps practices and agile culture;
• Proficiency with distributed observability tools (OpenTelemetry, Datadog, New Relic);
• Knowledge of service mesh (Istio, Linkerd);
• Experience with messaging systems (Kafka, Pub/Sub, SQS);
• Familiarity with Chaos Engineering practices;
• Understanding of FinOps (cloud cost management).
• This position is also open to candidates with disabilities (PwD).
Work Life Group