Remotery

SRE – AWS/GCP

Posted May 25

This is a fully remote position, open to applicants in Brazil.

📋 Description

• Ensure the availability and reliability of production systems;

• Implement and maintain monitoring, observability, and alerting solutions;

• Respond to incidents, conduct root cause analysis (RCA), and establish remediation plans;

• Automate operational tasks and repetitive processes (Infrastructure as Code);

• Work with CI/CD pipelines to enable secure, continuous deployment;

• Manage and enhance cloud environments (AWS and GCP);

• Apply best practices for resilience, scalability, and fault tolerance;

• Define and monitor SLIs, SLOs, and SLAs;

• Support development teams in creating more resilient applications;

• Conduct capacity planning and cost optimization (basic FinOps);

• Document processes, architectures, and operational playbooks.


⛳️ Requirements

• Experience with cloud environments, particularly AWS and/or GCP;

• Proficiency in Linux/Unix systems;

• Experience with monitoring tools (e.g., Prometheus, Grafana, CloudWatch, Stackdriver);

• Understanding of containers and orchestration (Docker and Kubernetes);

• Experience with Infrastructure as Code (Terraform, CloudFormation, or similar);

• Automation expertise with languages such as Python, Bash, or Go;

• Familiarity with CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, etc.);

• Networking knowledge (VPC, DNS, load balancing);

• Awareness of cloud security (IAM, access policies, best practices);

• Experience with multi-cloud environments;

• Understanding of DevOps practices and agile culture;

• Proficiency with distributed observability tools (OpenTelemetry, Datadog, New Relic);

• Knowledge of service mesh (Istio, Linkerd);

• Experience with messaging systems (Kafka, Pub/Sub, SQS);

• Familiarity with Chaos Engineering practices;

• Understanding of FinOps (cloud cost management).


🏝️ Benefits

• Position is also available for candidates with disabilities (PwD).
