This is a fully remote position, open to applicants in Brazil.

📋 Description

• Proficient in cloud computing platforms such as AWS, GCP, OCI, and/or Azure.

• In-depth understanding of Linux and systems administration.

• Experienced in containerization technologies, including Kubernetes (k8s) and Helm.

• Familiarity with infrastructure as code methodologies (e.g., Terraform, Terragrunt, CloudFormation).

• Knowledgeable in CI/CD tools, such as Jenkins, GitHub Actions, and Argo CD.

• Skilled in managing web servers like Apache and Nginx.

• Proficient in scripting languages (Shell and/or Python).

• Holds cloud certifications for AWS, Azure, OCI, and/or GCP.

• Well-versed in using Grafana.

• Previous experience working in high-scale environments.

⛳️ Responsibilities

• Ensure optimal availability, resilience, and performance across production and non-production environments.

• Administer, evolve, and automate cloud infrastructure (AWS, GCP, OCI, and Azure), adhering to best practices for cost efficiency, security, and scalability.

• Design, implement, and manage containerized environments utilizing Kubernetes.

• Develop, version, and maintain infrastructure as code (IaC) using tools such as Terraform, CloudFormation, or equivalent.

• Troubleshoot complex incidents: analyze, diagnose, and resolve them, using root cause analysis to prevent recurrence.

• Implement and enhance monitoring, observability, and alerting mechanisms, suggesting ongoing improvements to system reliability.

• Collaborate with development teams to integrate SRE, DevOps, and reliability engineering practices, fostering a culture of automation and quality.

• Define and monitor KPIs/SLOs/SLIs, ensuring they align with business objectives.

• Propose and spearhead continuous improvement initiatives, process automation, and reduction of operational failures.

🏝️ Qualifications

• Experience with cloud computing (AWS, GCP, OCI and/or Azure).

• Strong knowledge of Linux and systems administration.

• Experience with containerization, Kubernetes (k8s) and Helm.

• Knowledge of infrastructure as code (e.g., Terraform / Terragrunt / CloudFormation).

• Experience with CI/CD tools (e.g., Jenkins, GitHub Actions, Argo CD).

• Experience with web servers (Apache, Nginx).

• Scripting knowledge (Shell and/or Python).

• Cloud certifications (AWS, Azure, OCI and/or GCP).

• Knowledge of Grafana.

• Previous experience in high-scale environments.

Senior Site Reliability Engineer
