
Senior Site Reliability Engineer
Posted May 20

This is a fully remote position, open to applicants in Brazil.
Requirements:
• Proficient in cloud computing platforms such as AWS, GCP, OCI, and/or Azure.
• In-depth understanding of Linux and systems administration.
• Experienced in containerization technologies, including Kubernetes (k8s) and Helm.
• Familiarity with infrastructure as code methodologies (e.g., Terraform, Terragrunt, CloudFormation).
• Knowledgeable in CI/CD tools, such as Jenkins, GitHub Actions, and Argo CD.
• Skilled in managing web servers like Apache and Nginx.
• Proficient in scripting languages (Shell and/or Python).
• Holds cloud certifications for AWS, Azure, OCI, and/or GCP.
• Well-versed in using Grafana.
• Previous experience working in high-scale environments.
Responsibilities:
• Ensure optimal availability, resilience, and performance across production and non-production environments.
• Administer, evolve, and automate cloud infrastructure (AWS, GCP, OCI, and Azure), adhering to best practices for cost efficiency, security, and scalability.
• Design, implement, and manage containerized environments utilizing Kubernetes.
• Develop, version, and maintain infrastructure as code (IaC) using tools such as Terraform, CloudFormation, or equivalent.
• Troubleshoot complex incidents: analyze, diagnose, and resolve them, using root cause analysis to prevent recurrence.
• Implement and enhance monitoring, observability, and alerting mechanisms, suggesting ongoing improvements to system reliability.
• Collaborate with development teams to integrate SRE, DevOps, and reliability engineering practices, fostering a culture of automation and quality.
• Define and monitor KPIs/SLOs/SLIs, ensuring they align with business objectives.
• Propose and spearhead continuous improvement initiatives, process automation, and reduction of operational failures.