Remotery

Senior Site Reliability Engineer (SRE)

Posted 1 day ago

This is a fully remote position, open to applicants in Brazil.

📋 Description

• Design, implement, and enhance Site Reliability Engineering practices within production environments.

• Define, oversee, and continually enhance Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets.

• Lead and participate in incident response and incident command processes.

• Develop and refine observability strategies, including monitoring, logging, alerting, and distributed tracing.

• Enhance system reliability, availability, scalability, and operational efficiency.

• Collaborate with engineering teams to improve application performance and production readiness.

• Create automation solutions that minimize operational overhead and enhance reliability.

• Engage in root cause analysis and conduct post-incident reviews.

• Drive continuous improvement initiatives grounded in operational insights and lessons learned from incidents.

• Assist in establishing reliability best practices across teams and services.


⛳️ Requirements

• Over 5 years of professional experience in Site Reliability Engineering, DevOps, or Production Engineering roles.

• Solid understanding of Site Reliability Engineering principles and best practices.

• Experience in supporting and managing production systems at scale.

• Strong knowledge of monitoring, observability, and reliability engineering concepts.

• Experience in cloud-based environments.

• Excellent troubleshooting and problem-solving abilities.

• Experience with distributed systems and contemporary application architectures.

• Proven track record in Site Reliability Engineering.

• Experience in defining and managing Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets.

• Experience in leading or actively participating in Incident Command and Incident Response processes.

• Experience in designing and implementing observability strategies.

• Hands-on experience with monitoring, logging, alerting, and distributed tracing.

• Experience in enhancing system reliability, availability, and operational excellence.

• Experience in supporting mission-critical production environments.

• Familiarity with cloud platforms (AWS preferred).

• Strong automation mindset.

• Experience in conducting root cause analysis and postmortems.

• Experience with Kubernetes.

• Experience with Terraform or Infrastructure as Code.

• CI/CD pipeline experience.

• Familiarity with containerized environments.

• Experience with distributed microservices architectures.

• Background in performance engineering.

• Experience mentoring engineers on reliability practices.

• Multi-cloud experience.

• Experience in highly regulated or high-availability environments.


🏝️ Benefits

• Home office option.

• Competitive compensation based on experience.

• Career development plans to support significant growth within the company.

• Opportunities to work on international projects.

• Oowlish English Program (Technical and Conversational).

• Oowlish Fitness with Total Pass.

• Engaging games and competitions.
