Remotery

Senior SRE

Posted 6 days ago

This is a fully remote position, open to applicants in Brazil.

📋 Description

• Develop and enhance a thorough observability strategy.

• Establish and manage SLIs, SLOs, and Error Budgets.

• Ensure the resilience and scalability of systems.

• Reduce incidents and prevent their recurrence.

• Advance operational architecture on AWS.

• Design automations and self-healing solutions.

• Serve as a technical facilitator for teams.


⛳️ Requirements

• Experience in SRE and Reliability.

• Understanding of distributed systems' resilience.

• Knowledge of self-healing (auto-recovery) mechanisms.

• Proficiency in event-driven scalability.

• Skills in incident management and conducting post-mortems.

• Comprehensive observability experience: logs, traces, and metrics.

• Ability to create custom metrics.

• Familiarity with APM (Application Performance Monitoring).

• Proficiency with tools such as Datadog (or similar).

• Ability to build dashboards and visualization panels.

• Expertise in intelligent monitoring and alerting systems.

• Experience with real-time alerts regarding incidents and budgets, including team communication.

• Knowledge of synthetic testing methodologies.

• Background in reliability management.

• Ability to define and track SLI / SLO / Error Budget.

• Understanding of RTO / RPO concepts.

• A mindset focused on availability and user experience.

• Proficiency in AWS (CloudWatch, X-Ray, ECS/EKS, Lambda).

• Experience with Docker and containerization.

• Knowledge of distributed architecture.

• Familiarity with Infrastructure as Code and automation practices.


🏝️ Benefits

• Health and dental insurance.

• Meal and food allowance.

• Childcare assistance.

• Extended parental leave.

• Partnerships with gyms and health & wellness professionals through Wellhub (Gympass) and TotalPass.

• Profit Sharing (PLR).

• Life insurance.

• Access to a continuous learning platform (CI&T University).

• Discount club.

• Free online platform dedicated to physical and mental health and wellbeing.

• Pregnancy and responsible parenthood course.

• Partnerships with online course platforms.

• Language learning platform.

• And many more.

