
Senior Site Reliability Engineer – SRE
Posted Jun 19

This is a fully remote position, open to applicants in Spain.
• Drive Operational Excellence: Design, implement, and maintain highly available, scalable, and resilient systems that provide an outstanding customer experience.
• Datadog Expert: Serve as one of the key experts for Datadog, responsible for defining and executing best practices.
• Software Development for Reliability: Create robust, well-tested, and maintainable software to automate operational tasks.
• Toil Reduction Champion: Identify and eliminate toil through automation and process enhancements.
• Incident Management & Post-Mortems: Lead blameless post-mortems and contribute to the incident response framework.
• Reliability Metrics & Goals: Collaborate to define, implement, and monitor Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets.
• Infrastructure as Code: Utilize and contribute to infrastructure as code initiatives.
• System Design & Architecture: Provide SRE expertise during system design reviews.
• Knowledge Sharing & Mentorship: Document processes and share expertise with the team.
• Proven experience in operating and enhancing production systems at scale in an SRE, Production Engineering, or Platform Engineering capacity.
• Demonstrated ability to quickly develop accurate mental models of complex distributed systems across infrastructure, applications, networking, identity, and observability domains.
• Strong troubleshooting capabilities with a methodical, evidence-based approach to incident response and root cause analysis.
• Experience in defining, implementing, and utilizing Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets to inform reliability decisions.
• Exceptional written and verbal communication skills, with the capability to articulate complex technical issues clearly to both technical and non-technical audiences.
• Flexible work arrangements.
• Professional development opportunities.
• Continuous improvement culture.
• Mentorship opportunities.