
Senior Site Reliability Engineer
Posted May 24

This is a fully remote position, open to applicants in Italy.
• Engage in hands-on Reliability & System Engineering by designing, constructing, and managing reliable and scalable systems, defining and monitoring SLOs/SLIs, directly working on production infrastructure, and collaborating closely with software engineers to enhance system design and reliability.
• Focus on Automation, Operations & Incident Response by developing automation for infrastructure and operational workflows to minimize toil and reduce MTTR. Participate in and lead incident responses, as well as conduct blameless post-incident reviews with clear follow-ups implemented in code and tooling.
• Analyze and optimize system performance and cost under the Performance, Capacity & Security domain, providing data, insights, and recommendations for capacity planning, while supporting security best practices through direct involvement in vulnerability remediation and threat mitigation.
• Possess hands-on experience with SRE practices in production environments, with strong expertise in AWS, Kubernetes, networking, DNS, and Infrastructure as Code (Pulumi preferred; Terraform knowledge is a plus).
• Exhibit a strong foundation in Automation & Software Engineering, including proficiency in Python and in-depth knowledge of the Python ecosystem (testing, debugging, packaging), with a consistent focus on clean, well-structured, and maintainable code.
• Demonstrate skills in Reliability, Data & Operations by engaging stakeholders, mentoring others, leading incident responses and root cause analyses (RCAs), enhancing system reliability, and proposing solutions while sharing insights.
• Nice-to-Have: Experience in highly regulated industries (such as Insurance, Banking, Healthcare), managing sensitive data, and supporting secure networking configurations, with familiarity in security technologies like Cloudflare.
• Have a solid understanding of microservices architectures, including their principles and trade-offs.
• Have hands-on experience with Datadog for platform and application monitoring and performance optimization, along with a strong foundation in database structures.
• Work Your Way: Enjoy full flexibility – work from home, the office, or a combination of both. Additionally, work from anywhere for up to 30 days each year.
• Grow with us: Access learning resources, mentorship, and a personalized growth plan tailored to your development.
• Thrive and perform: Benefit from private healthcare, gym discounts, wellbeing programs, and mental health support.
Work Life Group
accesa.eu