Remotery

Senior Site Reliability Engineer – Platform & Agentic Operations

Posted 6 days ago

This is a fully remote position, open to applicants in Germany.

📋 Description

• Implement and enhance monitoring, alerting, and incident response systems and processes to guarantee high reliability for our customers and adhere to defined SLOs.

• Design, build, and maintain resilient, scalable infrastructure using SRE principles and best practices.

• Participate in post-incident reviews, identify patterns, and contribute to ongoing improvement initiatives.

• Run performance tests, analyze system bottlenecks, and develop capacity-planning strategies so that our systems meet current and future demand.

• Develop systems in which CI/CD test failures give agents real-time context, enabling them to analyze logs, trace dependencies, and suggest or implement code fixes.


⛳️ Requirements

• Over 6 years of experience in SRE, DevOps, or Platform Engineering.

• Strong understanding and practical application of Site Reliability Engineering (SRE) principles, methodologies, and best practices.

• Proficiency in programming or scripting languages such as Python, Go, or TypeScript.

• Practical experience integrating LLMs into automated workflows, including providing live system state (such as a recent CI test failure) to an agent as actionable context.

• Previous experience in incident management, including conducting post-incident reviews and implementing improvements to prevent future incidents.

• Ability to systematically and effectively troubleshoot complex technical issues.

• Solid experience with a public cloud provider, preferably Google Cloud Platform (GCP), along with a strong understanding of its observability services.

• A proactive mindset for identifying problems, opportunities for enhancement, and performance bottlenecks.

• Excellent communication skills to articulate technical concepts and collaborate efficiently with diverse teams.

• Proficiency in spoken and written English; German is a plus.

• Residency in Germany.


🏝️ Benefits

• Join an international, dynamic, and highly motivated team that has a proven track record of making things happen.

• Your contributions will accelerate the "energy transition," directly impacting our climate.

• Collaborate with and learn from exceptionally talented colleagues.

• Enjoy direct access to key decision-makers.

• Take on a full-time position at one of Europe’s most successful scaleups.

• Work remotely from anywhere in Germany, with offices available in Hamburg, Berlin, and Munich.

• Achieve a healthy work-life balance while enjoying all the advantages of the EGYM Wellpass.

• Access benefits and discounts through Futurebens.

• Enjoy flexibility with our job bike leasing program, whether you prefer a city bike or an e-bike, while contributing positively to the environment.
