
Backend Engineer – Platform – Stacks
Posted May 25

This is a fully remote position, open to applicants in Spain.
• Design, develop, and manage reconciliation systems, including the SSS backend, to monitor the desired stack state, identify and rectify drift across stack templates, grafana.com state, Hosted Grafana, and the actual configuration of customer stacks.
• Collaborate with teams across SSS, grafana.com, and deployment configurations to maintain reliable, observable, and resilient stack lifecycle workflows.
• Enhance operational efficiency by simplifying deployment processes (e.g., aiming for a single-PR regional SSS deployment) and actively contributing to the Stack Config Reconciliation initiative.
• Oversee the rollout mechanisms for provisioned plugins, dashboards, data sources, Grafana versions, release channels, and stack-level configurations.
• Assist in the rollout of new regions and clusters, including the necessary operational paths to safely bring stacks online in new Grafana Cloud regions.
• Enhance incident response and recovery procedures for stack misalignment, reconciliation failures, plugin rollout challenges, and Hosted Grafana integration issues.
• Collaborate with Product, Hosted Grafana, Infrastructure, Support, and related AppCore teams on stack lifecycle tasks that impact customers.
• Contribute to roadmap planning, technical design, OnCall enhancements, and the long-term simplification of stack operations.
• You will take ownership of the production behavior of the systems you develop. This includes enhancing runbooks, dashboards, alerts, reconciliation safety, rollout controls, and recovery procedures. You should be adept at debugging across service boundaries and making careful adjustments to systems that impact customer stacks.
• You have at least 1 year of experience working fully remotely.
• You have some experience with a SaaS platform and are knowledgeable about common concepts in distributed systems (e.g., scalability, multi-tenancy, high availability).
• You have professional experience with Golang and are open to working with both backend services and application code.
• You are passionate about developer and user experience as well as the quality of the products you develop.
• You have experience participating in the delivery of projects, from initial brainstorming to final delivery to customers.
• You write clean, well-tested code that is understandable, operable, and maintainable by fellow engineers.
• You can manage well-defined tasks, break them down, and execute them iteratively to deliver functional solutions while gathering feedback.
• You are willing to cooperate with various teams and ensure that your work aligns with the needs of other squads and external stakeholders.
• You are familiar with Kubernetes on AWS, GCP, or Azure and have experience with infrastructure-as-code tools (Helm, Terraform, Jsonnet, etc.).
• You have participated in blameless incident response and contributed to post-incident reviews.
• 100% Remote, Global Culture
• Scaling Organization – Engage in meaningful work within a high-growth, dynamically evolving environment.
• Transparent Communication – Anticipate open decision-making processes and regular updates across the company.
• Innovation-Driven – Enjoy the autonomy and support to produce excellent work and explore new ideas.
• Open Source Roots – Our foundation is built on community-driven values that influence our work culture.
• Empowered Teams – We foster a high-trust, low-ego environment that prioritizes outcomes over appearances.
• Career Growth Pathways – Clear opportunities for career development and advancement.
• Approachable Leadership – Transparent executives who are engaged, visible, and relatable.
• Passionate People – Become part of a team of intelligent, supportive individuals who care deeply about their work.
• In-Person Onboarding – We aim for you to thrive from day one alongside your fellow new ‘Grafanistas’ to learn about our operations and culture.
• Balance is Key – We offer a global annual leave policy of 30 days per year, which includes 3 days reserved for Grafana Shutdown Days to enable the team to disconnect fully.