
Staff Site Reliability Engineer – Production Engineering
Posted 1 day ago

This is a fully remote position, open to applicants in Canada.
• Develop and enhance Dropbox’s comprehensive technical reliability strategy to accommodate the evolving engineering landscape shaped by AI-driven and autonomous software development.
• Establish multi-year reliability objectives, standards, and plans encompassing observability, debugging, incident management, service health, and operational readiness.
• Oversee cross-team projects that reduce reliability risk as software delivery speed, pull request frequency, service complexity, and incident volume grow.
• Collaborate with engineering leaders and platform teams to enhance monitoring, alerting, debugging, SLOs, SLAs, and incident response frameworks at a company-wide level.
• Recognize emerging reliability challenges presented by AI-enabled development processes and design scalable systems, procedures, and safeguards to address them.
• Offer technical guidance and mentorship to engineers across various teams, enhancing engineering quality, reliability insight, and operational excellence.
• Foster clear communication and alignment with senior stakeholders regarding reliability priorities, trade-offs, risks, and progress in execution.
• Bachelor’s degree in Computer Science or a related technical discipline involving programming (e.g., physics or mathematics), or equivalent technical expertise.
• Over 12 years of experience in software engineering, site reliability engineering, infrastructure engineering, or similar technical positions.
• Demonstrated capability to define and implement multi-year, multi-team reliability, infrastructure, or platform strategies that yield measurable business and customer benefits.
• Extensive experience with distributed systems, production operations, observability, incident response, SLOs/SLAs, debugging, and reliability risk management.
• Proven ability to identify complex technical issues, debug production systems, automate operational workflows, and create resilient software components.
• Experience in influencing engineering roadmaps across various teams and making technical decisions that benefit the wider engineering organization.
• Excellent communication and collaboration skills, with the capacity to align cross-functional stakeholders amid uncertainty and drive execution across teams.
• Health insurance
• Retirement plans
• Paid time off
• Flexible work arrangements
• Professional development opportunities