Remotery

DevOps Reliability Engineer

Posted Jun 20

This is a fully remote position, open to applicants in the United Kingdom.

📋 Description

• Oversee and enhance the health, availability, performance, and cost-effectiveness of production systems based on Azure.

• Utilize application, database, and infrastructure telemetry to pinpoint performance challenges, bottlenecks, and reliability threats.

• Optimize Azure services and platform settings to achieve maximum performance, resilience, and resource efficiency.

• Collaborate with engineering teams to suggest and implement actionable, data-informed enhancements to reliability, scalability, and operational efficiency.

• Develop and uphold operational documentation, runbooks, and troubleshooting guides to ensure consistent incident response and ongoing operations.

• Assist Tech Support and Sustained Engineering by running approved SQL queries and performing database backups and restorations for troubleshooting purposes.

• Evaluate the impact of partner integrations and customer usage patterns on system performance and cloud expenditures.

• Investigate intricate production issues, conduct root cause analysis, and facilitate the resolution of reliability and performance challenges.

• Contribute to continuous enhancement in deployment procedures, system stability, and operational preparedness.

• Carry out additional job-related tasks and responsibilities as assigned.


⛳️ Requirements

• Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent experience.

• Over 8 years of experience in DevOps, Site Reliability Engineering, Cloud Engineering, or comparable positions.

• Extensive hands-on experience with Microsoft Azure, particularly Azure SQL, Azure Functions, Azure App Service, and Azure container services (AKS, Container Apps, or similar).

• Proficient in reading and interpreting telemetry, logs, metrics, and resource usage data, with the ability to diagnose issues and propose solutions.

• Experience working with production systems that demand high availability and reliability.

• Comfortable owning tasks end to end, from identifying an issue to implementing the improvement.

• Familiarity with adjusting pipelines, hosting configurations, and deployment workflows.

• Solid understanding of cloud cost determinants and strategies for usage optimization.

• Strong problem-solving abilities and the capability to collaborate effectively with both engineering and support teams.

• Ability to read and interpret application code to aid in troubleshooting, root cause analysis, and the identification of performance enhancement opportunities.


🏝️ Benefits

• Wellness Benefits

• Opportunities for Professional Growth and Development

• Flexible Remote Work

• Volunteer Time Off

• Study Leave

• Employee Assistance Program
