Remotery

DevOps Reliability Engineer

Posted 10 hours ago

This is a fully remote position, open to applicants in Australia.

📋 Description

• Oversee and enhance the health, availability, performance, and cost efficiency of production systems hosted on Azure.

• Utilize telemetry from applications, databases, and infrastructure to pinpoint performance challenges, bottlenecks, and reliability threats.

• Adjust Azure services and platform settings to optimize performance, resilience, and resource utilization.

• Collaborate with engineering teams to propose and execute practical, data-driven enhancements to reliability, scalability, and operational efficiency.

• Develop and maintain operational documentation, runbooks, and troubleshooting guides to facilitate consistent incident response and ongoing operations.

• Assist Tech Support and Sustained Engineering by executing authorized SQL queries and conducting database backups and restores for troubleshooting purposes.

• Assess how partner integrations and customer usage trends influence system performance and cloud expenditure.

• Investigate complex production issues, perform root cause analysis, and lead the resolution of reliability and performance problems.

• Contribute to the ongoing enhancement of deployment processes, system stability, and operational readiness.

• Carry out other job-related tasks and responsibilities as assigned.


⛳️ Requirements

• Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent experience.

• 8+ years of experience in DevOps, Site Reliability Engineering, Cloud Engineering, or comparable roles.

• Extensive hands-on experience with Microsoft Azure, particularly Azure SQL, Azure Functions, Azure App Services, and Azure Containers (AKS, Container Apps, or similar).

• Proficient in reading and interpreting telemetry, logs, metrics, and resource usage data, and articulating issues along with solutions.

• Experience with production systems that demand high availability and reliability.

• Comfortable managing work from start to finish, from identifying issues to implementing improvements.

• Familiarity with adjusting pipelines, hosting configurations, and deployment workflows.

• Strong understanding of cloud cost drivers and usage optimization.

• Excellent problem-solving abilities and the capacity to work collaboratively with engineering and support teams.

• Ability to read and interpret application code to assist with troubleshooting, root cause analysis, and identifying opportunities for performance improvements.


🏝️ Benefits

• Wellness Benefits

• Opportunities for Professional Growth and Development

• Flexible Remote Work

• Volunteer Time Off

• Study Leave

• Employee Assistance Program
