Remotery

Senior Site Reliability Engineer

Posted 1 day ago

This is a fully remote position, open to applicants in New Mexico.

📋 Description

• Collaborate with software developers, platform engineers, and IT personnel to enhance system design, operational efficiency, deployment safety, and readiness for production support.

• Establish and uphold operational standards, and create runbooks, support procedures, escalation pathways, and service-level objectives.

• Assess system architecture and modifications to ensure they meet functional requirements while balancing service quality, reliability, security, and compliance needs.

• Promote ongoing improvements in platform stability, maintenance, and availability.

• Deliver advanced technical support and troubleshooting for intricate platform and service issues impacting internal users and stakeholders.


⛳️ Requirements

• Over 8 years of experience in Site Reliability Engineering, DevOps, Platform Engineering, Systems Engineering, or similar infrastructure roles that support production services.

• Extensive experience in Linux systems administration and troubleshooting within enterprise environments.

• Significant experience in operating and maintaining on-prem Kubernetes platforms and all related components, including CRI, CNI, and CSI plugins.

• Proficient in deploying and managing applications on Kubernetes using tools such as Helm and Kustomize.

• Familiarity with DevOps tools like GitLab, Artifactory, Jira, and Confluence.

• Experience with GitOps tools such as FluxCD or ArgoCD.

• Skilled in scripting with at least one of the following: Python, Go, or Bash.

• Strong background in designing, maintaining, and enhancing observability tools including monitoring, dashboards, logging, tracing, and supporting SLOs.

• Solid understanding of reliability engineering principles: service health indicators, high availability design, failure reduction and testing, operational readiness practices (documentation development, runbooks, and architectural descriptions), incident response, root cause analysis, and remediation/recovery.

• Ability to obtain a security clearance, which requires U.S. citizenship.


🏝️ Benefits

• Equal Opportunity Employer/Protected Veterans/Individuals with Disabilities
