Remotery

Site Reliability Engineer – Engineering Productivity

Posted 1 day ago

📋 Description

• Develop, safely and incrementally deploy, and manage critical production systems with a focus on scalability, reliability, observability, performance, and security.

• Oversee, support, and enhance the developer experience across various services.

• Create automation to eliminate repetitive tasks and efficiently manage production systems.

• Actively monitor, respond to, and improve alerts, as well as establish automated alert handling.

• Develop and maintain incident response runbooks.

• Assess platform and infrastructure issues and assist Arista software engineers with their assessments.

• Collaborate with third-party vendor support.

• Produce postmortem documents and devise solutions to prevent incident recurrence.

• Plan and communicate maintenance schedules for production systems.

• Partner with Arista’s product development teams to identify infrastructural challenges that create bottlenecks and limitations in their workflows.

• Design and implement solutions to address these challenges.

• Research and adopt best practices related to infrastructure and platforms to ensure secure, scalable, and fault-tolerant systems.

• Analyze the design and implementation details of open-source systems to improve triage and resolution of issues.


⛳️ Requirements

• A minimum of a BSc in Computer Science or Engineering with 3 years of experience, or an MS in Computer Science or Engineering with 3 years of experience, or equivalent professional experience.

• Proficiency in one or more of Go, Python, or shell scripting, sufficient to develop medium-complexity automation workflows.

• Understanding of Linux (or UNIX) from an administrative and debugging standpoint.

• Practical experience in managing software systems (infrastructure, complex applications, etc.) at scale.

• Experience in server provisioning, especially from storage and networking perspectives.

• Strong analytical and software troubleshooting capabilities.

• Familiarity with infrastructure-as-code practices.


🏝️ Benefits

• Health insurance

• Flexible work arrangements

• Professional development opportunities

• Paid time off
