Remotery

Senior Site Reliability Engineer

Posted May 6

This is a fully remote position, open to applicants in Illinois.

📋 Description

• Design and enhance both new and existing systems to improve performance, reliability, and scalability.

• Develop, implement, and iterate on CI/CD pipelines.

• Support the management, development, design, and deployment of microservice and containerized applications.

• Establish robust security measures in distributed systems and agents.

• Collaborate with engineers and developers to automate deployments and configurations across different platforms.

• Simplify the complexity of Observability implementation by creating scalable automation.

• Spot opportunities for improvement in observability and related processes.

• Standardize and develop alerts, notifications, and responses for monitoring tools.

• Partner with application teams to integrate Observability into daily operations.

• Take part in post-mortem analyses, provide root-cause insights, and implement the resulting action items.

• Advocate for DevOps best practices within the team.

• Engage in and promote Agile/Scrum methodologies.

• Contribute to the hybrid cloud production containerization service offerings.

• Design and establish standards, policies, and procedures for automation and integrations.

• Collaborate with application subject matter experts to learn our toolsets and recommend/implement new features to optimize operations.


⛳️ Requirements

• Bachelor’s Degree with 7 years of experience; Master’s Degree with 6 years of experience; PhD with 2 years of experience.

• Approach best practices for security as a necessity, not an afterthought.

• Proficient in Cloud Platform administration (AWS, GCP, Azure).

• Familiarity with the pillars of Observability.

• Experience working in high-scale environments with a strong understanding of distributed architectures.

• Knowledge of Agile and DevOps methodologies.

• Familiarity with CI/CD tools (GitHub Actions, Bamboo, Jenkins, Azure DevOps).

• Experience managing Docker workloads using orchestration tools (Kubernetes / Amazon ECS).

• Ability to work both independently and collaboratively on daily tasks.

• Eagerness to learn new concepts and processes swiftly, adapting to a dynamic environment.

• Comfortable administering Linux and Windows environments.

• Preferred: Experience with SPIRE/SPIFFE.

• Direct experience with Terraform and Crossplane.

• Proficient in development tools and scripting languages (Git / Mercurial / Subversion; Python / Elixir / Go).

• Capable of integrating MCP Servers with authorization controls.

• Knowledge of database management systems (NoSQL, Relational Databases, and relevant query languages).

• AWS Cloud Practitioner or Azure AZ-900 Certification is desirable.

• Extensive experience in the design and implementation of serverless architecture solutions.

• Proven experience in deploying containerized applications (Kubernetes, etc.).

• Familiarity with data management and pipeline technologies (Apache Storm, Kafka, Flink, Spark, Hadoop, etc.).

• Previous experience working in an Agile team.

• Strong understanding of observability solutions using OpenTelemetry, Prometheus/Grafana, or similar applications.

• Excellent understanding of distributed system architectures and telemetry.

• Significant experience in deploying and managing large Kubernetes distributed platforms.

• Proficient in GitOps practices and Infrastructure as Code systems (such as Terraform, ArgoCD, Helm).


🏝️ Benefits

• Paid time off (vacation, holidays, sick leave).

• Medical, dental, and vision insurance.

• 401(k) plan for eligible employees.

• Short-term incentive programs.
