Remotery

Senior Site Reliability Engineer

Posted 13 hours ago

This is a fully remote position, open to applicants in the Philippines.

📋 Description

• Design, implement, and continuously enhance highly available, scalable, secure, and resilient cloud infrastructure and platform services.

• Define and refine Service Level Indicators (SLIs), Service Level Objectives (SLOs), and operational metrics to achieve measurable reliability outcomes.

• Lead incident response efforts, manage major incidents, perform root cause analysis, and conduct post-incident reviews with a focus on systemic improvements.

• Reduce operational toil through automation, standardization, and the development of self-healing platform capabilities.

• Develop and maintain disaster recovery, backup, failover, and resilience strategies that meet defined Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO).

• Conduct capacity planning, performance analysis, and proactive optimization of infrastructure and application environments.

• Architect, build, and maintain scalable cloud-native infrastructure primarily within AWS environments.

• Develop and manage infrastructure-as-code utilizing tools such as Terraform and CloudFormation.

• Create reusable platform components and shared services that enhance developer productivity and operational consistency.

• Design and maintain comprehensive observability solutions encompassing metrics, logging, tracing, alerting, and dashboards.

• Collaborate with engineering teams to integrate reliability, scalability, performance, and security considerations into the software development lifecycle (SDLC).


⛳️ Requirements

• 5+ years of experience in Site Reliability Engineering, DevOps Engineering, Platform Engineering, or similar infrastructure roles.

• Strong hands-on experience managing production workloads within AWS cloud environments.

• In-depth experience with infrastructure-as-code tools such as Terraform and/or CloudFormation.

• Significant experience in designing and supporting CI/CD pipelines and modern software delivery practices.

• Solid understanding of distributed systems, microservices architecture, networking, and cloud-native technologies.

• Experience in implementing observability and monitoring solutions across complex environments.

• Proficiency in scripting and automation using Python, Bash, or comparable languages.

• Experience in managing production incidents and conducting structured root cause analyses.

• Strong grasp of system reliability, scalability, security, and operational best practices.

• Excellent analytical, troubleshooting, and problem-solving skills.

• Strong communication and stakeholder engagement abilities.

• Ability to thrive in fast-paced, agile, and collaborative engineering environments.


🏝️ Benefits

• Paid time off.

• Remote work options.

• Professional development opportunities.

