This is a fully remote position, open to applicants in the Philippines.

📋 Description

• Design, implement, and maintain a highly available, scalable, and secure cloud infrastructure.

• Monitor production systems and proactively identify issues related to reliability, performance, and capacity.

• Lead incident response, conduct root cause analysis (RCA), and facilitate post-incident reviews.

• Develop automation solutions aimed at reducing operational overhead and enhancing system reliability.

• Build and improve observability platforms, which include monitoring, logging, tracing, and alerting systems.

• Establish and uphold Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.

• Collaborate with development teams to enhance application resilience, deployment strategies, and overall operational readiness.

• Support CI/CD pipelines and deployment automation initiatives.

• Participate in on-call rotations and engage in production support activities.

• Promote best practices for infrastructure-as-code and platform engineering.

• Ensure adherence to security, governance, and fintech industry standards.

• Contribute to strategies for disaster recovery, business continuity, and high availability.

⛳️ Requirements

• Candidates must be located in the APAC region and available to work during Canadian overnight hours.

• Proficient in English.

• Minimum of 5 years of experience in Site Reliability Engineering, DevOps, Cloud Engineering, Platform Engineering, or similar roles.

• Extensive experience supporting mission-critical production environments.

• Familiarity with cloud platforms such as AWS, Azure, or GCP.

• Hands-on expertise with Kubernetes and containerized environments.

• Strong understanding of Infrastructure as Code, preferably with Terraform.

• Experience in building and maintaining CI/CD pipelines.

• Proficient in scripting and automation using Python, Bash, or Go.

• Solid understanding of Linux systems administration.

• Experience with monitoring and observability tools like Prometheus, Grafana, Datadog, New Relic, Splunk, ELK, or OpenTelemetry.

• Familiarity with networking fundamentals including DNS, load balancing, TLS/SSL, VPNs, and firewalls.

• Proven experience managing production incidents and performing root cause analysis.

• Strong communication skills with the ability to collaborate across distributed teams.

🏝️ Benefits

• Work from anywhere with genuine flexibility and freedom.

• Earn in USD with a compensation package that reflects your expertise.

• Recharge with confidence thanks to dedicated paid time off.

• Advance your career through fully covered international certifications.

• Enjoy access to coworking spaces worldwide whenever you need a professional setup.

• Enhance your English skills while broadening your global reach.

• Connect and engage in enjoyable activities that bring our international team together.

• Feel valued with personalized gifts and a thoughtful welcome kit.

• Contribute to our community and earn through our referral program.

Senior Site Reliability Engineer (SRE) – APAC
