
Senior Site Reliability Engineer (SRE) – APAC
Posted 6 days ago

This is a fully remote position, open to applicants in the Philippines.
• Design, implement, and maintain a highly available, scalable, and secure cloud infrastructure.
• Monitor production systems and proactively identify issues related to reliability, performance, and capacity.
• Lead incident response, conduct root cause analysis (RCA), and facilitate post-incident reviews.
• Develop automation solutions aimed at reducing operational overhead and enhancing system reliability.
• Build and improve observability platforms, including monitoring, logging, tracing, and alerting systems.
• Establish and uphold Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
• Collaborate with development teams to enhance application resilience, deployment strategies, and overall operational readiness.
• Support CI/CD pipelines and deployment automation initiatives.
• Participate in on-call rotations and production support.
• Promote best practices for infrastructure-as-code and platform engineering.
• Ensure adherence to security, governance, and fintech industry standards.
• Contribute to strategies for disaster recovery, business continuity, and high availability.
• Candidates must be located in the APAC region and available to work during Canadian overnight hours.
• Proficient in English.
• Minimum of 5 years of experience in Site Reliability Engineering, DevOps, Cloud Engineering, Platform Engineering, or similar roles.
• Extensive experience supporting mission-critical production environments.
• Familiarity with cloud platforms such as AWS, Azure, or GCP.
• Hands-on expertise with Kubernetes and containerized environments.
• Strong understanding of Infrastructure as Code, preferably with Terraform.
• Experience in building and maintaining CI/CD pipelines.
• Proficient in scripting and automation using Python, Bash, or Go.
• Solid understanding of Linux systems administration.
• Experience with monitoring and observability tools like Prometheus, Grafana, Datadog, New Relic, Splunk, ELK, or OpenTelemetry.
• Familiarity with networking fundamentals including DNS, load balancing, TLS/SSL, VPNs, and firewalls.
• Proven experience managing production incidents and performing root cause analysis.
• Strong communication skills with the ability to collaborate across distributed teams.
• Work from anywhere with genuine flexibility and freedom.
• Earn in USD with a compensation package that reflects your expertise.
• Recharge with confidence thanks to dedicated paid time off.
• Advance your career through fully covered international certifications.
• Enjoy access to coworking spaces worldwide whenever you need a professional setup.
• Enhance your English skills while broadening your global reach.
• Connect and engage in enjoyable activities that bring our international team together.
• Feel valued with personalized gifts and a thoughtful welcome kit.
• Contribute to our community and earn through our referral program.