
Senior Site Reliability Engineer
Posted May 21

This is a fully remote position, open to applicants in Malaysia.
• Oversee, sustain, and enhance the reliability, availability, and performance of production systems and services.
• Develop and uphold infrastructure as code (IaC), deployment pipelines, and automation to facilitate continuous delivery, scalability, and disaster recovery.
• Address incidents, conduct root-cause analyses, and lead postmortems to ensure that lessons learned are implemented.
• Apply and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
• Collaborate with Engineering, Product, Compliance, and Operations teams to guarantee that infrastructure adheres to reliability, compliance, and security standards.
• Assist with service scaling, database operations, cloud infrastructure (preferably GCP), networking, and microservices orchestration.
• Create documentation for operational runbooks, on-call procedures, and system architecture to aid maintenance, knowledge sharing, and compliance.
• Proficiency in programming or scripting (Go, Python, Bash, or similar) for automation, tooling, and operational tasks.
• Practical experience with cloud infrastructure, particularly Google Cloud Platform (GCP).
• Knowledge of containerization and orchestration (Docker, Kubernetes, or equivalent).
• Experience with infrastructure-as-code tools (Terraform, Cloud Deployment Manager, or similar).
• Familiarity with either FluxCD or ArgoCD for GitOps-based delivery.
• Strong grasp of distributed systems, microservices architecture, and reliability patterns.
• Experience in setting up monitoring, logging, alerting, and observability (e.g., Prometheus, Grafana, ELK, distributed tracing).
• Excellent troubleshooting skills and the ability to respond to incidents under pressure.
• Understanding of backup and disaster recovery strategies, database management, and secure operations.
• Ownership mindset: proactive, responsible, and dedicated to system reliability.
• Strong communication skills: capable of coordinating with both technical and non-technical stakeholders.
• Comfortable working in a fast-paced, early-stage startup environment.
• High integrity, attention to detail, and a passion for fintech and programmable banking systems.
• Competitive salary and meaningful equity with opportunities for growth.