Remotery

Senior Site Reliability Engineer

Posted May 21

This is a fully remote position, open to applicants in Malaysia.

📋 Description

• Oversee, sustain, and enhance the reliability, availability, and performance of production systems and services.

• Develop and maintain infrastructure as code (IaC), deployment pipelines, and automation to support continuous delivery, scalability, and disaster recovery.

• Respond to incidents, conduct root-cause analyses, and lead postmortems so that lessons learned are implemented.

• Apply and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.

• Collaborate with Engineering, Product, Compliance, and Operations teams to ensure that infrastructure meets reliability, compliance, and security standards.

• Assist with service scaling, database operations, cloud infrastructure (preferably GCP), networking, and microservices orchestration.

• Create documentation for operational runbooks, on-call procedures, and system architecture to aid maintenance, knowledge sharing, and compliance.


⛳️ Requirements

• Proficiency in programming or scripting (Go, Python, Bash, or similar) for automation, tooling, and operational tasks.

• Practical experience with cloud infrastructure, particularly Google Cloud Platform (GCP).

• Knowledge of containerization and orchestration (Docker, Kubernetes, or equivalent).

• Experience with infrastructure-as-code tools (Terraform, Cloud Deployment Manager, or similar).

• Familiarity with either FluxCD or ArgoCD for GitOps-based delivery.

• Strong grasp of distributed systems, microservices architecture, and reliability patterns.

• Experience in setting up monitoring, logging, alerting, and observability (e.g., Prometheus, Grafana, ELK, distributed tracing).

• Excellent troubleshooting skills and the ability to respond to incidents under pressure.

• Understanding of backup and disaster recovery strategies, database management, and secure operations.

• Ownership mindset: proactive, responsible, and dedicated to system reliability.

• Strong communication skills, including the ability to coordinate with both technical and non-technical stakeholders.

• Comfortable working in a fast-paced, early-stage startup environment.

• High integrity, attention to detail, and a passion for fintech and programmable banking systems.


🏝️ Benefits

• Competitive salary and meaningful equity with opportunities for growth.
