
Site Reliability Engineer
Posted 1 hour ago

• Design, build, and maintain scalable, reliable systems on GCP (Compute Engine, GKE, Cloud Storage, Cloud SQL)
• Develop automation for infrastructure provisioning utilizing Terraform, Ansible, or Deployment Manager
• Establish and manage observability platforms (monitoring, logging, tracing) using tools like Cloud Monitoring (formerly Stackdriver), Prometheus, or Grafana
• Oversee incident response, carry out postmortems, and implement enhancements to mitigate recurrence
• Collaborate with DevOps and engineering teams to improve CI/CD pipelines for robust deployments
• Define and track SLAs, SLOs, and SLIs to ensure application availability and performance
• Implement disaster recovery (DR) and backup strategies across cloud services
• Continuously refine performance, capacity, and cost-efficiency of GCP resources
• Bachelor's degree in Computer Science, Engineering, or a related discipline
• Over 3 years of practical experience as a Site Reliability Engineer, DevOps Engineer, Systems Engineer, or Cloud Infrastructure Engineer, with a proven history of managing production-grade systems on Google Cloud Platform (GCP) or other cloud environments
• Solid understanding of Linux/Unix system administration, networking, and troubleshooting
• Experience in implementing Infrastructure as Code (IaC) with tools such as Terraform, Ansible, or Deployment Manager
• Familiarity with containerization and orchestration technologies like Docker and Kubernetes (GKE)
• Proficient with monitoring and observability tools (Google Cloud Operations Suite, Prometheus, Grafana, Datadog, ELK)
• Experience in defining and monitoring SLAs, SLOs, and SLIs to ensure application uptime and performance
• Demonstrated ability to manage incident response, conduct postmortems, and perform root cause analysis
• Proficiency in at least one scripting language (Python, Bash, or Go) for automation and tooling
• Hands-on experience in building or managing CI/CD pipelines (Jenkins, GitLab CI, Cloud Build)
• Strong background in configuration management and release automation
• Knowledge of IAM (Identity and Access Management), network security, and cloud compliance controls
• Familiarity with disaster recovery (DR), backups, and high-availability design
• Strong written and spoken English communication skills
• Comprehensive and affordable medical, dental, vision, and life insurance options
• Competitive Provident Fund contributions
• Paid time off and holidays
• Mental health support and wellbeing program
• Company-provided equipment and a one-time $250 USD work from home stipend
• $750 USD annual professional development budget
• Company rewards and recognition program
• And more!
PandaDoc