
Site Reliability Engineer III
Posted May 24

This is a fully remote position, open to applicants in India.
• Deploy, oversee, and scale distributed platforms across various geographic locations
• Design and sustain Kubernetes-based infrastructure for extensive applications
• Create and manage Helm charts to facilitate efficient and repeatable deployments
• Monitor system health using Grafana dashboards and metrics; proactively identify and resolve issues
• Enhance system reliability, performance, and scalability through automation and adherence to best practices
• Manage large-scale deployments and enhance infrastructure to support growth
• Collaborate with development teams to ensure seamless CI/CD processes and production readiness
• Implement observability, alerting, and incident response protocols
• Diagnose production issues and conduct root cause analysis
• Document and maintain runbooks for incident response
• 4–5 years of experience in Site Reliability Engineering, DevOps, or related fields
• Extensive hands-on experience with Kubernetes in production settings
• Proficiency with infrastructure as code and version control (Terraform, Git, etc.)
• Strong expertise in AWS (EKS, VPC, S3, ECR, IAM roles, etc.)
• Solid experience with Helm charts for application deployment
• Proficient in Bash scripting and tooling
• Experience with large-scale distributed systems and high-availability architectures
• Strong understanding of containerization, microservices, and cloud-native ecosystems
• Familiarity with CI/CD pipelines and automation tools
• Strong debugging and problem-solving skills in production environments
• Competitive compensation
• Comprehensive benefits
• Career success on your terms
• Flexible work environment
• Annual wellness and community outreach days
• Continuous recognition for your contributions
• Global collaboration and networking opportunities
Akka (formerly Lightbend)
Swimlane