
Platform Operations Engineer
Posted Jun 19

This is a fully remote position open to applicants in the United States.
• Assist in managing and scaling our container ecosystem on Amazon EKS, implement GitOps workflows utilizing ArgoCD, and oversee CI/CD pipelines via GitHub Actions to guarantee swift, consistent, and automated deployments.
• Focus on Reliability - Establish and monitor SLIs and SLOs, lead incident response efforts including on-call rotations, root cause analysis, and post-mortems, while contributing to disaster recovery strategies to maintain high system availability.
• Enhance Observability - Design and sustain our monitoring and logging framework using Datadog, Sentry, and CloudWatch to provide engineering teams with clear insights into system health and performance before issues affect users.
• Influence the Platform's Future - Collaborate on architectural choices, develop internal tools and self-service workflows that simplify platform operations, and significantly contribute to the scaling and evolution of our infrastructure.
• 3+ years of experience in Site Reliability Engineering (SRE), DevOps, or Cloud Infrastructure.
• Proficient in core AWS services (VPC, IAM, EKS, RDS) with a solid understanding of cloud networking and security best practices.
• Expertise in Infrastructure as Code using Terraform, CloudFormation, or Crossplane.
• Skilled in using GitHub and GitHub Actions as a core part of CI/CD and automation processes, not just for version control.
• Experience operating Kubernetes clusters in production and managing application deployments via GitOps workflows (ArgoCD/Flux) and Helm Charts.
• Knowledgeable in observability tools such as Datadog, Sentry, CloudWatch, and Grafana, including the creation of alerts, dashboards, and log pipelines.
• Ability to write robust Python scripts for system integration, infrastructure automation, or custom workflow management.
• Comfortable working autonomously in a remote environment, asking questions when necessary, and maintaining progress without micromanagement.
• Bachelor’s degree in Computer Science, Engineering, or a related field.
• Preferred: Certifications in AWS, Kubernetes, Terraform, or Python.
• Competitive salary with equity options.
• Excellent health care plan choices (Medical, Dental & Vision), along with FSA, DCFSA, & HSA options.
• Company-sponsored disability and life insurance.
• Unlimited paid time off (PTO).
• 401(k) with 4% matching.
• Fully remote work with flexible working hours.
• $750 budget for work-from-home setup.
• Paid biannual in-person company summits.
• Quarterly $150 stipend for coworker meetups.
• Monthly $100 health and wellness benefit.
• Generous paid family leave.
• Annual $1,200 stipend for learning and development.