
Senior Platform Engineer
Posted 20 hours ago

This is a fully remote position, open to applicants in California.
• Redesign and define the cloud infrastructure on AWS and Azure.
• Transition all current AWS and Azure setups to OpenTofu/Terraform and Ansible; establish module standards, remote state management, and GitOps-driven plan/apply pipelines.
• Evaluate the cloud infrastructure against the AWS and Azure Well-Architected Frameworks; create a remediation backlog and ensure its completion across networking, IAM, landing zones, account structure, and cost governance.
• Implement policy-as-code (OPA/Conftest, AWS SCPs, Azure Policy) to enforce security measures, tagging practices, and compliance guidelines at the platform level.
• Develop and maintain reusable Terraform modules for compute, networking, storage, databases, and identity management.
• Establish FinOps standards: tagging taxonomy, cost allocation dashboards, and recommendations for rightsizing.
• Design and implement a comprehensive observability stack.
• Set SLIs and SLOs for all shared platform services and critical applications.
• Initiate the SRE practice from the ground up: create incident runbooks, post-incident review templates, and conduct at least one chaos engineering exercise.
• Collaborate with engineering teams to instrument their services and develop operational dashboards.
• Launch and manage a developer portal as the single point of access for developers.
• Create and sustain golden paths for the most frequent developer workflows.
• Oversee the CI/CD platform layer: standardized pipeline templates and reusable workflow libraries.
• Work alongside peer managers and teams to plan and facilitate the migration of existing workloads to the platform.
• Over 5 years of experience in platform, infrastructure, or DevOps engineering with direct production accountability on AWS and/or Azure.
• Extensive expertise in OpenTofu/Terraform: module creation, state management, workspace strategy, remote backends, and CI/CD integration; knowledge of Terramate is a plus.
• Proficient in Kubernetes operations: EKS and/or AKS cluster lifecycle, Helm, admission controllers, RBAC, network policies, and autoscaling.
• Practical experience in observability with two or more of the following: Prometheus, Grafana, Loki, Tempo, Datadog, or OpenTelemetry — including SLI/SLO definition and alert engineering.
• Experience with CI/CD platforms: authoring GitHub Actions pipelines, designing reusable workflows, and managing container build/scan pipelines.
• Familiarity with GitOps: ArgoCD or Flux for Kubernetes continuous delivery; knowledge of progressive delivery patterns (canary, blue-green) is a significant advantage.
• Experience with internal developer platforms (IDPs): Backstage or a similar developer portal, GitHub, scaffolding templates, service catalog design, or self-service provisioning tools.
• Security-focused mindset: policy-as-code, IaC scanning, secrets management, container hardening, and shift-left security practices.
• Excellent communication and documentation skills; capable of presenting architectural decisions to engineering peers and leadership.
• Comprehensive benefits package including health, dental, and vision insurance.
• 401(k) plan with company matching contributions.
• Generous paid time off to promote your well-being.
• Flexible work environment, whether remote, hybrid, or in-office.