
Senior Infrastructure Engineer, Government Systems
Posted 2 hours ago

• Lead the design and evolution of RKE2 Kubernetes clusters across development, acceptance, and production environments, steering upgrades, capacity planning, networking strategies, and operational resilience.
• Architect and sustain infrastructure as code (Terraform) to provision and manage AWS-based environments, creating patterns and modules that the team can confidently scale.
• Take end-to-end ownership of the CI/CD platform (GitHub Actions): designing pipeline architecture, improving build performance, and ensuring reliable release flows through staged environments.
• Drive a GitOps-based delivery strategy using Flux CD, establishing standards for Helm charts and Kustomize overlays while ensuring consistent reconciliation across clusters.
• Define policies for the container image lifecycle — including building, signing, storing, and distributing images across OCI registries for various deployment targets.
• Identify and eliminate operational toil through automation, enhancing environment provisioning, configuration management, and deployment processes at a systemic level.
• Establish and uphold monitoring, alerting, and incident response practices — managing dashboards, runbooks, and post-incident reviews that enhance platform reliability over time.
• Act as a technical liaison for product service teams integrating into the deployment pipeline — unblocking teams, troubleshooting intricate environment issues, and promoting infrastructure best practices.
• Manage the security posture of the deployment platform — overseeing secrets, certificates, RBAC policies, and security configurations to fulfill compliance and operational security requirements.
• Mentor and elevate other engineers on the team through code reviews, pairing, design discussions, and knowledge sharing.
• Extensive experience in operating Kubernetes in production at scale — including cluster lifecycle management, Helm, networking, persistent storage, performance tuning, and troubleshooting complex workloads.
• Expert-level skills in Terraform for provisioning and managing cloud infrastructure across multiple environments, with a focus on module design and state management strategies.
• Strong Linux systems administration expertise (RHEL or similar), encompassing networking, storage, systemd, performance analysis, and shell scripting.
• Significant experience in designing and maintaining CI/CD pipelines (GitHub Actions, Jenkins, or similar), prioritizing reliability, speed, and developer experience.
• Deep familiarity with AWS services commonly used in infrastructure (EC2, S3, VPC, IAM, EBS, Lambda, and associated networking and security services).
• A robust operational mindset — taking ownership of reliability, proactively considering failure modes, and developing automation to prevent recurring issues before they affect customers.
• Offers Equity
• Offers Bonus