
Platform Infrastructure Engineer
Posted 1 day ago

• Design, build, and maintain Azure-based infrastructure, focusing primarily on Azure Kubernetes Service (AKS), to ensure reliability and scalability and to improve the developer experience.
• Architect and manage infrastructure that guarantees continuous availability, encompassing zero-downtime deployments, automated rollouts, and the capability to adjust capacity in response to predictable demand fluctuations throughout the year.
• Take ownership of system reliability and maintenance practices, which include patching, upgrades, and configuration management across various environments, ensuring that the infrastructure is healthy, up-to-date, and ready for audits.
• Create and uphold disaster recovery and business continuity strategies, including documented runbooks, validated recovery procedures, rollback strategies, and data recovery protocols that can be executed with confidence when necessary.
• Develop and document reusable tools, networking patterns, and infrastructure templates for engineering teams to adopt.
• Collaborate cross-functionally with engineering teams to understand their requirements and to coordinate upcoming infrastructure changes.
• Manage and enhance CI/CD pipelines built on GitHub Actions, ensuring fast, reliable, and secure delivery workflows.
• Oversee infrastructure-as-code using Terraform, enabling consistent and auditable provisioning across different environments.
• Implement and maintain observability and monitoring solutions, including Grafana dashboards and alerts, to provide teams with clear insights into system health.
• Manage identity and access through Microsoft Entra ID, applying least-privilege principles across services and teams.
• Approach all infrastructure tasks with a security-first perspective, proactively identifying risks, enforcing compliance standards, and addressing deviations from established operating procedures.
• Effectively communicate with stakeholders and neighboring teams regarding infrastructure changes, timelines, and dependencies.
• Contribute to the team's knowledge repository by creating runbooks, architecture documentation, and onboarding materials.
• A minimum of 4 years of hands-on experience with Azure infrastructure in a production setting.
• Extensive experience with Azure Kubernetes Service (AKS), including cluster management, networking, scaling, GitOps, and day-2 operations.
• Strong comprehension of cloud networking, covering VNets, NSGs, private endpoints, DNS, and ingress/egress patterns.
• Familiarity with infrastructure-as-code, with a preference for Terraform.
• Proficient in CI/CD tools, especially GitHub Actions.
• Ability to work effectively in a small, remote team with a significant level of autonomy and ownership.
• Excellent written and verbal communication skills: capable of working cross-functionally, clearly explaining technical decisions, and keeping stakeholders informed.
• Security-oriented approach to infrastructure design and operations.
• Must be located in the Eastern or Central time zone for team collaboration.
• Unlimited Vacation Policy + Sick Time + Holidays
• Paid Parental Leave
• Fully Remote Opportunity
• Healthcare Benefits and 401(k)
• A culture that supports growth from Startup to Scale Up