
Senior Platform Engineer
Posted 1 day ago

• Take ownership and enhance our AWS infrastructure encompassing compute, networking, storage, and managed services.
• Design and sustain infrastructure that ensures high availability, reliable performance, and financial accuracy.
• Lead architectural decisions at the platform level, including service migrations and runtime modifications (e.g., Redis → Valkey, EKS → ECS/Fargate).
• Ensure that infrastructure choices are driven by reliability, cost-effectiveness, and operational simplicity rather than by trends.
• Create and maintain deployment pipelines that are secure, repeatable, and transparent.
• Take responsibility for system reliability through capacity planning, failure modeling, and controlled change management.
• Lead the response to incidents and conduct root-cause analysis for infrastructure-level failures.
• Engage in on-call rotations and continually enhance operational ergonomics.
• Establish and uphold strong observability across infrastructure and services (metrics, logs, tracing, alerting).
• Ensure the secure configuration of AWS resources, IAM policies, secrets management, and network boundaries.
• Proactively identify infrastructure risks related to scalability, cost, or security and address them before they escalate into incidents.
• Collaborate closely with application engineers to ensure that platform constraints and capabilities are thoroughly understood.
• Drive infrastructure changes hands-on by implementing them directly.
• Set standards and best practices for infrastructure, deployment, and operations as the team expands.
• Mentor fellow platform engineers and contribute to enhancing the overall operational maturity of the organization.
• More than 8 years of experience in building and managing production infrastructure within cloud environments.
• Extensive experience with core AWS services (EC2, ECS/EKS, VPC, IAM, RDS, ElastiCache, ALB/NLB, CloudWatch, etc.).
• Strong comprehension of containerized workloads and the trade-offs of orchestration.
• Proven track record in designing systems for high availability, fault tolerance, and controlled failure.
• Practical experience with infrastructure as code (Terraform, CloudFormation, or similar tools).
• Demonstrated capability to safely plan and execute infrastructure migrations.
• Experience in troubleshooting real production incidents related to networking, scaling, or service degradation.
• Generous PTO and company holiday policy, along with company-paid Short Term Disability.
• 100% employer-covered health and dental insurance for our direct employees (a standard plan is covered, with higher-tier healthcare options available at the employee's additional cost; dependent coverage is at the employee's expense); vision plan available at the employee's additional cost.
• Child Care Benefits and generous parental leave.