This is a fully remote position, open to applicants in the United States.

📋 Description

• Oversee and assist a scrum team comprising 4 to 6 software and DevOps engineers.

• Facilitate sprint planning, backlog refinement, execution tracking, retrospectives, and delivery commitments.

• Mentor engineers in technical decision-making, operational ownership, prioritization, and career development.

• Foster a team culture that emphasizes accountability, reliability, continuous improvement, and pragmatic execution.

• Collaborate with product, engineering, and business stakeholders to align infrastructure and bidder team priorities with organizational objectives.

• Supervise AWS-based cloud infrastructure that supports high-volume, low-latency bidder systems.

• Manage cloud environments that handle over 1 billion HTTP requests per hour.

• Assist in optimizing and managing millions of dollars in cloud infrastructure expenses.

• Identify opportunities for cost reduction, enhanced system utilization, and performance improvement.

• Drive improvements in observability, monitoring, alerting, incident response, deployment safety, and operational readiness.

• Ensure systems are architected and operated for scalability, reliability, availability, and maintainability.

• Lead engineering efforts on high-volume bidder software for a demand-side platform.

• Support backend systems where latency, throughput, uptime, and cost efficiency are critical business factors.

• Aid the team in making informed technical trade-offs regarding performance, reliability, complexity, and delivery speed.

• Collaborate with engineers on architecture, system design, code quality, deployment practices, and production operations.

• Ensure the team maintains a strong sense of ownership over production services from design through deployment and ongoing operations.

• Participate in production support processes and ensure adequate team coverage.

• Be available outside of standard business hours when necessary to assist with critical deployments, production incidents, time-sensitive operational events, and team response coverage.

• Lead or assist in incident reviews, root cause analyses, follow-up planning, and long-term reliability enhancements.

• Establish processes that minimize operational toil and enhance team response effectiveness over time.

⛳️ Requirements

• 6+ years of professional experience in backend software engineering, cloud infrastructure, DevOps, site reliability engineering, or a related technical area.

• 1+ year of experience in leading engineers, managing projects, serving as a technical lead, or holding an engineering management position.

• Extensive experience managing production systems on AWS.

• Experience with high-scale, high-availability distributed systems.

• Practical knowledge of cloud networking, HTTP services, load balancing, service scaling, monitoring, deployment pipelines, and incident response.

• Experience supporting systems where latency, throughput, reliability, and cost are critical operational concerns.

• Ability to prioritize technical tasks across infrastructure, operations, performance, and product delivery.

• Excellent communication skills with the ability to collaborate effectively across engineering, product, and business teams.

• Comfortable engaging in technical details while managing people, planning, execution, and delivery.

🏝️ Benefits

• Healthcare

• Company contributions to your HSA

• A great 401(k)

• Much (much) more.

Engineering Manager, CloudOps Infrastructure
