
Staff Software Engineer, Agentic Platform
Posted 2 hours ago

This is a fully remote position, open to applicants in California.
• Join Docker's Agentic Platform team to build the core infrastructure that will power the next wave of AI-powered workflows.
• You will work on the core agent execution runtime, orchestration components, and the cloud infrastructure that keeps the Agentic Platform running continuously.
• This role requires high ownership: you won't just build systems; you'll also operate them, debug failures, and drive continuous improvement across the stack.
• This is a unique opportunity to influence the design and operation of agents at scale.
• Collaborate with experienced engineers while working alongside partner teams in AI infrastructure, developer experience, and platform reliability.
• A minimum of 8 years of professional, hands-on, full-time software engineering experience focused on backend, infrastructure, or platform engineering.
• Cloud Platform Expertise (AWS/OCI/Azure/GCP): Demonstrated, hands-on experience managing production services in AWS or Oracle Cloud Infrastructure, covering compute, networking, managed services, IAM, and cost management. This is essential; the Agentic Platform is a cloud-native service operating 24/7.
• Service Ownership in a Cloud Environment: You have managed production services from start to finish — including on-call duties, incident response, SLO definition, and post-mortems. You don’t just build; you maintain what you create.
• Distributed Systems Design: Comprehensive understanding of fault tolerance, consistency, observability, and scalability within cloud-native settings.
• Backend Engineering Expertise: Strong command of at least one backend programming language utilized for systems development — Go, Python, Rust, or Java.
• A Bachelor's degree in Computer Science, Engineering, or a related discipline, or equivalent practical experience.
• Go (strongly preferred): Professional expertise in Go, the primary programming language for Docker's backend systems.
• Infrastructure as Code: Proficiency with Terraform for cloud infrastructure provisioning and Helm for Kubernetes application packaging and deployment.
• Data Infrastructure: Experience with PostgreSQL and Redis / Pub-Sub paradigms for state management, caching, and event-driven agent workflows.
• MCP & Agent Tooling: Familiarity with the design and integration of MCP (Model Context Protocol) servers.
• Container & Orchestration: Experience with Docker, Kubernetes, or similar technologies — particularly in relation to agent sandboxing and secure code execution environments.
• AI-assisted development tools: Knowledge of tools like Cursor, Claude Code, Copilot, Windsurf, etc., and the developer personas who utilize them.
• Agent Evaluation: Experience with LLM-as-judge frameworks, behavioral regression testing, and managing golden datasets.
• Agent Systems Experience: Practical experience in constructing or managing AI agent systems — including multi-agent orchestration, tool utilization, memory systems, or agent evaluation frameworks.
• Open Source: Contributions or active participation in relevant open source projects.
• Must possess legal authorization to work in the United States.
• Freedom & flexibility; adapt your work to fit your life.
• Designated quarterly Whaleness Days plus an end-of-year Whaleness break.
• Home office setup; we prioritize your comfort while you work.
• 16 weeks of paid parental leave (after 6 months of employment).
• Technology stipend of $100 USD net/month.
• PTO plan that encourages you to take time for personal enjoyment.
• Training stipend for conferences, courses, and classes.
• Equity; as a growing startup, we want all employees to have a stake in the company's success.
• Docker Swag.
• Medical benefits, retirement plans, and holidays vary by country.
• Remote-first culture, with offices located in Seattle and Paris.