Remotery

Product Manager – AI Inference, Model Serving

Posted 2 days ago

This is a fully remote position, open to applicants in Texas.

📋 Description

• Take ownership of the product strategy, roadmap, and lifecycle for inference and model serving, encompassing serverless inference, dedicated endpoints, autoscaling, routing, KV cache management, and associated observability.

• Conduct deep technical explorations with NeoClouds, sovereign clouds, and enterprise platform teams, converting insights into prioritized requirements and architectural direction.

• Collaborate with engineering on system design trade-offs covering runtime integration, GPU scheduling, network, storage, and serving topology, which includes disaggregated serving and multi-model serving.

• Establish positioning based on measurable outcomes such as latency distributions, throughput per GPU, utilization, tail reliability, and cost per token.

• Drive go-to-market strategies, including pricing and packaging, reference architectures, sizing guides, PoC playbooks, and direct interaction with customers, analysts, and ecosystem partners.


⛳️ Requirements

• Over 7 years of experience in product management, technical product management, or a senior technical role responsible for AI/ML and inference products.

• In-depth understanding of production AI inference, encompassing model serving, serverless execution, dedicated endpoints, autoscaling, routing, workload placement, observability, and reliability.

• Proven ability to evaluate performance trade-offs across GPU, network, storage, orchestration, and runtime layers, and to translate low-level technical capabilities into business value indicators such as time to first token (TTFT), throughput per GPU, and total cost of ownership (TCO).

• Familiarity with modern inference runtimes (vLLM, SGLang, TensorRT-LLM, Dynamo, Triton) and the essential optimization patterns in production: continuous batching, KV cache management, cold starts, prefill versus decode, disaggregated serving, and multi-model serving.

• Established credibility with engineering leaders and infrastructure operators, demonstrating comfort in production architecture reviews and technical discussions with platform engineering stakeholders.


🏝️ Benefits

• Join an established leader in cloud infrastructure from Silicon Valley.

• Collaborate with exceptionally passionate, talented, and engaging colleagues, assisting Fortune 500 and Global 2000 clients in implementing next-generation cloud technologies.

• Be part of innovative, cutting-edge open-source projects.

• Flourish in the dynamic environment of a young company that values openness, collaboration, risk-taking, and continuous growth.

• Opportunities for professional development and training.

• Attend conferences and working groups.

• Customized workstation options (macOS, Windows).

• Competitive compensation package complemented by a robust benefits plan and stock options.
