
Forward Deployed Engineer, AI Inference, vLLM, Kubernetes
Posted 1 day ago

This is a fully remote position, open to applicants in California and 4 other states.
• Manage Distributed Inference: Deploy and configure LLM-D and vLLM on Kubernetes clusters.
• Enhance Production Efficiency: Conduct performance testing and optimize vLLM settings.
• Collaborate on Code Development: Partner with customer engineers to produce high-quality production code.
• Tackle Complex Challenges: Resolve intricate interactions between model architectures and hardware accelerators.
• Establish Feedback Mechanisms: Relay insights from the field back to product development.
• Over 8 Years of Engineering Experience
• Strong Customer Engagement Skills
• Proactive Approach to Problem Solving
• Extensive Knowledge of Kubernetes
• Expertise in AI Inference
• Proficient in Systems Programming with Python and Go
• Familiarity with Infrastructure as Code, including Helm, Terraform, or similar tools
• Understanding of Cloud and GPU Hardware
• Experience with open-source AI infrastructure projects is advantageous
• Familiarity with Envoy Proxy or Inference Gateway (IGW) is a bonus
• Comprehensive medical, dental, and vision coverage
• Flexible Spending Account for healthcare and dependent care
• Health Savings Account for high deductible medical plans
• 401(k) retirement plan with employer matching
• Paid time off and holidays
• Paid parental leave for all new parents
• Leave benefits encompassing disability, paid family medical leave, and paid military leave
• Additional perks including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, and employee assistance program