
Senior ML Engineer
Posted 4 hours ago

This is a fully remote position, open to applicants in California, +4 more states.
• Oversee the Entire MLOps Process and Productionization: Design, implement, and sustain CI/CD pipelines for ML artifacts, covering model evaluation, versioning, and automated deployment. Act as the primary Subject Matter Expert (SME) for operational excellence throughout the Invoca ML stack.
• Design and Enhance SLM/LLM Deployment: Manage the complete inference infrastructure, including model serving on Triton Inference Server, Baseten, and Kubernetes-based GPU environments. Profile and optimize for minimal latency and maximum throughput, while creating robust and scalable APIs for both internal and external model accessibility.
• Fine-Tune Language Models: Utilize parameter-efficient fine-tuning techniques (LoRA, QLoRA, PEFT) to modify transformer-based SLMs and LLMs for impactful NLP applications in conversation intelligence.
• Advance ML Infrastructure: Contribute to the development of model training infrastructure, data pipelines, and foundational data lakes to ensure that systems supporting our models remain reliable and scalable.
• Collaborate with Various Teams: Work closely with Data Scientists, Data Engineers, and Applied AI Engineers to construct the essential ML systems that drive Invoca's agentic AI products.
• Deliver Customer Value: Collaborate with product and engineering teams to identify customer needs and deploy ML solutions that result in measurable improvements.
• Over 5 years of experience in ML Engineering with a strong emphasis on production.
• Proficient in Python and deep learning frameworks (PyTorch, HuggingFace Transformers, spaCy).
• Proven experience in deploying and maintaining transformer-based NLP models in a production environment.
• Practical knowledge of fine-tuning SLMs/LLMs (LoRA, QLoRA, PEFT) and optimizing models through quantization, batching, and throughput tuning.
• Expertise in inference infrastructure, including Triton, Baseten, vLLM, TGI, SageMaker, Vertex AI, or similar platforms.
• Experience in developing production-grade APIs that provide ML models to downstream users.
• Familiarity with MLOps tools, model monitoring, and evaluation platforms (Braintrust, MLflow, or equivalent).
• Bachelor’s degree in Computer Science, Engineering, Statistics, or a related field; advanced degrees are an advantage.
• Knowledge of Reinforcement Learning from Human Feedback (RLHF) or preference training is a plus.
• Flexible Time Off – We promote a healthy work-life balance. Our flexible paid time off policy lets you recharge and take breaks when you need them.
• Paid Holidays – Invoca offers 16 paid holidays in the U.S., including a winter break, allowing ample time to relax and connect with family and friends.
• Health Benefits – Our healthcare offerings include medical, dental, and vision coverage, with various plan options to suit your and your family’s needs. Fertility assistance is also part of the program.
• Retirement – Invoca provides a 401(k) plan through Fidelity, featuring a company match of up to 4%.
• Stock Options – Every employee has the opportunity to participate in Invoca’s success through stock options.
• Mental Health Program – Support for well-being across a wide range of issues is available through our SpringHealth program.
• Paid Family Leave – We offer up to 6 weeks of fully paid leave for baby bonding, adoption, and family care.
• Paid Medical Leave – Up to 12 weeks of fully paid leave is available for childbirth and medical needs.
• InVacation – As a gesture of appreciation for our long-term team members, we provide a bonus after 7 years of service.
• Wellness Subsidy – We offer a subsidy that can be utilized for gym memberships, fitness classes, and similar activities.