
Senior MLOps Engineer
Posted May 23

This is a fully remote position, open to applicants in India.
• Design, implement, and maintain end-to-end ML pipelines covering training, validation, deployment, monitoring, and retraining, prioritizing production reliability and long-term sustainability.
• Take ownership of and manage production ML infrastructure through Infrastructure as Code (IaC), making architectural trade-offs and upholding best practices.
• Lead CI/CD methodologies for ML, which include artifact/model versioning, promotion, rollout/rollback processes, and ensuring parity across development, testing, and production environments.
• Deploy and manage ML/GenAI workloads on Azure utilizing Azure App Service and Azure Container Apps, with monitoring facilitated by Application Insights.
• Establish robust model observability through performance monitoring, data quality assessments, drift detection, alerting mechanisms, and dashboard creation.
• Drive optimization of compute resources and costs for training and inference, including scaling policies, capacity planning, and evaluating cost/performance trade-offs.
• Support operational needs for GenAI, including LLM inference patterns, embeddings, and retrieval pipelines; implement hooks for evaluations and guardrails as necessary.
• Ensure that ML systems adhere to security and governance standards, including RBAC/least privilege, secrets management, audit logging, encryption, and secure access protocols.
• Collaborate with the Data & AI Architect to convert architectural standards into reusable pipeline templates and operational controls.
• Collaborate with and mentor AI engineers, contributing to model development and experimentation as time permits.
• A minimum of 5 years of experience in MLOps, ML engineering, platform engineering, or a related field, with at least 2 years in a senior or leadership role.
• Strong expertise in Python for ML workflows, automation, and pipeline creation.
• Practical experience in building and managing ML systems on Azure (exposure to OCI is advantageous).
• A proven history of managing production-grade MLOps pipelines from start to finish (training → deployment → monitoring → retraining) with demonstrable reliability or efficiency results.
• Extensive experience with Infrastructure as Code, specifically Terraform or similar tools.
• Familiarity with MLOps tools such as MLflow (or equivalent experiment tracking systems) and CI/CD pipelines.
• Experience in containerizing services using Docker in production settings.
• Hands-on experience in deploying and monitoring services on Azure, utilizing Azure App Service, Azure Container Apps, and Application Insights.
• Strong understanding of GenAI/LLM-based systems, including inference workflows, embeddings, and retrieval/RAG components, along with their operational considerations.
• Excellent communication and collaboration abilities; adept at working across various functions and influencing technical decisions without direct authority.
• Company-paid group Mediclaim insurance of INR 400,000 per annum, covering the employee, spouse, and up to 2 children.
• Company-paid group personal accident insurance of INR 1,000,000 per annum for employees.
• Company-paid, manager-approved career advancement opportunities.
• Best-in-industry referral bonus policy.
• 29 days of paid leave per year.
• Company-paid maternity leave for female employees.