Remotery

Senior Machine Learning Operations Engineer

Posted Jun 20

This is a fully remote position, open to applicants in California, +2 more states.

📋 Description

• Develop and manage the real-time inference service that evaluates models for the risk decision engine, prioritizing low latency and high availability.

• Take ownership of the model deployment infrastructure, including registry and versioning, CI/CD processes with performance, bias, and consistency evaluations, shadow mode, and staged rollouts.

• Establish model observability, ensuring monitoring of availability, latency, and errors, along with drift detection to trigger retraining.

• Collaborate with Risk Data Science to ensure a seamless development-to-production handoff, moving models from development into production operation under MLP oversight.

• Implement experimentation features such as champion/challenger and canary routing, as well as explainability outputs like SHAP attributions.

• Exhibit a strong sense of product ownership and proactively pursue responsibilities: our team self-organizes on small to medium projects, and we seek someone eager to contribute to the creation of a new platform team.


⛳️ Requirements

• A minimum of 5 years of experience in machine learning engineering, backend software engineering, MLOps, or a related discipline.

• Proven experience with production ML services: deploying, serving, and managing models in environments requiring low latency and high availability.

• Solid backend engineering skills in Python, including familiarity with API frameworks such as FastAPI or Flask.

• Experience with model deployment and lifecycle management tools: model registries, CI/CD for models, version control, and staged rollout strategies (shadow, canary, champion/challenger).

• Background in creating observability and alerting systems for production services, focusing on latency, errors, and ideally model-specific metrics like drift.

• Proficiency with the data infrastructure essential to ML, including SQL, key-value/low-latency stores (Redis, DynamoDB, or similar), and streaming pipelines (Kafka, Kinesis, Redpanda, or equivalent).


🏝️ Benefits

• Competitive salary

• Equity

• Health insurance plans

• Paid time off

• Remote work options
