
Senior MLOps, Data Systems Engineer
Posted 20 hours ago

This is a fully remote position, open to applicants in Canada.
• Development of ML Pipelines & Data Systems: Design, build, and maintain scalable pipelines spanning data ingestion, annotation, validation, training, evaluation, and deployment, ensuring reproducibility, consistency, and traceability throughout the entire ML lifecycle.
• Integration of Data & Annotation Pipelines: Design annotation workflows and integrate them with upstream data ingestion and training systems, enabling efficient task creation, labeling, quality assurance, and dataset updates that feed directly into model iterations.
• Data-Centric Iteration: Evaluate model performance and identify failures, driving targeted data enhancements by linking production signals, data mining, and annotation workflows into continuous feedback loops.
• Experimentation & Reproducibility: Establish systems for tracking experiments, versioning datasets, and maintaining model lineage to allow for reliable comparisons and iterations across various experiments.
• CI/CD for Machine Learning: Develop and uphold CI/CD workflows specifically tailored for ML systems, enabling automated testing, validation, and deployment of models and pipelines.
• Support for Model Deployment: Work alongside embedded and platform teams to assist in deploying models to edge environments, ensuring compatibility, performance, and reliability.
• Monitoring & Feedback Loops: Set up monitoring, logging, and feedback systems to oversee model performance in production and promote continuous improvement through data and model iterations.
• Compute Optimization: Optimize training and inference workflows across cloud environments, ensuring efficient utilization of GPU and compute resources.
• Cross-Functional Collaboration: Collaborate closely with applied scientists, embedded engineers, and data teams to ensure alignment across data workflows, model development, and deployment systems.
• End-to-End Contribution: Contribute to and improve the complete ML lifecycle, from raw data ingestion and annotation to training, evaluation, deployment support, and post-deployment analysis.
• Over 5 years of professional experience in MLOps, ML infrastructure, data systems, Machine Learning Engineering, or similar roles.
• Strong programming skills in Python, with experience in ML frameworks such as PyTorch or TensorFlow.
• Proven experience in building and maintaining comprehensive ML pipelines, covering data ingestion, annotation, training, evaluation, and deployment workflows.
• Background in designing or integrating annotation and data curation workflows, with an understanding of how labeled data influences model performance.
• Strong understanding of dataset versioning, data lineage, and reproducibility within machine learning systems.
• Experience with experiment tracking and managing the model lifecycle.
• Familiarity with CI/CD tools (e.g., GitHub Actions, GitLab CI, Jenkins) and their application to machine learning workflows.
• Knowledge of containerization (Docker) and workflow orchestration systems.
• Experience with cloud-based ML environments (e.g., AWS) and distributed training workflows.
• Strong understanding of real-world data challenges, such as noisy inputs, edge cases, and variability across environments.
• Excellent problem-solving and debugging abilities, especially in complex, multi-stage systems.
• Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field (or equivalent practical experience).
• Equity options available
• Performance bonus offered
Jellyfish
ScalableOS
Pragmatike