
Machine Learning Software Engineer II
Posted May 7

This is a fully remote position, open to applicants in the United States.
• Oversee the transformation of machine learning models from theoretical prototypes into scalable, high-performance production systems.
• Design and implement ML solutions using AWS ECS (Elastic Container Service) for containerized applications and AWS Lambda for serverless, event-driven inference pipelines.
• Prepare PyTorch models for production by converting them to ONNX format.
• Utilize advanced inference optimization methods (quantization, pruning, ONNX Runtime) and memory-efficient attention techniques like Flash Attention to reduce latency and boost throughput.
• Advocate for infrastructure best practices in machine learning systems by setting up reliable CI/CD pipelines and ensuring secure, robust, and reproducible deployments within the AWS environment.
• Create, develop, and assess algorithms that produce descriptive, diagnostic, predictive, and prescriptive insights from both structured and unstructured datasets.
• Write clean, efficient, and thoroughly tested code. Conduct rigorous testing, debugging, and documentation to facilitate seamless installation and long-term maintenance.
• Engage in research discussions, gather requirements, and collaborate on system design with domain experts to develop customized scoring and ML solutions.
• 2–5 years of professional experience in Machine Learning Engineering, Software Engineering, or Data Science, with a track record of architecting and deploying models in production.
• Extensive, hands-on experience with the AWS ecosystem, particularly AWS ECS and Lambda.
• Strong grasp of containerization (Docker) and event-driven architectures.
• High proficiency in modern programming languages relevant to ML (e.g., Python, C++, Java) and familiarity with industry-standard coding practices.
• Practical experience with PyTorch and other machine learning libraries (e.g., Scikit-Learn, TensorFlow).
• Comprehensive understanding of model optimization workflows, including PyTorch to ONNX conversions, ONNX Runtime, and scaling attention mechanisms (e.g., Flash Attention).
• Experience with large-scale computing frameworks, data analysis systems, and both relational and non-relational databases.
• Equal Opportunity Employer
• Cultivating a culture that values diverse backgrounds, ideas, and experiences.