
Machine Learning Software Engineer II
Posted 1 day ago

This is a fully remote position, open to applicants in the United States.
• Oversee the transition of machine learning models from conceptual prototypes to scalable and high-performance production systems.
• Design and implement ML solutions using AWS ECS (Elastic Container Service) for containerized workloads and AWS Lambda for serverless, event-driven inference pipelines.
• Prepare PyTorch models for production deployment by converting them to the ONNX format.
• Apply inference optimization techniques such as quantization and pruning, serve models with ONNX Runtime, and use memory-efficient attention mechanisms like Flash Attention to reduce latency and increase throughput.
• Advocate for infrastructure best practices in machine learning systems, establishing dependable CI/CD pipelines while ensuring secure, robust, and reproducible deployments across the AWS ecosystem.
• Create, develop, and assess algorithms that yield descriptive, diagnostic, predictive, and prescriptive insights from both structured and unstructured datasets.
• Produce clean, efficient code backed by rigorous testing, debugging, and documentation to facilitate seamless installation and long-term maintenance.
• Engage in research discussions, gather requirements, and contribute to system design alongside domain experts to develop customized scoring and ML solutions.
• 2–5 years of professional experience in Machine Learning Engineering, Software Engineering, or Data Science, demonstrating a successful history of architecting and deploying models in production.
• Extensive, hands-on experience with the AWS ecosystem, particularly AWS ECS and Lambda.
• Strong understanding of containerization (Docker) and event-driven architectures.
• High proficiency in modern programming languages relevant to ML (such as Python, C++, or Java) and familiarity with industry-standard coding practices.
• Practical experience with PyTorch and other machine learning libraries (e.g., Scikit-Learn, TensorFlow).
• In-depth knowledge of model optimization processes, including PyTorch-to-ONNX conversion, ONNX Runtime, and scalable attention mechanisms (like Flash Attention).
• Experience with large-scale computing frameworks, data analysis systems, and both relational and non-relational databases.
• Equal Opportunity Employer
• Cultivating a culture that recognizes and celebrates diverse backgrounds, ideas, and experiences.