
AI Research Engineer – Multi-Modal, Vision
Posted 2 hours ago

This is a fully remote position, open to applicants in the United Kingdom.
• Perform comprehensive research and engineering tasks on vision-language models, encompassing training, evaluation, and optimization throughout the entire model development lifecycle.
• Create and execute post-training workflows, including supervised fine-tuning, knowledge distillation, and reinforcement learning from human feedback.
• Develop and sustain high-quality multimodal datasets, focusing on data curation, filtering, and balancing for domain-specific applications.
• Enhance model efficiency and deployment capabilities, tailoring models for environments with limited resources through compression and optimization strategies.
• Establish and implement evaluation frameworks and benchmarks to assess model performance, robustness, and success in real-world tasks.
• Construct and scale training workflows utilizing distributed GPU infrastructure.
• Detect and address bottlenecks in training pipelines to achieve leading model quality on designated benchmarks.
• Contribute to and utilize open-source ecosystems, including models, datasets, and tools, to expedite development processes.
• Remain updated with the latest advancements in multimodal learning and vision-language systems, translating relevant insights into actionable improvements.
• Publish research outcomes in prestigious AI conferences and journals when applicable.
• Bachelor’s degree in Computer Science, Machine Learning, or a related discipline; MS/PhD is preferred.
• Extensive experience with multimodal post-training workflows, including supervised fine-tuning, knowledge distillation, and reinforcement learning from feedback.
• Practical experience with parameter-efficient fine-tuning and distributed training frameworks.
• Proven track record of building and enhancing vision-language models with quantifiable results on standard benchmarks or real-world applications.
• Experience in adapting models for environments with limited resources.
• Documented contributions to open-source multimodal AI projects on GitHub or HuggingFace.
• Publications in leading AI conferences (NeurIPS, ICML, ICLR, CVPR, ECCV, etc.).
• Flexible working hours
• Professional development opportunities