This is a fully remote position, open to applicants in the United Kingdom.

📋 Description

• Perform comprehensive research and engineering tasks on vision-language models, encompassing training, evaluation, and optimization throughout the entire model development lifecycle.

• Create and execute post-training workflows, including supervised fine-tuning, knowledge distillation, and reinforcement learning from human feedback.

• Develop and sustain high-quality multimodal datasets, focusing on data curation, filtering, and balancing for domain-specific applications.

• Enhance model efficiency and deployment capabilities, tailoring models for environments with limited resources through compression and optimization strategies.

• Establish and implement evaluation frameworks and benchmarks to assess model performance, robustness, and success in real-world tasks.

• Construct and scale training workflows utilizing distributed GPU infrastructure.

• Detect and address bottlenecks in training pipelines to achieve leading model quality on designated benchmarks.

• Contribute to and utilize open-source ecosystems, including models, datasets, and tools, to expedite development processes.

• Stay current with the latest advancements in multimodal learning and vision-language systems, translating relevant insights into actionable improvements.

• Publish research outcomes in prestigious AI conferences and journals when applicable.

⛳️ Requirements

• Bachelor’s degree in Computer Science, Machine Learning, or a related discipline; MS/PhD is preferred.

• Extensive experience with multimodal post-training workflows, including supervised fine-tuning, knowledge distillation, and reinforcement learning from feedback.

• Practical experience with parameter-efficient fine-tuning and distributed training frameworks.

• Proven track record of building and enhancing vision-language models with quantifiable results on standard benchmarks or real-world applications.

• Experience in adapting models for environments with limited resources.

• Documented contributions to open-source multimodal AI projects on GitHub or HuggingFace.

• Publications in leading AI conferences (NeurIPS, ICML, ICLR, CVPR, ECCV, etc.).

🏝️ Benefits

• Flexible working hours

• Professional development opportunities

AI Research Engineer – Multi-Modal, Vision

