This is a fully remote position, open to applicants in Italy.

📋 Description

• Perform comprehensive research and engineering on vision-language models, encompassing training, evaluation, and optimization throughout the entire model development lifecycle.

• Create and execute post-training pipelines, including supervised fine-tuning, knowledge distillation, and reinforcement learning from human feedback (RLHF).

• Develop and sustain high-quality multimodal datasets, involving data curation, filtering, and balancing tailored for domain-specific tasks.

• Improve model efficiency and deployability by adapting models to resource-constrained environments through compression and optimization.

• Design and execute evaluation frameworks and benchmarks to assess model performance, robustness, and success in real-world applications.

• Construct and scale training workflows across distributed GPU infrastructure.

• Identify and mitigate bottlenecks in training pipelines to attain state-of-the-art model quality on specified benchmarks.

• Contribute to and utilize open-source ecosystems, including models, datasets, and tools, to expedite development.

• Keep abreast of the latest advancements in multimodal learning and vision-language systems, applying relevant discoveries to enhance practical outcomes.

• Publish research findings at leading AI conferences and in journals, as appropriate.

⛳️ Requirements

• Bachelor's degree in Computer Science, Machine Learning, or a related discipline; a Master's or PhD is preferred.

• Extensive experience with multimodal post-training workflows, including supervised fine-tuning, knowledge distillation, and reinforcement learning from feedback.

• Practical experience with parameter-efficient fine-tuning and distributed training frameworks.

• Proven capability to develop and enhance vision-language models with quantifiable results on standard benchmarks or practical applications.

• Experience in adapting models for resource-constrained environments.

• Documented contributions to open-source multimodal AI projects on platforms such as GitHub or Hugging Face.

• Publications in leading AI conferences (e.g., NeurIPS, ICML, ICLR, CVPR, ECCV).

🏝️ Benefits

• Remote work

• Flexible work hours

• Professional development opportunities

AI Research Engineer – Multi-Modal, Vision

