
AI Research Engineer – Multi-Modal, Vision
Posted Jun 20

This is a fully remote position, open to applicants in Italy.
• Perform comprehensive research and engineering on vision-language models, encompassing training, evaluation, and optimization throughout the entire model development lifecycle.
• Create and execute post-training pipelines, including supervised fine-tuning, knowledge distillation, and reinforcement learning from human feedback.
• Develop and sustain high-quality multimodal datasets, involving data curation, filtering, and balancing tailored for domain-specific tasks.
• Enhance model efficiency and deployability, adapting models for resource-limited environments through compression and optimization strategies.
• Design and execute evaluation frameworks and benchmarks to assess model performance, robustness, and success in real-world applications.
• Construct and scale training workflows across distributed GPU infrastructure.
• Identify and mitigate bottlenecks in training pipelines to attain state-of-the-art model quality on specified benchmarks.
• Contribute to and utilize open-source ecosystems, including models, datasets, and tools, to expedite development.
• Keep abreast of the latest advancements in multimodal learning and vision-language systems, applying relevant discoveries to enhance practical outcomes.
• Publish research findings in prestigious AI conferences and journals as appropriate.
• Bachelor's degree in Computer Science, Machine Learning, or a related discipline; a Master's or PhD is preferred.
• Extensive experience with multimodal post-training workflows, including supervised fine-tuning, knowledge distillation, and reinforcement learning from feedback.
• Practical experience with parameter-efficient fine-tuning and distributed training frameworks.
• Proven capability to develop and enhance vision-language models with quantifiable results on standard benchmarks or practical applications.
• Experience in adapting models for resource-constrained environments.
• Documented contributions to open-source projects in multimodal AI on platforms like GitHub or HuggingFace.
• Publications in leading AI conferences (e.g., NeurIPS, ICML, ICLR, CVPR, ECCV).
• Remote work
• Flexible work hours
• Professional development opportunities