
AI Research Engineer – Multi-Modal, Vision
Posted 1 day ago

This is a fully remote position, open to applicants in India.
• Perform comprehensive research and engineering on vision-language models, encompassing training, evaluation, and optimization throughout the entire model development lifecycle.
• Create and execute post-training pipelines that include supervised fine-tuning, knowledge distillation, and reinforcement learning from human feedback (RLHF).
• Develop and uphold high-quality multimodal datasets, which involve data curation, filtering, and balancing for specific domain tasks.
• Enhance model efficiency and deployability by adapting models for environments with limited resources through compression and optimization strategies.
• Design and implement evaluation frameworks and benchmarks to assess model performance, robustness, and success in real-world tasks.
• Construct and scale training workflows across distributed GPU infrastructures.
• Identify and mitigate bottlenecks in training pipelines to attain state-of-the-art model quality on targeted benchmarks.
• Contribute to and utilize open-source ecosystems, including models, datasets, and tools, to expedite development.
• Stay current with advances in multimodal learning and vision-language systems, and apply relevant findings to improve models in practice.
• Publish research outcomes in leading AI conferences and journals when applicable.
• Bachelor’s degree in Computer Science, Machine Learning, or a related discipline; MS/PhD is preferred.
• Extensive experience with multimodal post-training workflows, including supervised fine-tuning, knowledge distillation, and reinforcement learning from feedback.
• Practical experience with parameter-efficient fine-tuning and distributed training frameworks.
• Proven ability to build and enhance vision-language models with measurable outcomes on established benchmarks or real-world applications.
• Experience in adapting models for environments with limited resources.
• Documented contributions to open-source projects in multimodal AI on platforms like GitHub or Hugging Face.
• Publications in prominent AI conferences (NeurIPS, ICML, ICLR, CVPR, ECCV, etc.).
• Flexible working arrangements
• Professional development opportunities
Brahma