
AI Research Engineer – Pre-training, LLM, Multi-Modal
Posted 3 hours ago

This is a fully remote position, open to applicants in Italy.
• Carry out foundational pre-training of LLMs and multi-modal models on large, distributed server clusters.
• Create, prototype, and enhance innovative architectures, tokenizers, and cross-modal alignment layers.
• Source, filter, and curate large-scale textual and multi-modal datasets.
• Execute experiments independently and collaboratively, analyze outcomes, and refine training methodologies.
• Investigate, debug, and resolve model efficiency bottlenecks.
• Contribute to the progress of distributed training systems.
• A degree in Computer Science or a related discipline.
• Preferably a PhD in NLP, Machine Learning, or a closely related field, with a robust record in AI research and development (including notable publications in A* conferences).
• Practical experience in contributing to large-scale LLM or Multi-Modal pre-training runs on extensive, distributed server infrastructures featuring thousands of NVIDIA GPUs.
• Familiarity and hands-on experience with large-scale, distributed training frameworks, libraries, and tools.
• In-depth knowledge of state-of-the-art modifications to transformer and non-transformer architectures aimed at improving intelligence, efficiency, and scalability.
• Strong proficiency in PyTorch and Hugging Face libraries, along with practical experience in model development, continual pre-training, and deployment.
• Flexible work arrangements.
• Professional development opportunities.