This is a fully remote position, open to applicants in Italy.

📋 Description

• Support foundational pre-training of LLMs and Multi-Modal models on large, distributed server clusters.

• Create, prototype, and enhance innovative architectures, tokenizers, and cross-modal alignment layers.

• Source, filter, and curate large-scale textual and multi-modal datasets.

• Execute experiments independently and collaboratively, analyze outcomes, and refine training methodologies.

• Investigate, debug, and resolve efficiency bottlenecks in models.

• Contribute to the progress of distributed training systems.

⛳️ Requirements

• A degree in Computer Science or a related discipline.

• Preferably a PhD in NLP, Machine Learning, or a closely related field, with a robust record in AI research and development (including notable publications in A* conferences).

• Practical experience contributing to large-scale LLM or Multi-Modal pre-training runs on distributed server infrastructure with thousands of NVIDIA GPUs.

• Familiarity and hands-on experience with large-scale, distributed training frameworks, libraries, and tools.

• In-depth knowledge of cutting-edge transformer and non-transformer modifications aimed at boosting intelligence, efficiency, and scalability.

• Strong proficiency in PyTorch and Hugging Face libraries, along with practical experience in model development, continual pre-training, and deployment.

🏝️ Benefits

• Flexible work arrangements.

• Professional development opportunities.

AI Research Engineer – Pre-training, LLM, Multi-Modal

