Research Lead – Principal Scientist, Manager – Alignment, Reinforcement Learning

This is a fully remote position, open to applicants in Germany.

📋 Description

• Take charge of the post-training strategy for model development.

• Create innovative algorithms that enhance model reliability, controllability, and alignment.

• Design and conduct experiments that influence model behavior, robustness, and reasoning quality.

• Oversee, mentor, and develop a team of AI scientists.

⛳️ Requirements

• In-depth, hands-on experience in reinforcement learning for foundation models.

• Proficiency in post-training techniques (RLHF, RLAIF, DPO, PPO, or similar methods).

• Demonstrated experience in leading or mentoring technical research teams.

• Strong understanding of model behavior, alignment issues, and post-training trade-offs.

• Experience in designing evaluation systems.

• Ability to communicate complex technical trade-offs clearly.

🏝️ Benefits

• Comprehensive benefits package.