
Principal AI Research Scientist – Post-Training, Alignment, Reinforcement Learning
Posted May 26

This is a fully remote position, open to applicants in California and 3 more states.
• Conduct post-training activities for model development.
• Create innovative algorithms aimed at enhancing model reliability, controllability, and alignment.
• Design and execute experiments that influence model behavior.
• Collaborate with infrastructure teams to establish scalable and reproducible workflows.
• Contribute to publications and patents, and enhance the team's visibility in external research.
• Lead human-in-the-loop evaluations ensuring high-quality annotations.
• Clearly communicate technical risks, limitations, and trade-offs to leadership.
• Extensive hands-on experience in reinforcement learning specifically for foundation models.
• Demonstrated experience in leading or mentoring technical research teams.
• Strong intuition regarding model behavior and the challenges of alignment.
• Experience in designing evaluation systems.
• PhD or equivalent industry research experience in ML, RL, AI, or a related discipline.
• Strong track record of publications in prominent ML or AI venues.
• Familiarity with large-scale training infrastructure and understanding of compute trade-offs.
• Health insurance coverage.
• Flexible work arrangements available.
• Opportunities for professional development.