Remotery

Research Lead – Principal Scientist, Manager Post-Training, Alignment, Reinforcement Learning

Posted May 13

This is a fully remote position, open to applicants in Canada.

📋 Description

• Take charge of the post-training strategy for model development, including RLHF, preference optimization, agentic systems, and long-horizon reasoning.

• Create innovative algorithms that enhance model reliability, controllability, and alignment.

• Make informed architectural choices regarding when to tackle challenges at the pre-training, post-training, or system level.

• Design and conduct experiments that influence model behavior, robustness, and reasoning quality.

• Collaborate with infrastructure teams to develop scalable and reproducible post-training workflows.

• Contribute to publications and patents, and increase Autodesk's visibility in external research.

• Develop evaluation frameworks for long-horizon reasoning, tool usage, agentic behavior, safety, and real-world workflow completion.

• Lead comprehensive model analysis and interpretability initiatives.

• Oversee human-in-the-loop evaluations with high annotation quality and robust scientific methodology.

• Define model readiness criteria and offer go/no-go recommendations for releases.

• Manage, mentor, and cultivate a team of AI scientists.

• Establish technical direction and research priorities for post-training and alignment projects.

• Promote a research culture that emphasizes scientific rigor, reproducibility, and rapid iteration.

• Assist in recruiting top-tier talent in ML, RL, alignment, and foundation models.

• Collaborate closely with pre-training teams, infrastructure, product organizations, and other stakeholders.

• Convert research trade-offs into clear, actionable guidance for leadership.


⛳️ Requirements

• Extensive hands-on experience in reinforcement learning for foundation models, along with proficiency in post-training methods (RLHF, RLAIF, DPO, PPO, or similar approaches).

• Demonstrated experience leading or mentoring technical research teams, whether in an academic lab, AI research institution, or industry environment.

• Strong instinct for model behavior, alignment challenges, and post-training trade-offs.

• Experience designing evaluation systems and reasoning rigorously about what constitutes model readiness.

• Ability to effectively communicate complex technical trade-offs to both technical and non-technical audiences.

• A PhD or equivalent level of industry research experience in ML, RL, AI, or a related discipline.


🏝️ Benefits

• Health insurance

• Retirement plans

• Paid time off

• Flexible work arrangements

• Professional development

• Bonuses

• Stock options

• Equipment allowances

• Wellness programs
