Remotery

Research Lead – Principal Scientist, Manager Post-Training, Alignment, Reinforcement Learning

Posted Jun 21

This is a fully remote position, open to applicants in Canada.

📋 Description

• Take ownership of the post-training strategy for model development, encompassing RLHF, preference optimization, agentic systems, and long-horizon reasoning.

• Create innovative algorithms that enhance model reliability, controllability, and alignment.

• Make informed architectural decisions about when to address challenges at the pre-training, post-training, or system level.

• Design and conduct experiments to influence model behavior, robustness, and reasoning quality.

• Collaborate with infrastructure teams to develop scalable and reproducible post-training workflows.

• Contribute to publications and patents, and enhance Autodesk's visibility in external research.

• Create evaluation frameworks focused on long-horizon reasoning, tool utilization, agentic behavior, safety, and completion of real-world workflows.

• Lead thorough model analysis and interpretability initiatives.

• Facilitate human-in-the-loop evaluations with high-quality annotations and rigorous scientific methodology.

• Define model readiness criteria and offer go/no-go recommendations for releases.

• Oversee, mentor, and develop a team of AI scientists.

• Establish technical direction and research priorities for post-training and alignment projects.

• Cultivate a research culture rooted in scientific rigor, reproducibility, and rapid iteration.

• Assist in recruiting top-tier talent in ML, RL, alignment, and foundation models.

• Work closely with pre-training teams, infrastructure, product organizations, and other stakeholders.

• Convert research trade-offs into clear, actionable guidance for leadership.


⛳️ Requirements

• Extensive hands-on experience in reinforcement learning for foundation models, alongside proficiency in post-training methodologies (RLHF, RLAIF, DPO, PPO, or similar techniques).

• Demonstrated experience in leading or mentoring technical research teams, whether in an academic lab, AI research organization, or industry environment.

• Strong intuition regarding model behavior, alignment challenges, and post-training trade-offs.

• Experience in designing evaluation systems and reasoning rigorously about what it means for a model to be ready for release.

• Capability to clearly articulate complex technical trade-offs to both technical and non-technical audiences.

• A PhD or equivalent depth of industry research experience in ML, RL, AI, or a related discipline.


🏝️ Benefits

• Health insurance

• Retirement plans

• Paid time off

• Flexible work arrangements

• Professional development

• Bonuses

• Stock options

• Equipment allowances

• Wellness programs
