
AI Research Engineer – Agentic Post-training
Posted 1 hour ago

This is a fully remote position, open to applicants anywhere in the world.
• Carry out end-to-end research and engineering projects that improve post-training of agentic and tool-using models and achieve state-of-the-art (SOTA) results.
• Drive broad, cross-functional model improvements in areas such as factual accuracy, instruction following, tool/function use, multi-agent coordination, and reasoning calibration.
• Create and refine large-scale post-training systems, which encompass data pipelines, training workflows, evaluation frameworks, and benchmarking infrastructure.
• Build robust evaluation suites and diagnostic tools to assess whether models are ready for deployment.
• Strengthen feedback loops from real-world product usage, incorporating both explicit and implicit user feedback into post-training.
• Partner with tooling, product, and training teams to improve the usefulness, reliability, and agentic capabilities of cutting-edge models.
• Work closely with research, engineering, and cross-functional teams to determine which integrations are ready to ship in major model releases.
• Bachelor’s degree in Computer Science, Machine Learning, or a related discipline; an advanced degree (MS/PhD) is preferred, along with a solid publication record at leading AI conferences.
• Experience with multimodal post-training processes and data pipelines, especially for agentic systems and tool usage.
• Hands-on experience applying post-training techniques at scale with distributed training frameworks (e.g., multi-node GPU setups).
• Proven ability to improve model capabilities such as reasoning, tool use, and multi-agent coordination to SOTA performance.
• Track record of open-source contributions related to agentic systems or tool use (including code, datasets, or models) on platforms like GitHub or Hugging Face.
• Publications in prominent AI conferences (e.g., NeurIPS, ICML, ICLR, ACL, CVPR, ECCV).
• Flexible work arrangements
• Professional development opportunities