
Research Scientist, LLM Evaluation – Post-Training
Posted 4 hours ago

This is a fully remote position, open to applicants in California, +1 more state.
• Establish and implement a comprehensive research agenda centered on LLM evaluation and post-training, with a focus on model enhancement driven by evaluation.
• Create experiments to investigate how different evaluation methodologies influence fine-tuning and post-training results.
• Develop and validate large-scale evaluation frameworks for both LLM and multimodal systems.
• Spearhead research in cutting-edge evaluation areas, including long-context, cross-modal, and dynamic multi-turn evaluations.
• Examine model behaviors and identify failure patterns; provide actionable insights for model enhancement.
• Collaborate with Language Data Scientists to incorporate human-in-the-loop and synthetic data/evaluation methodologies.
• A Master's or PhD in Computer Science, Machine Learning, Statistics, Applied Mathematics, AI, or a related quantitative discipline (PhD is highly preferred).
• 5+ years of relevant experience in applied ML research or scientific research, with significant work on LLMs or foundation models (graduate research counts).
• Proven experience with LLM evaluation, benchmarking, alignment, post-training processes, or model quality research.
• A solid understanding of experimental design, statistical analysis, and scientific reasoning applicable to ML systems.
• Proficiency in Python for research experimentation, data processing, evaluation pipelines, statistical analysis, and visualization.
• Practical experience with contemporary ML frameworks (PyTorch, Hugging Face, JAX/TensorFlow).
• Options for remote work.
• Opportunities for professional development.