
Senior AI Researcher – World Foundation Models
Posted 2 hours ago

This is a fully remote position, open to applicants in California, +2 more states.
• Research, implement, and validate modifications to model architecture and algorithms that improve video generation fidelity, with a particular focus on human-centric quality.
• Investigate and prototype enhancements in spatial multimodal modeling, modality alignment, flow-based or diffusion-based video generation, and representations inspired by neural rendering to boost controllability and long-horizon consistency.
• Enhance training and inference efficiency through architectural innovations and post-training strategies, including compute/memory optimizations, distillation, pruning, and compression.
• Establish model training objectives that foster improvements in sim-to-real and real-to-sim generalization, particularly regarding human motion, contact, and interaction dynamics across both real-world and synthetic/simulation data.
• Create comprehensive, domain-specific benchmarks for assessing world foundation models, particularly in the generation and interpretation of world models that reason about video, simulation, and physical environments.
• Convert research findings into robust implementations such as training code, production-ready checkpoints, model integrations, and demonstrations that effectively illustrate capability enhancements across teams.
• A PhD in Computer Science, Graphics, Computer Engineering, or a closely related field (or equivalent experience).
• Over 8 years of applied research and/or industry experience in vision, graphics, or related ML domains.
• More than 3 years of direct experience in designing, training, and evaluating generative models for image/video/audio, with a solid foundation in modern deep learning.
• Practical experience in enhancing generative models with an emphasis on perceptual quality and temporal stability, particularly in generating human subjects.
• Advanced skills in Python, PyTorch, C++, and CUDA, along with strong research-engineering practices (reproducibility, testing, profiling, experiment tracking).
• Experience in training and debugging large models within multi-GPU and/or multi-node environments and distributed training workflows.
• Working knowledge of inference/runtime bottlenecks and optimization strategies.
• A keen “eye for quality” and a passion for diagnosing visual artifacts (sharpness, texture detail, temporal stability, etc.) using perceptual metrics, human preference signals, or learned evaluators.
• Equity
• Benefits