
Member of Engineering – Reinforcement Learning Infrastructure
Posted Jun 3

This is a fully remote position, open to applicants in Europe.
• Stay current with the latest research and maintain a solid understanding of recent advances in LLMs, RL, and code generation.
• Create techniques for optimizing training and inference processes to achieve high throughput.
• Design data control systems within an RL pipeline that determine what the model observes and when.
• Identify and troubleshoot instances where infrastructure choices are adversely affecting learning dynamics.
• Develop observability tools that highlight when a system-level issue is the underlying cause of a training regression.
• Contribute to the construction of robust, adaptable, and scalable RL pipelines.
• Enhance performance across the entire stack, including networking, memory, compute scheduling, and I/O.
• Produce high-quality, practical code.
• Collaborate with the team: plan future actions, engage in discussions, and maintain constant communication.
• Proven experience with LLMs and post-training workflows.
• Knowledge of Reinforcement Learning principles and awareness of its primary challenges.
• Strong foundation in software engineering (testing, code reviews, debugging complex systems).
• Proficient in Python, with expertise in concurrency, asynchronous programming, multiprocessing, and performance enhancement.
• Familiarity with deep learning frameworks (such as PyTorch or JAX) and RL workflows (rollouts, replay buffers, policy updates).
• Experience in designing and maintaining distributed RL training systems.
• Background in large-scale LLM training infrastructure.
• Proficient with profiling tools across the stack (e.g., py-spy).
• Familiarity with inference stacks (e.g., vLLM).
• Preferred: Contributions to open-source RL or distributed ML projects.
• Fully remote work with flexible hours.
• 37 days of vacation and holidays each year.
• Health insurance allowance for you and your dependents.
• Equipment provided by the company.
• Wellbeing, continuous learning, and home office allowances.
• Regular team gatherings.
• A diverse and inclusive people-first culture.