Remotery

Member of Engineering – Reinforcement Learning

Posted Jun 3

This is a fully remote position, open to applicants in Europe.

📋 Description

• Conduct research and experiments to enhance reasoning and code generation for Large Language Models (LLMs). Manage the entire experimental lifecycle from conception through experimentation to integration.

• Stay current with cutting-edge developments in LLMs, Reinforcement Learning (RL), and code generation. Turn research ideas into clean, reusable codebases that other researchers can use.

• Design, evaluate, and refine data generation processes and the training of LLMs.

• Develop and improve RL training pipelines that are reliable across various domains.

• Identify and address training instabilities and failures, debug RL runs, and propose mitigation strategies.

• Produce high-quality, reproducible, and maintainable code.


⛳️ Requirements

• Experience with Large Language Models (LLMs), including:

    • Understanding of the Transformer architecture and scaling laws.

    • Familiarity with mid-training and post-training methodologies.

    • Experience in training reasoning and/or agentic models.

    • Practical experience with LLMs and an understanding of their capabilities and limitations.

• Background in Reinforcement Learning:

    • Strong knowledge of Reinforcement Learning principles and awareness of modern algorithms.

    • Experience in developing distributed, large-scale RL pipelines from data generation to evaluation.

• Research background:

    • Scientific publications in areas such as Reinforcement Learning, LLMs, and reasoning models.

    • Ability to discuss the latest research at a detailed level.

    • Well-informed opinions.

• Engineering expertise:

    • Strong foundation in machine learning, algorithms, and engineering.

    • Experience with distributed training.

    • Strong programming skills in Python.

    • Familiarity with a deep learning framework such as PyTorch or JAX.


🏝️ Benefits

• Fully remote work with flexible hours.

• 37 days of vacation and holidays per year.

• Health insurance allowance for you and your dependents.

• Equipment provided by the company.

• Allowances for wellbeing, continuous learning, and home office setup.

• Regular team gatherings.

• A diverse, inclusive, and people-first culture.
