This is a fully remote position, open to applicants in Hungary.

📋 Description

• Design and implement cutting-edge reinforcement learning algorithms aimed at enhancing decision-making processes in both simulated and real-world environments.

• Set explicit performance goals such as reward maximization and policy stability.

• Execute, manage, and observe controlled reinforcement learning experiments.

• Monitor key performance indicators while documenting iterative results and comparing them against predefined benchmarks.

• Identify and curate high-quality simulation environments and training datasets tailored to domain-specific challenges.

• Establish measurable criteria to ensure that the selection and preparation of these resources significantly improve the learning process and overall model performance.

• Systematically troubleshoot and optimize the reinforcement learning pipeline by assessing both computational efficiency and learning performance metrics.

• Tackle issues such as reward signal noise, exploration strategies, and policy divergence to enhance convergence and stability.

• Collaborate with cross-functional teams to incorporate reinforcement learning agents into production systems.

• Define clear success metrics, such as improvements in real-world performance and robustness under varying conditions, and maintain continuous monitoring and iterative updates for sustained domain adaptation.

⛳️ Requirements

• A degree in Computer Science or a related field.

• Preferably a PhD in NLP, Machine Learning, or a related discipline, accompanied by a strong record in AI R&D (including notable publications in A* conferences).

• Demonstrated experience with large-scale reinforcement learning experiments, particularly online RL techniques such as Group Relative Policy Optimization (GRPO), is essential.

• A deep understanding of reinforcement learning algorithms is required, including state-of-the-art online RL methods and gradient-based policy optimization techniques such as policy gradients, actor-critic, and GRPO.

• Strong proficiency in PyTorch and relevant reinforcement learning frameworks is mandatory.

• Practical experience developing RL pipelines end to end is expected, from simulation and online training through post-training evaluation to deploying RL-based solutions in production settings.

• Proven ability to apply empirical research to address reinforcement learning challenges such as sample inefficiency, exploration-exploitation trade-offs, and training instability.

• Skilled in designing robust evaluation frameworks and iterating on algorithmic innovations to continually advance RL agent performance.

🏝️ Benefits

• Work remotely from anywhere in the world.

AI Research Engineer – Reinforcement Learning

