
Data Scientist
Posted 1 day ago

This is a fully remote position, open to applicants in the United States.
• Engage at the nexus of Artificial Intelligence and Threat Research.
• Collaborate closely with cybersecurity subject-matter experts to understand analyst workflows and security operations protocols.
• Conduct post-training of LLMs and agents through supervised fine-tuning and reinforcement learning (RLHF/RLAIF, PPO/GRPO/DPO, reward modeling) to automate analyst procedures and enhance reliability on actual security tasks.
• Develop AI agents and integrate them into progressively intricate workflows, including planning and reasoning loops, tool and function invocation, and retrieval and memory management.
• Investigate innovative methods for agentic planning and prototype cutting-edge techniques from existing literature.
• Set objective benchmarks for evaluating agentic systems, including evaluation suites, LLM-as-judge pipelines, and trajectory-level metrics, with statistical rigor.
• Refine prompts and inference methods to maximize the potential of each model.
• Collaborate and coordinate with Engineering, Data Science, and Managed Services teams, working alongside engineers to advance prototypes to production.
• Stay abreast of advancements in the field of Artificial Intelligence and assist in identifying, defining, and prioritizing research areas.
• Strong foundational knowledge in machine learning, probability, and statistics, with an intuitive grasp of uncertainty, statistical skew/variance, and experimental design.
• PhD-level understanding of contemporary machine learning research; a doctorate is not mandatory, but equivalent expertise is expected, including the ability to read, critique, implement, and improve on existing papers.
• Proven experience in training generative models, with a solid grasp of LLM training fundamentals (architecture, optimization, tokenization, data, and scaling behavior).
• Reinforcement learning and post-training as a core competency: RLHF/RLAIF, policy optimization (PPO/GRPO/DPO), reward modeling, and constructing RL environments for agents.
• Demonstrated experience in building agentic systems, including agent architectures (ReAct, planning, reflection), tool and function invocation, and retrieval/memory/context management.
• Familiarity with systematic prompt optimization and designing and constructing evaluations for LLM systems.
• Proficient with GPUs, PyTorch, and the standard LLM training and serving stack (e.g., Hugging Face Transformers/TRL/PEFT, DeepSpeed/FSDP, vLLM/TGI/SGLang).
• Strong, reproducible research engineering capabilities: clean Python code and disciplined experiment tracking that collaborators can build upon.
• Ability to independently navigate ambiguous and complex objectives, with clear communication within a large project team.
• Market-leading compensation and equity awards.
• Comprehensive programs for physical and mental wellness.
• Competitive vacation and holiday policies to promote recharging.
• Paid parental and adoption leave.
• Professional development opportunities available for all employees, regardless of level or role.
• Employee Networks, local community groups, and volunteer opportunities to foster connections.
• Dynamic office culture featuring world-class amenities.
• Great Place to Work Certified™ across the globe.