
Scientific AI Evaluation, Computational Problem Designer
Posted May 6

• Design sophisticated computational challenges that require domain-specific scientific software.
• Develop tasks that evaluate both accurate execution (multi-step workflows, simulations) and strategic thinking (experiment design, deductions from incomplete data).
• Create problem setups, solution pathways, and mechanisms for validation.
• Adjust and enhance tasks based on model performance to meet targeted difficulty levels.
• Ensure that problems prioritize reasoning strategy over raw computational power.
• Proven expertise with at least one relevant scientific library, demonstrated through research, open-source contributions, or industry experience.
• Ability to work autonomously and adapt based on constructive feedback.
• Proficiency in Linux/terminal environments and remote computing configurations.
• Availability to commit at least 15–20 hours per week.
• Graduate-level knowledge (MS or PhD preferred) in a pertinent STEM discipline.
• Practical experience using scientific software libraries on real research problems.
• Strong programming skills in Python, including the development of computational workflows and validation tools.
• Ability to design complex problems that require deep reasoning rather than superficial solutions.
• Understanding of edge cases, limitations, and practical issues related to scientific tools.
• Experience across multiple scientific domains or tools.
• Background in evaluation frameworks or benchmarking processes.
• Experience in education, pedagogy, or the creation of problem sets.
• Familiarity with reproducible research methodologies and containerized environments.
• Opportunity to work on cutting-edge scientific challenges.
• Collaborative environment with like-minded professionals.
• Flexible working hours to accommodate personal schedules.
• Access to advanced scientific tools and resources.