Remotery

Staff Test Engineer – AI

Posted May 25

This is a fully remote position, open to applicants in India.

📋 Description

• Define and oversee the comprehensive testing strategy for Outreach’s GenAI platform, which encompasses agentic workflows, LLM tool calls, LangGraph orchestration, and associated ML pipelines.

• Design and implement evaluation systems capable of handling both deterministic and non-deterministic outputs.

• Take ownership of testing across Outreach’s collection of AI agents.

• Collaborate closely with Data Science, MLOps, and platform engineers to ensure that testability is integrated from the outset.

• Incorporate evaluation pipelines into CI/CD workflows.

• Establish and monitor key metrics relevant to AI systems, including answer quality scores, tool invocation accuracy, hallucination rates, latency, and regression trends associated with model and prompt changes.

• Set organization-wide standards for AI testing, covering prompt regression testing, retrieval quality evaluation, and agent behavior contracts.

• Elevate quality across engineering teams by mentoring engineers.

• Proactively monitor advancements in AI evaluation tools, LLM benchmarking, and testing research.


⛳️ Requirements

• 7–12 years of experience in software development and/or test automation, with a proven track record of leading quality initiatives on complex, distributed systems.

• B.S. in Computer Science or a related technical discipline.

• Strong programming capabilities in Python, with experience in developing reusable and maintainable test frameworks.

• Demonstrated experience in testing large-scale backend or platform systems, including microservices and API layers.

• In-depth understanding of test design principles, CI/CD integration, and scalable test automation.

• Familiarity with test frameworks such as pytest or an equivalent.

• Comprehensive knowledge of evaluation methodologies for non-deterministic systems, including statistical assertions, behavioral testing, and regression baselines.

• Practical experience with Databricks for constructing and validating ML pipelines and data workflows.

• Experience with MLflow for experiment tracking, model versioning, and pipeline observability.

• Excellent communication and collaboration skills across engineering, data science, and product teams.


🏝️ Benefits

• We’re an equal opportunity employer. All applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity

