This is a fully remote position, open to applicants in India.

📋 Description

• Define and oversee the comprehensive testing strategy for Outreach’s GenAI platform, which encompasses agentic workflows, LLM tool calls, LangGraph orchestration, and associated ML pipelines.

• Design and implement evaluation systems capable of handling both deterministic and non-deterministic outputs.

• Take ownership of testing across Outreach’s collection of AI agents.

• Collaborate closely with Data Science, MLOps, and platform engineers to ensure that testability is integrated from the outset.

• Incorporate evaluation pipelines into CI/CD workflows.

• Establish and monitor key metrics relevant to AI systems, including answer quality scores, tool invocation accuracy, hallucination rates, latency, and regression trends associated with model and prompt changes.

• Set standards for AI testing throughout the organization—covering prompt regression testing, retrieval quality evaluation, and agent behavior contracts.

• Elevate quality across engineering teams by mentoring engineers.

• Proactively monitor advancements in AI evaluation tools, LLM benchmarking, and testing research.

⛳️ Requirements

• 7–12 years of experience in software development and/or test automation, with a proven track record of leading quality initiatives on complex, distributed systems.

• B.S. in Computer Science or a related technical discipline.

• Strong programming capabilities in Python, with experience in developing reusable and maintainable test frameworks.

• Demonstrated experience in testing large-scale backend or platform systems, including microservices and API layers.

• In-depth understanding of test design principles, CI/CD integration, and scalable test automation.

• Familiarity with test frameworks such as pytest or an equivalent.

• Comprehensive knowledge of evaluation methodologies for non-deterministic systems, including statistical assertions, behavioral testing, and regression baselines.

• Practical experience with Databricks for constructing and validating ML pipelines and data workflows.

• Experience with MLflow for experiment tracking, model versioning, and pipeline observability.

• Excellent communication and collaboration skills across engineering, data science, and product teams.

🏝️ Benefits

• We’re an equal opportunity employer. All applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity

Staff Test Engineer – AI

