This is a fully remote position, open to applicants in India.

📋 Description

• Responsible for establishing and implementing the technical quality assurance strategy for Agentic AI solutions.

• Concentrate on testing intricate orchestrators and sub-agent architectures developed in Python using open-source frameworks.

• Ensure the reliability, precision, and efficiency of multi-agent systems while optimizing operational costs and system performance.

• Design and create comprehensive test suites tailored for multi-agent architectures.

• Develop automated monitors and test cases to track token usage and to identify and mitigate redundant calls.

• Conduct thorough performance testing to assess and enhance end-to-end latency.

• Construct and maintain automated evaluation pipelines utilizing metrics to validate LLM outputs.

• Evaluate the decision-making capabilities of the orchestrator and manage edge cases effectively.

• Create Python-based automation frameworks for handling non-deterministic AI outputs.

• Integrate AI-specific testing gates into DevOps pipelines.

⛳️ Requirements

• Total Experience: 5–9 years in Software Development Engineer in Test (SDET) or Quality Engineering positions.

• AI/LLM Experience: At least 1 year of hands-on experience in testing LLM-based applications, RAG pipelines, or Agentic workflows.

• Framework Experience: Demonstrated experience with Amazon Bedrock AgentCore and/or Strands. Comparable experience with LangChain, LangGraph, LlamaIndex, or Google ADK (Agent Development Kit) is also acceptable.

• Agentic Systems: Direct experience in constructing or testing systems involving multi-agent coordination, tool-use (function calling), and autonomous planning.

• Cloud Experience: Strong familiarity with AWS services (Lambda, CloudWatch, Bedrock) or equivalent services from Google Cloud/Azure AI.

• High proficiency in Python, including experience in asynchronous programming.

• In-depth understanding of agentic patterns (ReAct, Plan-and-Execute) and the intricacies of testing non-deterministic systems.

• Capability to analyze logs and traces to pinpoint bottlenecks in agent reasoning and propose cost-saving measures in prompt design or model selection.

• Proficiency with Pytest and experience with observability/tracing tools such as LangSmith, AWS CloudWatch, or AWS X-Ray.

• Knowledge of NLP and LLM evaluation techniques, including employing "LLM-as-a-judge" for assessing complex sub-agent outputs.

• Exceptional analytical skills for diagnosing "hallucinations" or logical errors during the orchestrator’s planning phase.

• Strong verbal and written communication skills, with the ability to effectively convey technical risks associated with AI performance and costs to stakeholders.

🏝️ Benefits

• Health insurance

• 401(k) matching

• Flexible work hours

• Paid time off

• Professional development opportunities

Staff Engineer – SDET, Pytest, Agentic Systems
