
Applied AI Engineer
Posted May 24

This is a fully remote position, open to applicants in Costa Rica.
• Design, build, and maintain evaluation pipelines for production AI agent systems.
• Integrate multi-agent workflows with tracing and observability tools.
• Create evaluation datasets from real production traffic and interaction logs.
• Develop quality and robustness scoring systems for LLM outputs.
• Improve the reliability of AI systems in the face of non-deterministic model behavior.
• Implement and refine HITL (Human-in-the-Loop) escalation workflows.
• Investigate production failures and drive architectural enhancements.
• Manage the complete feedback loop encompassing evaluations, prompt optimization, architecture updates, and re-testing.
• Contribute to strategies for prompt engineering and model optimization.
• Collaborate on decisions regarding multi-agent orchestration and workflow reliability.
• Work across backend systems, deployment pipelines, monitoring, and operational support.
• Engage in production support and on-call duties.
• Uphold high engineering standards concerning scalability, observability, and maintainability.
• Function autonomously across development, testing, deployment, and production ownership.
• Over 5 years of backend or AI engineering experience in production settings.
• Significant hands-on experience with production LLM or agentic AI systems.
• Proven ability to debug and maintain non-deterministic AI workflows under live conditions.
• Experience building or managing evaluation (evals) pipelines for AI systems.
• Strong understanding of scorer design, feedback loops, and AI system evaluation methodologies.
• Excellent Python backend engineering skills.
• Production experience with frameworks such as FastAPI, Django, Celery, LangGraph, or similar orchestration tools.
• Familiarity with observability and tracing tools including Langfuse, Grafana, Loki, OpenTelemetry, or equivalent.
• Experience in deploying and managing distributed backend systems.
• Strong understanding of AI reliability, prompt behavior, and handling model failures.
• Ability to independently manage projects from start to finish.
• Experience collaborating with asynchronous remote teams.
• Strong written communication skills in English.
• Fully Remote
• LATAM-friendly collaboration preferred
Credo AI