
AI Tech Lead
Posted 6 days ago

This is a fully remote position, open to applicants in Colombia.
• Acting as the Scrum Master and Delivery Lead for both AI teams, responsible for organizing and facilitating sprint planning, daily stand-ups, backlog grooming, and retrospectives.
• Protecting both teams from daily integration distractions by ensuring that the junior development team receives clear task definitions, structured schemas, and well-defined technical requirements.
• Balancing the rapid demands of AI prototyping with the structured pipeline stabilization cycles necessary for enterprise-level development.
• Managing cross-team dependencies and interface mapping to guarantee effective collaboration between the senior and junior engineering teams.
• Converting stringent architectural guidelines—such as network isolation, database connection limits, and cost containment—into actionable workflows for the engineering teams.
• Collaborating with Loftware Architects to ensure teams effectively utilize AWS services and data read replicas while maintaining corporate security boundaries, tenant isolation, and regional compliance.
• Leading technical review sessions to identify the optimal storage strategy (Amazon MemoryDB / Redis OSS / Valkey vs. pgvector vs. OpenSearch), weighing developer needs against enterprise infrastructure standards.
• Supervising evaluation frameworks for multi-step agent workflows to ensure consistent behavior and mitigate unhandled hallucinations.
• Ensuring that all data ingestion processes and internal tool-calling structures comply with type-safe validation layers, preventing malformed agent responses from disrupting downstream systems or exposing PII.
• Managing the centralized repository for system prompts, prompt caching strategies, and Amazon Bedrock configurations to guarantee optimal performance, token budgeting, and alignment with corporate policies.
• Collaborating with internal teams to establish and enforce robust CI/CD strategies for AI agents, ensuring that modifications to prompts, embeddings, or state-machine routing rules are deployed seamlessly without service interruptions.
• Contributing to operational protocols for addressing deployment failures mid-workflow, ensuring both teams design for idempotency to handle unexpected model degradation or pipeline failures effectively.
• Over 10 years of experience in Software Engineering and/or Technical Leadership.
• More than 3 years of experience leading AI/ML or high-throughput distributed systems teams.
• Demonstrated experience implementing agile methodologies (Scrum/Kanban) across multi-tiered or split engineering teams.
• Extensive hands-on architectural experience with LLMs and enterprise-scale systems.
• Experience collaborating with System Architects to manage AWS infrastructure usage, security measures, and resource provisioning.
• Familiarity with agentic orchestration frameworks (LangGraph, AWS Step Functions, or equivalent) at an architectural governance level.
• Working knowledge of Amazon Bedrock APIs, Guardrails, and Knowledge Base configurations.
• Understanding of vector retrieval strategies (pgvector, Amazon OpenSearch/Elasticsearch) and in-memory data stores (Amazon MemoryDB / Redis OSS / Valkey).
• Experience in designing for idempotency and stateful rollback in distributed AI pipelines.
• Strong stakeholder management abilities, with a background in negotiating architectural and infrastructure decisions on behalf of engineering teams.
• Practical implementation experience with Vercel AI SDK, LangGraph, or LlamaIndex.
• Opportunity for remote work.
• 13 floating holidays.
• 15 vacation days per year upon completion.
• Supportive working environment.
Credo AI