This is a fully remote position, open to applicants in Florida.

📋 Description

• Design and implement multi-step agentic systems (planner/executor, tool-using, multi-agent, human-in-the-loop) for onboarding, underwriting, case review, and ongoing monitoring.

• Create agent graphs in LangGraph (or similar platforms such as CrewAI, AutoGen, Claude Agent SDK) featuring explicit state management, durable execution, retries, and secure fallbacks.

• Develop the retrieval layer that supports our agents, including chunking, hybrid search, reranking, and grounded citation.

• Own the evaluation framework: golden sets, offline regression suites, LLM-as-judge, online A/B and shadow evaluations, and red-teaming efforts for jailbreaks, prompt injection, and PII leakage.

• Integrate agents with production systems through well-defined tools and MCP servers, treating the tool surface area as a product.

• Lead production MLOps: deployment, versioning, traffic shaping, cost/latency management, tracing, and on-call playbooks for agent-related incidents.

• Collaborate with security and compliance teams to ensure agents align with SOC 2, GDPR, CCPA, and fair-lending standards—embedding auditability and explainability into the process from the start.

• Mentor engineers on agent patterns, prompt hygiene, evaluation discipline, and LLM failure modes.

⛳️ Requirements

• A minimum of 5 years of software engineering experience, including at least 2 years focused on building production-level LLM or agentic systems (beyond notebooks or demos).

• Hands-on experience with a contemporary agent framework (LangGraph is highly preferred) and a proven track record of delivering agents that operate, fail gracefully, and recover effectively.

• Strong understanding of RAG fundamentals (chunking, embeddings, hybrid retrieval, reranking, grounding) and the judgment to know when RAG is not the best approach.

• Real evaluation experience: golden sets plus offline and online evaluations, used to inform ship/no-ship decisions.

• Proficiency in production MLOps: managing LLM workloads under actual latency, cost, and reliability constraints.

• Strong proficiency in Python; comfortable working in TypeScript / Node.js.

• Solid systems engineering instincts regarding APIs, asynchronous patterns, queues, databases, and distributed system failure modes.

• Effective communicator; thrives in fast-paced, ambiguous environments.

• Previous experience in fintech, lending, payments, KYB/KYC, fraud detection, or AML.

• Experience in building MCP servers or other structured tool interfaces for LLMs.

• Background in classical machine learning (ranking, scoring, calibration).

• Experience in designing explainable and auditable AI workflows for regulated environments.

• Contributions to open-source projects related to agent frameworks, evaluation tools, or retrieval libraries.

• Depth of knowledge in AWS services (EKS, MSK, RDS, S3, Lambda) and Infrastructure as Code using Terraform.

🏝️ Benefits

• Health Care Plan (Medical, Dental & Vision)

• Retirement Plan (401k, IRA)

• Life Insurance

• Flexible Paid Time Off

• 9 Paid Holidays

• Family Leave

• Remote work options

• Hybrid work arrangements (for Orlando Associates)

• Complimentary Food & Snacks (Orlando)

• Wellness Resources

Senior Agentic, AI Engineer
