This is a fully remote position, open to applicants in Florida.

📋 Description

• Design and implement multi-step agentic systems (planner/executor, tool-using, multi-agent, human-in-the-loop) for onboarding, underwriting, case review, and ongoing monitoring.

• Create agent graphs in LangGraph (or a similar framework such as CrewAI, AutoGen, or the Claude Agent SDK) with clear state management, durable execution, retries, and secure fallbacks.

• Develop the retrieval layer that supports our agents: chunking, hybrid search, reranking, and grounded citation.

• Manage the evaluation stack: golden sets, offline regression suites, LLM-as-judge, online A/B and shadow evaluations, and red-teaming for vulnerabilities like jailbreaks, prompt injection, and PII leakage.

• Integrate agents with production systems through well-typed tools and MCP servers, treating the tool surface area as a product.

• Lead production MLOps efforts: deployment, versioning, traffic shaping, cost/latency management, tracing, and on-call playbooks for agent-related incidents.

• Collaborate with security and compliance teams to ensure agents comply with SOC 2, GDPR, CCPA, and fair-lending standards, embedding auditability and explainability into the design.

• Mentor engineers on agent patterns, prompt hygiene, evaluation discipline, and understanding LLM failure modes.

⛳️ Requirements

• Over 5 years of software engineering experience, with at least 2 years focused on developing production LLM or agentic systems (beyond just notebooks or demos).

• Practical experience with a contemporary agent framework (LangGraph is strongly preferred) and a proven history of deploying agents that operate, fail gracefully, and recover effectively.

• Solid understanding of RAG fundamentals (chunking, embeddings, hybrid retrieval, reranking, and grounding), along with the ability to recognize when RAG is not the right solution.

• Hands-on evaluation experience: golden sets plus offline and online evaluations, used to inform ship/no-ship decisions.

• Proficiency in production MLOps: managing deployed LLM workloads under real latency, cost, and reliability constraints.

• Strong skills in Python; comfortable working with TypeScript / Node.js.

• Robust systems engineering instincts, including APIs, asynchronous patterns, queues, databases, and distributed system failure modes.

• Effective communicator; thrives in ambiguous and fast-paced environments.

• Previous experience in fintech, lending, payments, KYB/KYC, fraud detection, or AML.

• Experience in building MCP servers or other structured tool interfaces for LLMs.

• Background in classical machine learning (ranking, scoring, calibration).

• Experience in designing explainable and auditable AI workflows for regulated environments.

• Contributions to open-source projects related to agent frameworks, evaluation tools, or retrieval libraries.

• Proficient in AWS services (EKS, MSK, RDS, S3, Lambda) and Infrastructure as Code (IaC) using Terraform.

🏝️ Benefits

• Health Care Plan (Medical, Dental & Vision)

• Retirement Plan (401k, IRA)

• Life Insurance

• Flexible Paid Time Off

• 9 paid Holidays

• Family Leave

• Remote work opportunities

• Hybrid work model (for Orlando Associates)

• Complimentary Food & Snacks (Orlando)

• Wellness Resources

Senior Agentic, AI Engineer
