This is a fully remote position, open to applicants in Brazil.

📋 Description

• Take ownership of the ML API that serves NBA recommendations, collaborating with the data engineer who built it, and optimize it for low-latency production traffic.

• Deliver your first agent tool contract from start to finish: schema design, handler implementation, structured-error contract, unit tests, and deployment via HAL's runtime.

• Establish the evaluation foundation for our agents: golden transcripts, rubric-based judges, and regression suites that execute on every prompt or model modification.

• Cultivate a strong working relationship with HAL and become the primary resource for the data team regarding agent infrastructure decisions.

• Serve as the main owner (with support from a data engineer) of the ML API and the agent tool layer that integrates NBA and our ML models.

• Have successfully launched at least one production-grade agent (customer-facing or partner-facing) featuring prompt versioning, evaluations, observability, and multi-tenant gating.

• Define the data team's guidelines for deploying a new ML model as an LLM-callable tool, from start to finish.

• Mentor data engineers on ML/AI patterns, enabling them to confidently support and enhance the systems you manage.

• Act as the technical lead within the data team for NBA production AI at Clutch: the person other teams seek out when they want to understand how NBA responsibly deploys ML and agents.

• Have demonstrably improved agent cost and latency (target: over 30% reduction in P95 latency or per-conversation cost on at least one agent).

• Influence the data team's roadmap for the next generation of ML and AI products, in collaboration with the PM and data scientist.

• Assist in determining future hires as the team expands.

⛳️ Requirements

• Over 7 years of engineering experience, with a solid history of building and deploying production ML systems — you've transitioned models from prototype to production and take ownership of their post-deployment performance.

• Proficient in Python — most tasks (ML training, evaluation, the ML API, data pipelines) are performed in Python, and you are comfortable working within production codebases, not just notebooks. Some TypeScript will be used for tool contracts and integration with our agent runtime — you don't need to be an expert, but familiarity with a second language is necessary.

• Tool-design discipline for LLM consumption. You can take an ML model or data source and transform it into an LLM-callable tool with narrow input/output schemas, identity-required and scope-gated dispatch, and structured-error contracts (RATE_LIMITED, UPSTREAM_ERROR, NOT_FOUND) that the agent runtime converts to graceful tool results instead of failures.

• Evaluation discipline for non-deterministic systems. You treat evaluations as the unit-test equivalent for agents: golden transcripts, rubric-based judges, regression suites that run on every prompt or model change. You understand the distinction between offline metrics and online evaluations and use both.

• Proficient in prompt shaping. You analyze a system prompt as another engineer would analyze code: audience, register, compliance guardrails, template-variable allow-list, allowed-tools section. You troubleshoot "why did the agent do that?" by examining the prompt and tool descriptions before considering model changes. You have launched at least one agent where the prompt was version-controlled and reviewed as code.

• Rigor in tool implementation. You write handlers behind tool contracts with identity fields derived from request context (never from LLM-supplied arguments), output re-parsed through the tool's schema before returning, structured-error notifications on every failure path, and unit tests covering both successful outcomes and each named error. You have a story about a tool you deployed, a bug that production traffic revealed, and how you hardened it.

• Experience in building and maintaining low-latency production APIs (FastAPI, BentoML, or equivalent), with informed opinions on serving, batching, and caching strategies.

• Comfortable with AWS (especially Lambda), Docker, and GitHub-based workflows.

• You actively use AI tools in your engineering workflow, not as a novelty but as standard practice. You will be expected to demonstrate this during the technical evaluation.
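
To make the tool-design and tool-implementation requirements above concrete, here is a minimal Python sketch of the pattern they describe. It is an illustration only, not our production code: `RequestContext`, `ToolError`, `validate_output`, `recommend_tool`, the `fetch` callable, and the `action`/`score` output fields are all hypothetical names chosen for this example. Only the three error codes come from the requirements.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class ToolErrorCode(str, Enum):
    RATE_LIMITED = "RATE_LIMITED"
    UPSTREAM_ERROR = "UPSTREAM_ERROR"
    NOT_FOUND = "NOT_FOUND"


@dataclass(frozen=True)
class RequestContext:
    tenant_id: str  # set by the runtime from the authenticated request, never by the LLM


class ToolError(Exception):
    def __init__(self, code: ToolErrorCode, message: str):
        super().__init__(message)
        self.code = code


def validate_output(raw: object) -> dict:
    """Re-parse the upstream response through a narrow output schema."""
    if not isinstance(raw, dict):
        raise ToolError(ToolErrorCode.UPSTREAM_ERROR, "malformed upstream response")
    action, score = raw.get("action"), raw.get("score")
    if not isinstance(action, str) or isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ToolError(ToolErrorCode.UPSTREAM_ERROR, "malformed upstream response")
    return {"action": action, "score": float(score)}  # extra fields are dropped


def recommend_tool(
    ctx: RequestContext,
    llm_args: dict,
    fetch: Callable[[str, str], Optional[dict]],
) -> dict:
    """Return {"ok": True, "data": ...} or {"ok": False, "error": {...}}; never raise."""
    try:
        customer_id = llm_args.get("customer_id")
        if not isinstance(customer_id, str) or not customer_id:
            raise ToolError(ToolErrorCode.NOT_FOUND, "customer_id is missing or invalid")
        # Identity comes from ctx; a tenant_id inside llm_args is ignored.
        raw = fetch(ctx.tenant_id, customer_id)
        if raw is None:
            raise ToolError(ToolErrorCode.NOT_FOUND, "no recommendation for this customer")
        return {"ok": True, "data": validate_output(raw)}
    except ToolError as err:
        return {"ok": False, "error": {"code": err.code.value, "message": str(err)}}
    except TimeoutError:
        return {
            "ok": False,
            "error": {"code": ToolErrorCode.UPSTREAM_ERROR.value, "message": "upstream timed out"},
        }
```

Every failure path returns a structured result rather than raising, which is what lets an agent runtime turn it into a graceful tool result. Unit tests would cover the success case plus one case per named error.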

🏝️ Benefits

• Remote Flexibility: Enjoy the option to work remotely from any location, seamlessly balancing your professional and personal life.

• Unforgettable Off-Sites: Twice a year, connect with colleagues in exciting destinations, promoting teamwork and innovative ideas.

• Paid Time Off and National Holidays: Benefit from 20 PTO days annually along with National Holidays for relaxation and rejuvenation.

• Stock Options: By joining us, you will have a stake in our success, receiving stock options as part of your compensation package.

• Home Office Setup: Create your ideal workspace with a dedicated budget for home office essentials.

• Work Trip Budget: Grow both personally and professionally with a budget allocated for work-related travel and co-working.

Senior ML Engineer
