AI Engineer

at In Tandem | US, Minnesota | Full-time | AI Engineer | Mid-level, Senior | $100k – $135k/year

Posted 5 days ago

This is a fully remote position, open to applicants in Minnesota.

📋 Description

• Manage and enhance our self-hosted inference infrastructure.

• Operate the inference serving layer on our dedicated GPU hardware: select and tune the serving stack (vLLM, SGLang, TensorRT-LLM) to achieve high throughput and low latency.

• Optimize rigorously: implement tensor parallelism, quantization (FP8, AWQ, GPTQ), KV-cache and prefix caching, continuous batching, and concurrency tuning.

• Serve various models and features from shared hardware: multi-LoRA, routing, and request scheduling to balance internal workloads with latency-sensitive product traffic.

• Maintain the speed, efficiency, and observability of our AI systems.

• Enhance the efficiency of our AI workloads: reduce latency, increase throughput, and raise GPU utilization.

• Establish visibility: instrument performance and usage metrics across our AI platforms to provide clear insights into operational performance.

• Highlight technical trade-offs (performance, latency, efficiency) so decision-makers have the information they need.

• Develop AI features and proactive agents.

• Deliver the in-app agent layer designed to assist families with coordination: offering proactive nudges, smart suggestions, and agents that summarize, draft, schedule, and act on behalf of busy parents.

• Create the underlying infrastructure: tools, memory management, orchestration, guardrails, and evaluation harnesses, seamlessly integrated with production APIs in collaboration with our architecture team.

• Collaborate closely with feature owners, quickly building whatever is necessary to test ideas, including a vibe-coded UI when it provides the fastest route to customer feedback. Embrace rapid iteration: ship rough drafts, learn swiftly, and refine what proves effective.


⛳️ Requirements

• 5+ years of experience delivering production software, with significant applied AI or ML expertise.

• Proven experience in running and optimizing self-hosted LLMs on dedicated multi-GPU hardware: familiarity with a serving stack (vLLM, SGLang, or TensorRT-LLM) and associated optimization techniques (tensor parallelism, quantization, batching, KV cache).

• A solid history of enhancing inference performance and efficiency (latency, throughput, GPU utilization).

• Strong proficiency in Python and engineering fundamentals, the ability to build a UI quickly, and a genuine interest in app-layer features, not just infrastructure.

• Practical experience with agent frameworks (Claude Agent SDK, LangGraph, or equivalents), LLM APIs, embeddings, and RAG.

• Familiarity with AWS and the associated DevOps responsibilities of this role: Docker, CI/CD, monitoring, and observability.

• Experience in building internal tools or platforms relied upon by others, with a bonus for experience in Slack apps, MCP, or agent orchestration at team scale.


🏝️ Benefits

• Medical: In Tandem covers 100% of the premium for employees and 99% for additional family members.

• 401k: Offers up to a 4% match with immediate vesting.

• Paid leave for all new parents.

• Learning & Development stipend available for employees.

• Paid Time Off: 11 Holidays + Winter Break (3 Days) + Volunteer Time Off (1 Day) + Floating Holiday (1 Day).

• Personal Time Off: 15 days for employees with 0-1 years of service, increasing to 20 days for those with 1-3 years of service.

• Supportive and flexible work environment – work from anywhere!
