
Senior AI Engineer
Posted 2 days ago

This is a fully remote position, open to applicants in the United States.
• Design, develop, and manage enterprise AI systems within our client portfolio.
• Oversee the complete AI stack, from inference engines and platform infrastructure to application-level engineering.
• Lead the comprehensive design, development, and management of AI systems on AI Factory platforms across various client initiatives.
• Engineer and optimize LLM inference serving stacks, primarily focusing on vLLM while also covering the broader inference ecosystem to meet client latency, throughput, and cost objectives.
• Enhance inference performance through KV cache management, paged attention, batching strategies, and Dynamo-based disaggregated serving.
• Design and manage MLOps pipelines encompassing model lifecycle, registries, deployment, rollback, and observability.
• Create and engineer RAG applications utilizing vector databases.
• Develop and optimize prompt-engineering patterns at production scale.
• Engineer high-performance storage and networking solutions for AI workloads.
• Manage Kubernetes clusters that support AI workloads.
• Build and maintain container images, registries, and CI/CD pipelines for AI/ML services.
• Implement monitoring, alerting, logging, and capacity planning across the AI stack.
• Secure environments to comply with client security and regulatory requirements.
• Lead troubleshooting efforts across diverse environments and technologies.
• Directly engage with client stakeholders—both technical and executive—to communicate updates, root causes, options, and recommendations.
• Mentor junior engineers and review their code, elevating the technical standards of every engagement you participate in.
• Author runbooks, reference architectures, and knowledge base content; lead client knowledge transfer and enablement sessions.
• Participate in on-call rotations and incident response for production AI workloads.
• Contribute reusable patterns, tools, and reference designs back to the practice.
• Over 7 years of experience in software, data, or infrastructure engineering, with at least 3 years specifically focused on modern AI / LLM systems.
• Proficient in production-quality Python at an engineering level, including testing, code review, version control, and delivering code that others rely on.
• Extensive production experience with Linux, including system internals, performance tuning, and troubleshooting.
• Advanced expertise in Docker—covering image building, registry management, runtime tuning, and container security.
• Strong server-platform knowledge including CPU/GPU architectures, PCIe, BMC management, BIOS/firmware lifecycle, and physical-to-logical troubleshooting.
• Hands-on experience with deploying and managing one or more of HPE PCAI, Dell AI Factory, or Nutanix Enterprise AI.
• Practical experience in deploying, tuning, and operating vLLM.
• Familiarity with multiple inference and model-serving frameworks beyond vLLM, with the capability to select and optimize the appropriate tool for each workload.
• Hands-on experience with high-throughput, low-latency storage and network infrastructures for AI workloads—including RDMA-class interconnects, parallel/object storage tiers, KV cache management, and Dynamo-style disaggregated serving.
• Practical experience in operating MLOps tools and methodologies—model registries, deployment pipelines, GitOps, lineage, and rollback.
• Experience in deploying, tuning, and integrating vector databases and RAG pipelines, alongside the application-level engineering that supports them.
• Proven experience designing system prompts, structured outputs, function calling, and tool-using LLM patterns.
• Demonstrated ability to create LLM evaluation frameworks—golden sets, regression suites, and quality/cost metrics.
• Proven ability to engage directly with client stakeholders—facilitating working sessions, presenting recommendations, and translating technical details for non-technical audiences.
• Excellent written and verbal communication skills—creating clear reference architectures, runbooks, and incident reports.
• A history of mentoring junior engineers and enhancing team technical quality through code reviews and collaborative work.
• Knowledge of TCP/IP, DNS, load balancing, VLANs, and firewall management.
• Comfortable working across multiple concurrent client environments and managing competing priorities under SLA.
• Competitive salary and performance-based bonuses.
• Comprehensive health, dental, and vision insurance.
• Generous paid time off and flexible working arrangements.
• Opportunities for professional development and continuous learning.
• Collaborative and innovative work environment.