This is a fully remote position, open to applicants in Belgium.

📋 Description

• Design and enhance our multi-agent orchestration platform, currently based on Hermes / Multica.

• Create and implement voice AI pipelines, including STT (VibeVoice-ASR, Whisper), real-time TTS with streaming (VibeVoice-Realtime), VAD (Silero), and SIP/RTP telephony integration, targeting sub-300 ms end-to-end latency.

• Develop and sustain RAG pipelines that include retrieval quality measurement, re-ranking, and hybrid search utilizing both vector and keyword indexes.

• Establish MCP server architecture and define tool-use contracts for both internal and third-party integrations.

• Fine-tune and evaluate LLMs (LoRA, QLoRA, DPO) for domain-specific tasks such as customer support, classification, and structured extraction.

• Manage the AI observability stack, incorporating Langfuse tracing, span-level LLM call instrumentation, cost tracking, and quality regression alerts.

• Set and enforce safeguards including hallucination detection, PII redaction, output safety scanning, and rate-limiting across multi-tenant deployments.

• Develop data ingestion, preprocessing, and feature pipelines that support model training and continuous learning.

• Lead CI/CD efforts for ML, including automated evaluation gating, shadow deployments, canary releases, and rollback triggers.

• Establish architectural standards for AI systems across the organization; conduct design reviews and own architectural decision records for significant choices.

• Mentor ML engineers and applied scientists, building the team's ability to ship production AI rather than only prototypes.

• Collaborate with Product and Commercial teams to convert business challenges into ML problem formulations with well-defined success metrics.

• Engage with external research collaborators and monitor emerging developments (arXiv, conference proceedings, open-source releases) to identify those worth productionizing.

⛳️ Requirements

• Over 8 years of experience in ML Engineering, Applied AI, or Research Engineering, with a minimum of 2 years in a lead or staff-level position.

• Extensive, hands-on experience with LLMs in production environments, including fine-tuning, RLHF/DPO, prompt engineering, RAG, and tool use.

• Proficient in Python and the essential ML stack: PyTorch, Transformers (HuggingFace), PEFT/LoRA.

• Practical experience with LLM inference serving technologies such as vLLM, TensorRT-LLM, or TGI in latency-sensitive production settings.

• Strong knowledge of agentic frameworks, including multi-agent coordination, tool-call orchestration, context/memory management, and observability (Langfuse, Opik, or similar).

• Experience in speech AI (ASR/TTS pipelines) or real-time audio systems is highly desirable.

• Solid grasp of MLOps concepts, including experiment tracking (MLflow/W&B), model registries, containerization (Docker/Kubernetes), and CI/CD practices for ML.

• Awareness of specific risks associated with LLMs, such as hallucination, prompt injection, data leakage, fairness, and privacy, along with strategies for mitigation in production.

• Excellent communication skills: able to draft concise design documents, run effective architecture reviews, and explain trade-offs to non-technical stakeholders.

🏝️ Benefits

• Health insurance

• 401(k) matching

• Flexible working hours

• Paid time off

• Remote work options

Principal AI Solutions Engineer
