
AI Platform Engineer – OneAI
Posted May 22

This is a fully remote position, open to applicants in Spain.
• Design, implement, and deploy sophisticated AI functionalities within the OneAI platform.
• Enhance the end-user experience by creating intuitive workflows for model management, deployment configuration, and job operations.
• Optimize the model lifecycle by integrating public repositories (e.g., Hugging Face) for smooth discovery, import, versioning, and deployment.
• Connect systems engineering with product design to facilitate a seamless transition from backend infrastructure to user features.
• Incorporate state-of-the-art AI frameworks and engines, such as vLLM, NVIDIA Dynamo, and Unsloth, into a secure and scalable environment.
• Utilize OpenNebula to orchestrate high-performance inference and training workloads across various cloud and edge environments.
• Develop and maintain dependable APIs for compute provisioning and workload scheduling.
• Implement GPU-aware operations to ensure optimal resource allocation and hardware utilization.
• Build comprehensive observability suites that track essential metrics, including latency, throughput, utilization, and failure rates.
• Establish and enhance deployment and workflow strategies to ensure AI workloads remain efficient and stable at scale.
• Optimize system architecture to achieve a balance between high performance and cost efficiency.
• Research and integrate cutting-edge AI tools and engines to keep the OneAI platform at the forefront of the industry.
• Analyze performance bottlenecks to improve the efficiency of both training and inference processes.
• Bachelor’s or Master’s degree in Computer Science, Information Technology, or Engineering.
• Over 3 years of experience in applied AI, machine learning, or software engineering, with practical delivery of AI/ML solutions in production settings.
• Proven experience in designing and deploying high-performance AI infrastructure, focusing on the scalability and reliability of inference and training workloads.
• Demonstrated success in deploying Large Language Models (LLMs) at scale, with extensive knowledge of serving engines (e.g., vLLM) and fine-tuning tools (e.g., Unsloth).
• Experience in building AI-centric platforms or toolchains that manage the model lifecycle (versioning, deployment, and discovery).
• Familiarity with GPU orchestration and optimizing workloads for cloud, distributed, or large-scale environments, as well as collaborating with platform or infrastructure teams.
• Hands-on experience with high-throughput inference engines (e.g., vLLM) and fine-tuning tools (e.g., Unsloth).
• Proficient in integrating with the Hugging Face ecosystem (Transformers, Hub, Datasets) for model and data management.
• Experience in implementing monitoring tools to track system-level AI metrics such as token throughput, latency, GPU utilization, and failure rates.
• Experience in designing and implementing scalable, reliable APIs for compute provisioning and workload scheduling.
• Experience working with cloud platforms and containerized environments (e.g., OpenNebula, Kubernetes).
• Advanced English level (B2 or higher) is required.
• Competitive compensation package with flexible remuneration options: meals, transport, and nursery/childcare.
• Customized workstation options (macOS, Windows, Linux).
• Private health insurance coverage.
• Paid time off, including holidays, personal time, sick time, and parental leave.
• Shortened workday every Friday and during the summer months.
• Remote work environment with a vibrant headquarters located in Madrid; offices in Boston (USA), Brussels (Belgium), and Brno (Czech Republic); access to office space near your location as needed.
• Healthy work-life balance: we promote digital disconnection and encourage balance between employees' personal and professional lives.
• Flexible hiring options available: Full Time/Part Time, Employee (Spain/USA) / Contractor (other locations).