
Solutions Architect – AI/ML, Training, GPU Infra
Posted May 19

This is a fully remote position, open to applicants in the Netherlands.
• Become part of a dynamic AI infrastructure team focused on large-scale machine learning workloads.
• Design and validate production-grade distributed training and large-scale inference architectures on large GPU clusters.
• Troubleshoot, enhance, and expand machine learning workloads across multi-node GPU setups.
• Serve as a technical expert on GPU performance and networking.
• Work collaboratively with engineering, product, and research & development teams.
• Practical experience in designing and managing enterprise-level, production-quality, multi-node GPU workloads for training (model sizes of 7B and above) or inference.
• Strong expertise in distributed deep learning frameworks (such as PyTorch Distributed, DeepSpeed, etc.) on GPU clusters.
• Comprehensive knowledge of GPU architecture and interconnect technologies (H100/A100 class, NVLink, InfiniBand).
• Familiarity with Kubernetes or Slurm.
• Experience optimizing performance with GPU profiling and monitoring tools.
• Health insurance.
• Flexible work arrangements.
• Opportunities for professional development.
Intetics
Remote