
System Engineer – Compute Node
Posted May 23

This is a fully remote position, open to applicants in the Netherlands.
• The Compute Node team is responsible for creating services that manage Virtual Machines operating on GPU servers, ensuring integration with disk management and both virtual and InfiniBand networks.
• Additionally, the team develops the Virtual Machine Scheduler, which functions across clusters containing thousands of servers and tens of thousands of GPUs, distributed across multiple data centers in various regions.
• Proficiency in Kubernetes is highly valued, not just for deployment but also for development.
• Familiarity with KubeVirt and QEMU/KVM is considered a significant advantage.
• A solid understanding of internal OS architecture, performance factors, process isolation, and resource management is essential.
• Knowledge of POSIX, sysfs, system calls, and file systems is required.
• Familiarity with server architecture, PCIe devices, NICs, and kernel drivers is important.
• Experience or a strong interest in GPUs, DPUs, or ARM architectures is desirable.
• Familiarity with the NVIDIA DOCA Software Framework will be beneficial.
• Proficiency in at least one of Go or C++ is expected, along with a willingness to learn or work with the other as needed.
• A strong understanding of concurrency, debugging, and profiling techniques is necessary.
• Competitive salary and a comprehensive benefits package.
• Opportunities for professional advancement within Nebius.
• Flexible working arrangements.
• A dynamic and collaborative work environment that encourages initiative and innovation.