
Senior Systems HPC Engineer
Posted May 20

This is a fully remote position, open to applicants in the Netherlands.
• Understand how systems behave across layers, identify performance bottlenecks, and drive improvements that shape how we build, operate, tune, and validate our clusters.
• Analyze and resolve performance challenges of GPU clusters under actual workloads (both training and inference).
• Assess and incorporate new hardware, system configurations, and tuning strategies via the software stack.
• Support complex performance-related escalations from both internal teams and customers.
• Collaborate closely with infrastructure, software engineering, and hardware vendor teams (such as NVIDIA, Mellanox, Intel).
• Play a role in hardware and cluster qualification (acceptance), ensuring systems align with performance standards.
• Over 5 years of professional experience in system-level software development, with an emphasis on performance optimization and low-level programming.
• More than 3 years of hands-on experience with Linux systems, encompassing administration, troubleshooting, and performance tuning.
• Comprehensive understanding of server architecture, including PCIe devices, NICs, the Linux OS/Kernel, and high-performance computing (HPC) systems.
• Strong expertise in one or more performance-driven programming languages (C/C++, Go, Python).
• Competitive salary and a thorough benefits package.
• Opportunities for career advancement within Nebius.
• Flexible working arrangements.
• A vibrant and collaborative work environment that encourages initiative and innovation.