Remotery

Senior HPC Cluster Engineer

Posted May 25

This is a fully remote position, open to applicants in Czechia.

📋 Description

• Optimizing the performance of GPU clusters and InfiniBand networks to guarantee peak efficiency in HPC and GPU-centric settings.

• Investigating and diagnosing the root causes of issues associated with GPUs and InfiniBand networks, while recommending corrective measures.

• Integrating new hardware into the existing infrastructure, including support for new GPU hardware through software such as Kubernetes, QEMU, and KVM.

• Improving automation systems that proactively monitor, identify, and resolve issues within GPU and InfiniBand environments.

• Configuring and overseeing GPU devices and InfiniBand fabrics to ensure efficient and dependable performance.


⛳️ Requirements

• Over 5 years of professional experience in system-level software development with a focus on performance optimization and low-level programming.

• More than 3 years of practical experience with Linux systems, including administration, troubleshooting, and performance enhancement.

• Comprehensive understanding of server architecture, particularly PCIe devices, NICs, the Linux OS/Kernel, and high-performance computing (HPC) systems.

• Strong expertise in one or more performance-centric programming languages such as C/C++, Go, or Python.


🏝️ Benefits

• Competitive salary along with a comprehensive benefits package.

• Opportunities for professional advancement within Nebius.

• Flexible working arrangements available.

• A vibrant and collaborative work atmosphere that encourages initiative and innovation.
