Remotery

Senior Linux Kernel Engineer – High-Performance Computing

Posted May 25

This is a fully remote position, open to applicants in the Netherlands.

📋 Description

• Optimizing the performance of clusters and InfiniBand networks to guarantee peak functionality in HPC and GPU-centric environments.

• Investigating and diagnosing the underlying causes of issues pertaining to GPUs and InfiniBand networks, and recommending corrective measures.

• Incorporating new hardware into the current infrastructure, including enabling support for new GPU hardware via software stacks such as Kubernetes, QEMU, and KVM.

• Advancing automation systems that proactively monitor, identify, and resolve issues in GPU and InfiniBand environments.

• Setting up and overseeing GPU devices and InfiniBand fabrics to ensure effective and dependable operation.


⛳️ Requirements

• Over 5 years of professional experience in system-level software development, emphasizing performance optimization and low-level programming.

• More than 3 years of practical experience with Linux systems, including administration, troubleshooting, and/or performance tuning.

• Proficiency with essential tools for kernel profiling and tuning, including perf, ftrace, and (e)BPF.

• Comprehensive knowledge of server architecture, encompassing PCIe devices, NICs, Linux OS/Kernel, etc.

• Strong command of one or more performance-focused programming languages such as C/C++, Go, or Python.

It would be advantageous (though not essential) if you have:

• Experience in GPU end-to-end testing within a cluster setup utilizing InfiniBand networking.

• A proven history of analyzing and enhancing the performance of HPC workloads, including simulations, data analysis, and AI/ML tasks.

• Familiarity with RDMA, RoCE, and InfiniBand protocols for high-performance communication.

• Background in Software-Defined Networking (SDN) along with experience in HPC cluster networking.

• Understanding of QEMU/KVM virtualization and management of virtualized environments.

• Experience with deep learning frameworks like PyTorch and TensorFlow, and their integration into HPC systems.

• Knowledge of collective communication libraries such as MPI and NCCL for distributed computing.


🏝️ Benefits

• Flexible working arrangements

• A dynamic and collaborative work environment that encourages initiative and innovation.
