Remotery

Senior Software Engineer, NCCL, CUDA

Posted Jun 20

This is a fully remote position, open to applicants in California, +2 more states.

📋 Description

• Interact with our Cloud Service Providers (CSPs) to identify and resolve functional and performance challenges in NCCL and CUDA libraries.

• Assess and enhance the performance of multi-GPU workloads through techniques such as profiling, benchmarking, and tuning.

• Diagnose and address NCCL and NVSHMEM data transfer issues in multi-node clusters.

• Tackle CUDA porting challenges for customer workloads.

• Implement datacenter-specific scheduling and topologies to achieve optimal performance.

• Troubleshoot and resolve intricate issues related to GPU computation, memory, and data transport.

• Work with customers to understand the workload integration challenges they face with the NCCL and CUDA libraries, and provide tailored solutions that fit the NVIDIA ecosystem.

• Partner with Application Engineers (AE), Field Application Engineers (FAE), and solution architects to deliver cohesive customer solutions and create technical documentation.

• Collaborate with internal teams to assist customers in leveraging the latest innovations in CUDA and NCCL.


⛳️ Requirements

• Over 8 years of experience in system software validation.

• Strong C/C++ programming and debugging capabilities, with a background in CUDA development.

• In-depth knowledge of operating systems and datacenter system architecture.

• Proficient in performance optimization and profiling tools (e.g., Nsight, nvprof).

• Solid understanding of PCIe and NVLink technologies.

• Familiarity with high-performance networking technologies such as InfiniBand and RoCE.

• Comprehensive grasp of computing, networking, and cloud deployment, particularly on bare-metal servers and virtual machines.

• Experience with containers, cloud provisioning, and scheduling tools like Docker, Kubernetes, SLURM, and Ansible.

• Bachelor’s or Master’s degree in Computer Engineering, Computer Science, or a related discipline (or equivalent experience).

• Strong communication skills and the ability to collaborate effectively with partner and customer teams.


🏝️ Benefits

• Equity opportunities

• Comprehensive benefits package
