
Senior Performance Compiler Engineer – Triton
Posted May 10

This is a fully remote position, open to applicants in California, +2 more states.
• Conducting research on the latest and upcoming NVIDIA GPU hardware architectures and programming methodologies.
• Pioneering advancements in AI by analyzing sophisticated algorithms (such as attention mechanisms and Mixture of Experts) and numerical methods (including block-scaled floating point) to uncover new optimization opportunities.
• Designing and implementing compiler technology using MLIR to optimize high-level kernel descriptions (written in Triton's Python DSL), with a focus on generating efficient low-level GPU code.
• Participating in an iterative optimization process, sometimes starting from the kernel and sometimes from the compiler, to find the most effective route to peak performance.
• Collaborating with various teams across NVIDIA, including hardware architects and the CUDA compiler team, to shape future products and ensure we consistently operate at optimal efficiency.
• Bachelor’s, Master’s, or Ph.D. degree, or equivalent experience in Computer Science, Computer Engineering, Applied Mathematics, or a related discipline.
• Over 8 years of pertinent industry experience in software development.
• Proven expertise in C++ programming and software design, with a focus on performance evaluation and troubleshooting.
• Proficient in parallel programming, including CUDA/OpenCL GPU programming or other parallel frameworks such as OpenMP.
• Comprehensive understanding of computer architecture and practical experience with assembly-level programming.
• Equity
• Health insurance