
Machine Learning Applications and Compiler Engineer, New College Grad
Posted 1 day ago

This is a fully remote position, open to applicants in Canada.
• Design, develop, and sustain high-performance runtime and compiler components, focusing on optimizing end-to-end inference.
• Define and implement mappings of large-scale inference workloads onto NVIDIA’s systems.
• Extend and integrate with NVIDIA’s software ecosystem, contributing to libraries, tools, and interfaces that enable seamless model deployment across platforms.
• Benchmark, profile, and monitor essential performance and efficiency metrics to ensure the compiler produces effective mappings of neural network graphs to our inference hardware.
• Work closely with hardware architects and design teams to provide software insights, influence future architectures, and co-design features that unlock new performance and efficiency levels.
• Prototype and assess novel compilation and runtime techniques, including graph transformations, scheduling methods, and memory/layout optimizations tailored for spatial processors.
• Publish and present technical findings on innovative compilation techniques for inference and related spatial accelerators at leading ML, compiler, and computer architecture conferences.
• Currently pursuing or recently earned an MS or PhD in Computer Science, Electrical/Computer Engineering, or a related field, or equivalent experience.
• Strong background in software engineering, with familiarity with systems-level programming (e.g., C/C++ and/or Rust) and solid computer science fundamentals in data structures, algorithms, and concurrency.
• Practical experience in compiler or runtime development, including IR design, optimization passes, or code generation.
• Experience working with LLVM and/or MLIR, including the creation of custom passes, dialects, or integrations.
• Knowledge of deep learning frameworks such as TensorFlow and PyTorch, along with experience utilizing portable graph formats like ONNX.
• Understanding of parallel and heterogeneous computing architectures, such as GPUs, spatial accelerators, or other domain-specific processors.
• Strong analytical and debugging skills, with experience using profiling, tracing, and benchmarking tools to improve performance.
• Excellent communication and collaboration abilities, with a proven capacity to work across hardware, systems, and software teams.
• Eligible for equity
• Health insurance
• Stock options
• Paid time off
• Professional development opportunities