
Senior Deep Learning Tools Engineer – CUDA Tile
Posted 14 hours ago

This is a fully remote position, open to applicants in California and 2 more states.
• Design and develop performance testing frameworks tailored for deep learning compilers and workloads.
• Build and maintain automated CI/CD pipelines that continuously monitor performance across models, hardware, and compiler changes.
• Implement benchmarking systems to evaluate latency, throughput, and efficiency of AI and HPC workloads.
• Examine performance trends over time to identify regressions, bottlenecks, and opportunities for optimization.
• Collaborate with compiler and architecture teams to troubleshoot and resolve performance challenges.
• Create tools and dashboards for performance visualization, reporting, and generating insights.
• Facilitate scalable testing across a variety of GPU systems and environments.
• Enhance infrastructure to guarantee reliable, reproducible, and high-quality performance data.
• Bachelor’s, Master’s, or PhD degree (or equivalent experience) in Computer Science, Computer Engineering, Electrical Engineering, Mathematics, or a related discipline.
• Over 5 years of software engineering experience, with a strong focus on performance engineering, benchmarking, or systems optimization.
• Strong programming skills in Python (C++ knowledge is an advantage).
• Experience with CI/CD systems and automation frameworks.
• Familiarity with hardware-aware performance analysis, including GPUs, accelerators, or similar systems.
• Experience with deep learning frameworks such as PyTorch, TensorFlow, JAX, or TensorRT.
• Background in data analysis, profiling, and regression tracking.
• Ability to debug complex system-level problems across both software and hardware layers.
• Competitive salaries.
• Comprehensive benefits package.
• Equity options.