
AI Research Engineer – Kernel & Inference Optimization
Posted May 20

This is a fully remote position, open to applicants in Italy.
• Foster innovation in model serving and inference architectures
• Enhance model deployment and inference methodologies
• Develop resource-efficient models suitable for constrained hardware
• Create resilient inference pipelines
• Set up comprehensive performance evaluation metrics
• Detect and address bottlenecks within production environments
• A degree in Computer Science or a related discipline
• Preferably a PhD in NLP, Machine Learning, or a similar area
• Familiarity with Metal Shading Language (MSL)
• Demonstrated experience in low-level kernel optimizations
• Strong proficiency in writing GPU kernels for mobile platforms
• Hands-on experience in creating and deploying end-to-end inference pipelines
• In-depth knowledge of contemporary model serving architectures
• Experience with Distributed Inference Systems and techniques such as Tensor Parallelism
• Comprehension of advanced optimization techniques
• Work remotely from any location globally
• Opportunity for collaboration with international teams
• Engage in cutting-edge projects within the fintech sector
Tether.to