Remotery

Senior GPU Infrastructure Engineer

Posted 11 hours ago

This is a fully remote position, open to applicants in California.

📋 Description

• Assist in the development and expansion of Hyperbolic's GPU Cloud Marketplace.

• Create a solution for multi-tenancy provisioning and virtualization.

• Convert raw GPUs sourced from various global suppliers into a programmable and orchestrated resource pool.

• Provide services to thousands of AI developers and researchers.

• Work at the forefront of cloud infrastructure technology.

• Develop the essential orchestration layer that allows the platform to achieve up to 75% cost savings compared to conventional cloud service providers.


⛳️ Requirements

• Comprehensive knowledge of bare-metal provisioning and lifecycle management, encompassing IPMI/Redfish, BMC-based remote management, PXE boot, and automated operating system deployment workflows.

• In-depth understanding of GPU scheduling and orchestration, including awareness of GPU types, memory management, topology considerations, placement strategies for multi-GPU tasks, and minimizing fragmentation.

• Strong skills in infrastructure and DevOps engineering with expertise in Terraform or Pulumi, CI/CD for infrastructure, secrets management, configuration management, and implementation of observability stacks.

• Experience with storage and data infrastructure tailored for AI/ML workloads, such as object storage, high-IOPS block storage, and distributed file systems for training data and checkpoints.

• Proficiency in API design and cloud-init for automated provisioning and configuration tasks.

• Solid grasp of GPU architecture, CUDA, and GPU compute optimization techniques.

• A highly collaborative team player with outstanding communication skills that bridge technical and non-technical stakeholders.

• Proven ability to work with hardware vendors and their engineering teams to troubleshoot problems and improve integrations.

• Experience in building and scaling cloud infrastructure or distributed systems within production settings.


🏝️ Benefits

• Competitive salary and performance-based bonuses.

• Opportunities for professional development and career advancement.

• Flexible work hours and remote work options.

• Health, dental, and vision insurance plans.

• A supportive and inclusive work environment.

