Remotery

Senior Storage Production Engineer – DGX Cloud

NVIDIA · California, US · Full-time · Uncategorized · Senior · $176k – $333.5k/year

Posted 1 hour ago

This is a fully remote position, open to applicants in California.

📋 Description

• Design, implement, and provide support for large-scale storage clusters, ensuring they are scalable, highly available, and maintain data integrity.

• Create and uphold storage monitoring, logging, and alerting systems to facilitate proactive detection and resolution of performance issues.

• Collaborate with teams running AI/ML workloads to enhance storage architectures for low-latency access, efficient caching, and high throughput.

• Enhance the lifecycle of storage services – from conception and design to deployment, operation, and ongoing optimization.

• Assist in the preparation of storage services prior to their availability by engaging in system build consulting, developing automation frameworks, managing capacity, and conducting launch reviews.

• Oversee the production storage infrastructure by monitoring availability, latency, and system health, utilizing predictive analytics and AI-driven automation.

• Maximize storage efficiency through compression, deduplication, tiering strategies, and intelligent workload allocation.

• Sustainably scale storage systems using AI/ML-driven automation, policy-based tiering, and dynamic data migration techniques.

• Guarantee data security and compliance by implementing encryption, access controls, and auditing mechanisms for storage systems.

• Engage in sustainable incident response and blameless root cause analysis.

• Participate in an on-call rotation to support storage and production systems.


⛳️ Requirements

• Bachelor's degree or equivalent experience in Computer Science, Storage Systems, or a related technical field, with a minimum of 8 years of practical experience.

• Proficient in distributed and high-performance storage solutions, including clustered and parallel file systems, distributed object storage, and enterprise-grade storage systems.

• Strong understanding of block, file, and object storage technologies, including their scalability, reliability, performance characteristics, and standard processes.

• Familiarity with storage networking protocols such as NFS, SMB, iSCSI, S3, Fibre Channel, RDMA, and NVMe over Fabrics.

• Expertise in algorithms, data structures, complexity analysis, software design, and automating the maintenance of large-scale Linux-based storage systems.

• Experience in one or more programming languages such as C/C++, Java, Python, Go, Node.js, or Bash for storage automation, monitoring, and performance tuning.

• Practical experience with infrastructure configuration management tools like Ansible, Chef, Puppet, and Terraform for automating storage deployments.

• Proficient in using observability and tracing tools like InfluxDB, Prometheus, Grafana, and the Elastic stack for monitoring the health of storage systems.


🏝️ Benefits

• Equity

• Benefits
