
Senior Network Solution Architect – AI Fabrics
Posted Jun 19

This is a fully remote position, open to applicants in California, +3 more states.
• Collaborate with AI-native and consumer internet clients on extensive GPU and networking deployments in data centers.
• Provide guidance on architectural choices involving network, compute, and storage, including fabric design.
• Assist with the on-site bring-up of server, network, and cluster infrastructure in customer data centers.
• Showcase proficiency in advanced GPU and network systems (Spectrum-X, BlueField DPU, InfiniBand/RoCE, etc.) for key accounts.
• Conduct regular technical account reviews that cover roadmap alignment, cluster challenges, feature discussions, and the introduction of new technologies.
• Gather customer-specific needs and convert them into actionable feedback for product, architecture, and engineering teams.
• Analyze and troubleshoot configuration and performance challenges in RoCE and InfiniBand environments.
• Collaborate across NICs, switches, Linux, and system software to deliver efficient and dependable AI clusters.
• Identify and cultivate new project opportunities for NVIDIA GPUs, networking, and software in AI and data center applications.
• Work closely with Systems Engineering, Product Management, and Sales to ensure solutions align with customer outcomes.
• Develop targeted POCs that demonstrate the value of NVIDIA’s networking stack (e.g., Spectrum-X fabrics, BlueField DPUs) in real customer environments.
• BS/MS/PhD in Electrical/Computer Engineering, Computer Science, or other Engineering fields, or equivalent experience.
• Over 6 years of practical network engineering experience in data center or cloud settings.
• Proven expertise in troubleshooting data center networks (packet-level, control plane, and fabric behavior).
• In-depth knowledge of protocols such as BGP, OSPF, and L2/L3 switching in large-scale data center or cloud networks (ECMP, Clos/leaf–spine).
• Experience with high-density switching in cloud or hyperscale environments is highly preferred.
• Familiarity with InfiniBand or RoCE is a significant advantage.
• Strong grasp of CPU/GPU server architecture, NICs, Linux, system software, and kernel drivers.
• Excellent time management skills and the ability to switch contexts across multiple customers and projects.
• Outstanding written and verbal communication skills, including the ability to produce clear design documentation, customer presentations, and root-cause analyses.
• Advanced certifications such as CCIE, JNCIE, or equivalent expert-level credentials.
• Experience in automation and tooling, including Python, Bash, or C/C++ for automating network workflows, validation, and debugging.
• Hands-on experience with NVIDIA platforms, including GPUs, NICs, DPUs, or ARM-based CPU platforms.
• Customer-facing experience in pre-sales, post-sales, field engineering, or consulting roles with external enterprise or cloud customers.
• Direct experience in large-scale deployments, including setting up and managing extensive clusters or supercomputing environments.
• Familiarity with virtualization, containers, and cloud networking concepts.
• Equity
• Benefits