Remotery

Senior Solutions Architect – InfiniBand, Networking, Ethernet

Posted 6 days ago

This is a fully remote position, open to applicants in India.

📋 Description

• Build AI/HPC infrastructure for both new and existing clients.

• Assist in the operational and reliability aspects of large-scale AI clusters, emphasizing performance at scale, real-time monitoring, logging, and alerting.

• Participate in and enhance the entire service lifecycle—from concept and design through to deployment, operation, and continuous improvement.

• Manage services post-launch by tracking and assessing availability, latency, and overall system health.

• Offer insights to internal teams, such as reporting bugs, documenting workarounds, and recommending enhancements.


⛳️ Requirements

• BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related disciplines.

• At least 5 years of professional experience in networking fundamentals, including Ethernet or InfiniBand.

• Practical experience with network switch/router platforms such as Cumulus Linux, SONiC, IOS, Junos OS, and EOS.

• Solid understanding of core principles related to Ethernet, InfiniBand, and RDMA.

• Proficient in end-to-end IB/Eth cluster deployment, adapter configuration, and firmware maintenance, with the ability to perform professional performance benchmarking using mainstream RDMA testing tools.

• Capable of independently diagnosing and resolving common IB/Eth network issues, including link flapping, connection failures, and bandwidth or latency jitter.

• Mastery of practical RDMA network optimization techniques such as QP tuning, MTU configuration, and congestion control optimization.

• Hands-on experience in RDMA-accelerated business scenarios, including distributed storage and high-performance computing clusters.

• Extensive experience in delivering automated network provisioning solutions using tools like Ansible, Salt, and Python.

• Capability to develop CI/CD pipelines for network operations.

• Strong written, verbal, and listening skills in English are essential.


🏝️ Benefits

• NVIDIA has been at the forefront of accelerated computing.

• Our AI infrastructure drives global intelligence, revolutionizing every industry.

