Remotery

Senior Solutions Architect, Cloud Infrastructure, DevOps

Posted Jun 20

This is a fully remote position, open to applicants in Japan.

📋 Description

• Oversee large-scale HPC/AI clusters with effective monitoring, logging, and alerting systems.

• Administer Linux job/workload schedulers and orchestration tools.

• Design and maintain continuous integration and delivery pipelines.

• Create tools to streamline the deployment and management of extensive infrastructure environments, automate operational monitoring and alerting, and facilitate self-service resource consumption.

• Implement monitoring solutions for servers, networks, and storage systems.

• Conduct troubleshooting from the ground up, addressing bare metal, operating system, software stack, and application levels.

• As a technical expert, develop, refine, and document standardized methodologies to share with internal teams.

• Assist in Research & Development efforts and participate in POCs/POVs for future enhancements.


⛳️ Requirements

• BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related disciplines.

• A minimum of 8 years of professional experience with networking principles, the TCP/IP stack, and data center architecture.

• Familiarity with HPC and AI solution technologies, including CPUs, GPUs, high-speed interconnects, and related software.

• Comprehensive knowledge and practical experience with Kubernetes, focusing on container orchestration for AI/ML workloads, resource scheduling, scaling, and integration with HPC environments.

• Experience in managing and setting up HPC clusters, covering aspects of deployment, optimization, and troubleshooting.

• Proficient in job/workload scheduling and orchestration technologies such as Slurm, Kubernetes, and Singularity.

• Strong understanding of Windows and Linux systems (Red Hat/CentOS and Ubuntu), including internals, ACLs, OS-level security measures, and common protocols such as TCP, DHCP, and DNS.

• Experience with various storage solutions, including Lustre, GPFS, ZFS, and XFS.

• Familiarity with new and emerging storage technologies is advantageous.

• Proficient in Python programming and Bash scripting.

• Understanding of CI/CD pipelines for software deployment and automation.

• Comfortable using automation and configuration management tools such as Jenkins, Ansible, Puppet, and Chef.

• Ability to convey technical concepts and work collaboratively with Japanese-speaking clients.


🏝️ Benefits

• Opportunities for professional development.

• Flexible work arrangements.
