
Systems Engineer, HPC
Posted May 22

Posted May 22
This is a fully remote position, open to applicants in France.
• Manage and sustain extensive Linux environments (bare metal, clusters, cloud).
• Oversee system health, resolve incidents, and ensure continuous availability.
• Provide support for production and research workloads across diverse environments.
• Assist in scaling clusters to accommodate hundreds to thousands of nodes.
• Work with systems that handle petabyte-scale storage solutions.
• Enhance performance, reliability, and resource utilization.
• Automate operational tasks using tools such as Python, Bash, Ansible, or Terraform.
• Optimize deployment, provisioning, and system lifecycle management processes.
• Participate in system design and architectural decisions.
• Act as a liaison between users and the infrastructure.
• Solid experience in Linux systems administration (essential requirement).
• Background in large-scale environments: HPC clusters or cloud infrastructures.
• Familiarity with job schedulers (e.g., Slurm).
• Strong troubleshooting abilities across systems, hardware, and networks.
• Knowledge of containers/orchestration (e.g., Kubernetes).
• Experience with storage systems (e.g., Ceph, Lustre, NFS).
• Understanding of networking fundamentals (Ethernet; InfiniBand is a bonus).
• Proficiency in Infrastructure as Code / automation tools.
• Experience with GPU or AI/ML technologies.
• Impact: Play a pivotal role in scaling Mistral’s cutting-edge AI infrastructure.
• Growth: Opportunity to shape data centre operations from the ground up in a high-growth startup environment.
• Collaboration: Work with a talented, cross-functional team passionate about AI and technology.
• Flexibility: Competitive compensation, benefits, and the chance to contribute to revolutionary projects.
harrison.ai
Pavilion
State of Rhode Island
Get handpicked remote jobs straight to your inbox weekly.