Remotery

DevOps Engineer – HPC & GPU Platform

Posted May 20

This is a fully remote position, open to applicants in France.

📋 Description

• Build GPU benchmarking frameworks on AWS: orchestrating benchmark runs, collecting and storing results, and enabling performance comparisons across versions.

• Develop tools for correctness validation: automating the testing of numerical accuracy of GPU compute outputs against established reference results.

• Implement distributed observability across all platform services: structured logging, distributed tracing (Pulsar), and performance metrics.

• Collaborate on wider HPC coding projects alongside the engineering team.


⛳️ Requirements

• Proficient Python or Go developer — capable of writing real application code, rather than just scripts.

• Familiarity with observability tools (Prometheus, Grafana, distributed tracing).

• Comfortable working with AWS (EC2, IAM, VPC) and CI/CD pipelines.

• Experience in HPC or GPU environments is highly advantageous — familiarity with Slurm, compute clusters, and GPU workloads.

• Preferred educational background includes ENSIMAG, Centrale, INSA, X, or equivalent engineering qualifications.

• Fluent in English — the team operates across France and the UK.


🏝️ Benefits

• Fully remote work, with 1 day per week in London.
