
Senior DevOps Engineer – Cloud, ML Infrastructure
Posted May 21

This is a fully remote position, open to applicants in Greece.
• Design, manage, and enhance Kpler’s cloud-native infrastructure (Kubernetes, networking, compute, storage).
• Contribute to Infrastructure as Code, CI/CD pipelines, and automation of the platform.
• Ensure the high availability, reliability, and security of production systems.
• Enhance observability, monitoring, alerting, and incident response processes.
• Minimize MTTR and failure rates through systematic reliability enhancements.
• Optimize the cost and performance of infrastructure, particularly for compute-intensive workloads.
• Help standardize ML/GPU-based workloads within the current platform framework.
• Work closely with ML engineers, data engineers, and backend teams to facilitate production-grade deployments.
• Influence architectural decisions that guide the platform's evolution.
• Over 5 years of experience in cloud/platform engineering within production settings.
• Extensive hands-on experience with Kubernetes in a production environment.
• Familiarity with Infrastructure as Code (preferably Terraform).
• Strong knowledge of AWS or an equivalent cloud service provider.
• Experience in managing distributed systems in 24/7 operational environments.
• Robust operational mindset (SLOs, monitoring, incident management).
• Bachelor’s or Master’s degree in Computer Science or Engineering, or comparable practical experience.
• Proficiency in programming (Python or Go preferred).
• Comprehensive understanding of cloud-native architecture and principles of reliability engineering.
• Competitive salary and performance-based bonuses.
• Flexible work hours and remote work opportunities.
• Comprehensive health and wellness benefits.
• Professional development and training programs.
• Collaborative and innovative team environment.