
AI Infrastructure & Platform Operations Engineer
Posted 6 days ago

This is a fully remote position, open to applicants in Poland.
Responsibilities:
• Oversee, manage, and provide support for production AI infrastructure platforms.
• Identify and resolve incidents related to infrastructure, networking, hardware, and platforms.
• Provide support for NVIDIA GPU infrastructure and related platform services.
• Monitor and troubleshoot Kubernetes-based environments.
• Examine performance, availability, and reliability concerns across infrastructure and platform components.
• Work collaboratively with engineering teams, hardware vendors, datacenter staff, and service delivery teams to address technical challenges.
• Engage in incident response, root cause analysis, and initiatives aimed at operational improvements.
• Contribute to enhancements in monitoring, observability, automation, and operational procedures.
• Keep operational documentation, runbooks, and knowledge articles up to date.
Requirements:
• A minimum of 3 years of experience in infrastructure operations, platform operations, network operations, site reliability engineering, cloud operations, datacenter operations, or similar technical roles.
• Proficiency in Linux administration and troubleshooting.
• Solid understanding of networking principles and experience in diagnosing infrastructure-related problems.
• Familiarity with Kubernetes in production settings.
• Experience in supporting production infrastructure and services.
• Strong analytical and problem-solving abilities.
• Experience working within structured operational and incident management processes.
• Exceptional communication and collaboration abilities.
• Willingness to work within a shift-based operational framework.
What we offer:
• Work with some of the most cutting-edge AI infrastructure environments currently in production.
• Gain exposure to NVIDIA GPU technologies, Kubernetes platforms, and high-performance networking environments.
• Contribute to defining the operational and support framework for next-generation AI infrastructure.
• Be part of a team that is shaping the future of AI-powered operations through k0rdent AI.
• Join an expanding organization that is heavily investing in AI infrastructure and platform services.
TechBiz Global