
Customer Site Reliability Engineer – OpenShift Managed Cloud Services, Spoken Japanese, Kubernetes/AWS/Azure, Linux
Posted 6 days ago

This is a fully remote position, open to applicants in Australia.
Responsibilities:
• Oversee large-scale, distributed systems with a focus on reducing downtime and enhancing system resilience.
• Uphold customer trust and confidence by ensuring the stability and functionality of services.
• Drive continuous improvements in processes, tools, and methodologies to meet the evolving needs of the service.
• Lead the development of code and automation scripts that improve the scalability, reliability, and performance of services.
• Own and participate in high-priority customer escalations, taking a customer-centric approach.
• Organize and carry out complex incident response procedures, ensuring prompt resolution and comprehensive postmortems.
• Collaborate with cross-functional teams to bolster system robustness.
• Work proactively to prevent escalations and keep operations dependable.
• Record resolutions, root causes, and best practices to enhance the knowledge base and promote self-service solutions.
• Guide and mentor team members, nurturing a culture of continuous learning, knowledge sharing, and collaboration.
• Participate in the on-call rotation and provide leadership during critical incidents.
• Collaborate on strategic AI and automation initiatives aimed at improving the efficiency of fleet operations and troubleshooting, ultimately delivering an enhanced product experience for customers.
Requirements:
• Advanced experience with OpenShift/Kubernetes for container platform support or administration.
• Proficient in container-based technologies operating on Linux.
• Skilled in managing Linux-based systems within public cloud environments such as AWS, Azure, or GCP.
• Advanced experience with enterprise systems monitoring; knowledge of Prometheus is preferred.
• Advanced proficiency with enterprise configuration management and infrastructure-as-code tools such as Ansible and Terraform.
• Software engineering experience using object-oriented programming languages; Go (Golang) is preferred.
• Excellent communication skills with experience in direct customer interaction and presentations.
• Ability to rapidly learn new technologies and stay updated with industry trends.
• Proven ability to diagnose system issues quickly and accurately.
• Strong understanding of standard TCP/IP networking and common protocols.
• Proficient in English, with additional languages such as Japanese, Chinese, Korean, or Spanish being an advantage.
Benefits:
• Health insurance
• Flexible working hours
• Professional development opportunities