
SRE SPC, CloudOps
Posted May 25

This is a fully remote position, open to applicants in Mexico.
• Accountable for ensuring the reliability, performance, and operational excellence of applications deployed on Microsoft Azure.
• Manage incidents at an expert level.
• Troubleshoot platform-related issues.
• Implement proactive measures to improve reliability.
• Automate operations to support scalable digital experiences.
• Understanding of Site Reliability Engineering concepts (SLOs, SLIs, error budgets) as applied on Microsoft Azure.
• Experience with Azure Kubernetes Service.
• Proficiency in containerization and Kubernetes orchestration.
• Knowledge of monitoring and observability practices.
• Skills in operational automation.
• Advanced troubleshooting capabilities for cloud platforms.
• Management of distributed applications.
• Experience with high availability (HA) and scalability solutions.
• Proficiency in advanced incident management.
• Skills in performance tuning and optimization.
• Soft skills: ability to work under pressure, commitment to quality, and a results-oriented approach.
• Competitive salary and benefits package.
• Collaborative workplace culture.
• Emphasis on building professional skills.
• Streamlined recruitment process and comprehensive onboarding program.