Remotery

Senior Cloud Engineer – Azure/OpenShift

Posted 1 hour ago

This is a fully remote position, open to applicants in the United States.

📋 Description

• Oversee the support and operation of cloud infrastructure, platform services, identity, networking, security measures, and operational tools across customer environments.

• Design and lead the deployment of moderately complex cloud solutions.

• Understand the performance, scaling, and functional characteristics of software technologies.

• Understand open-source and cloud use cases, and recommend the standard design patterns (best practices) typically employed in such solutions.

• Manage complex incidents, escalations, and problem investigations; conduct advanced troubleshooting, coordination, and service restoration, and ensure durable resolution.

• Plan and implement complex changes and recurring operational tasks, including provisioning, access modifications, maintenance events, backup and recovery validation, patching coordination, and platform hygiene.

• Act as a senior escalation point within the on-call rotation for major incidents, high-impact issues, and customer-approved after-hours change activities.

• Adhere to and reinforce established ITSM processes for incident, request, change, problem, escalation, documentation, and customer-facing status communication.

• Create and maintain runbooks, SOPs, standards, knowledge articles, and technical documentation that enhance consistency and service quality.

• Mentor fellow Cloud Engineers, review their work for quality and completeness, and provide technical guidance on operational best practices.

• Drive improvements in monitoring, alerting, logging, tagging, policy, compliance, and cost visibility that enhance managed cloud operations.

• Utilize scripting, automation, and AI to decrease repetitive tasks, enhance consistency, and scale service delivery.

• General familiarity with DevOps/SRE tooling is necessary but is not the main focus of the role.

• Participate in customer meetings, service reviews, and advisory discussions; translate technical issues, risks, and opportunities for improvement into clear business communication.

• Operate and support Red Hat OpenShift (Kubernetes) clusters in production, ensuring cluster health, conducting upgrades, scaling, and managing the lifecycle.

• Manage OpenShift access and security controls, including RBAC, SCCs, NetworkPolicies, secrets management, and certificate and ingress considerations.

• Troubleshoot platform and workload issues across Kubernetes/OpenShift constructs (nodes, operators, routes/ingress, services, deployments, pods, persistent volumes) and coordinate remediation with application, network, and security teams.

• Implement and validate platform backup, restore, and disaster recovery procedures (e.g., etcd, cluster resources, and persistent data) in line with customer requirements.

• Support platform automation and standardization efforts utilizing infrastructure as code and GitOps practices (e.g., Terraform, Ansible, Helm, Argo CD) to enhance repeatability and mitigate operational risk.

• Define and enhance observability for cloud and OpenShift platforms (metrics, logs, traces), adjust alerting to minimize noise, and contribute to availability, performance, and capacity planning.

• Other job responsibilities as assigned.


⛳️ Requirements

• 5+ years of experience in customer-focused IT infrastructure, cloud operations, systems administration, or managed services support, including work in production settings.

• Strong operational proficiency in at least one major cloud platform, with the capability to lead complex support and administration activities in Azure.

• Experience with other cloud platforms such as GCP, AWS, or OCI is highly preferred.

• 3+ years of experience supporting a production OpenShift environment (on-premises, ROSA, ARO, etc.).

• Proven experience leading complex incidents, escalations, change execution, and problem investigations within production environments.

• Familiarity with Windows and/or Linux server operations, networking fundamentals, identity and access management, monitoring, governance, and operational documentation.

• Experience in a managed services, consulting, or multi-customer support environment, ideally with complex enterprise customers (preferred).

• Strong working knowledge of PowerShell, Python, Bash, infrastructure as code, automation, CI/CD, or related platform tools used to enhance cloud operations (preferred).

• Relevant advanced cloud, operations, or platform certifications are an advantage (preferred).


🏝️ Benefits

• Medical, Dental, and Vision Insurance

• 401(k)

• Paid company holidays

• Paid time off

• Paid parental and caregiver leave

• Plus more! See https://www.aheadbenefits.com/ for additional details.
