Remotery

Senior Cloud Engineer – Azure/OpenShift

at Thinkahead Consultant Psychologist Pty Ltd | United States | Full-time | Cloud Engineer | Senior | $140k – $160k/year

Posted 23 hours ago

📋 Description

• Oversee the support and operation of cloud infrastructure, platform services, identity management, networking, security measures, and operational tools across customer environments.

• Design and lead the deployment of moderately complex cloud solutions.

• Understand the performance, scalability, and functional characteristics of software technologies.

• Understand open-source and cloud use cases, and recommend the standard design patterns (best practices) commonly used in these solutions.

• Take ownership of complex incidents, escalations, and problem investigations; carry out advanced troubleshooting, coordination, and service restoration, and ensure durable resolution.

• Plan and implement complex changes and routine operational tasks, including provisioning, access changes, maintenance events, backup and recovery validation, patching coordination, and platform hygiene.

• Act as a senior escalation point within the on-call rotation for major incidents, high-impact issues, and customer-approved after-hours change activities.

• Adhere to and promote established ITSM processes for incident, request, change, problem, escalation, documentation, and customer-facing status communication.

• Create and maintain runbooks, standard operating procedures (SOPs), standards, knowledge articles, and technical documentation that enhance consistency and service quality.

• Mentor other Cloud Engineers, review their work for quality and completeness, and provide technical guidance on operational best practices.

• Advocate for improvements in monitoring, alerting, logging, tagging, policy compliance, and cost visibility that enhance managed cloud operations.

• Utilize scripting, automation, and AI to minimize repetitive tasks, enhance consistency, and scale service delivery.

• General familiarity with DevOps/SRE tooling is required, though it is not the primary focus of the role.

• Engage in customer meetings, service reviews, and advisory discussions; articulate technical issues, risks, and improvement opportunities in a clear manner suitable for business communication.

• Operate and maintain Red Hat OpenShift (Kubernetes) clusters in production, including managing cluster health, upgrades, scaling, and lifecycle management.

• Oversee OpenShift access and security controls, including role-based access control (RBAC), security context constraints (SCCs), network policies, secrets management, and considerations for certificates and ingress.

• Diagnose platform and workload issues across Kubernetes/OpenShift constructs (nodes, operators, routes/ingress, services, deployments, pods, persistent volumes) and coordinate remediation with application, network, and security teams.

• Implement and validate platform backup, restore, and disaster recovery processes (e.g., etcd, cluster resources, and persistent data) in line with customer requirements.

• Support platform automation and standardization initiatives using infrastructure as code and GitOps practices (e.g., Terraform, Ansible, Helm, Argo CD) to enhance repeatability and mitigate operational risk.

• Define and enhance observability for cloud and OpenShift platforms (metrics, logs, traces), optimize alerting to reduce noise, and contribute to availability, performance, and capacity planning.

• Other job responsibilities as assigned.


⛳️ Requirements

• Minimum of 5 years in customer-facing IT infrastructure, cloud operations, systems administration, or managed services support within production environments.

• Strong operational proficiency in at least one major cloud platform, with the capacity to lead complex support and administrative tasks in Azure.

• Experience with other cloud platforms such as GCP, AWS, and OCI is highly preferred.

• At least 3 years of experience in supporting a production OpenShift environment (on-premises, ROSA, ARO, etc.).

• Proven experience in managing complex incidents, escalations, change execution, and problem investigations in production settings.

• Familiarity with Windows and/or Linux server operations, networking fundamentals, identity and access management, monitoring, governance, and operational documentation.

• Experience in managed services, consulting, or multi-customer support environments, ideally with complex enterprise customers (preferred).

• Strong working knowledge of PowerShell, Python, Bash, infrastructure as code, automation, CI/CD, or related tools that enhance cloud operations (preferred).

• Relevant advanced cloud, operations, or platform certifications (preferred).


🏝️ Benefits

• Medical, Dental, and Vision Insurance

• 401(k)

• Paid company holidays

• Paid time off

• Paid parental and caregiver leave

• Plus more! See https://www.aheadbenefits.com/ for additional details.
