
Senior Cloud - Kubernetes SRE
Posted 9 hours ago

This is a fully remote position, open to applicants in the United Kingdom.
• Manage, enhance, and scale production OKD / Kubernetes clusters in both on-premises and hybrid settings.
• Assist in the transition from VMware to KVM, contributing to the modernization of the underlying compute and storage infrastructure.
• Take ownership of and refine CI/CD processes throughout the entire lifecycle of platform and application components.
• Collaborate with platform and application engineers to facilitate cloud-native delivery using tools like Helm and Kustomize.
• Develop and advance GitOps deployment practices utilizing tools such as Argo CD or Flux.
• Oversee and enhance core platform services, including identity management, ingress, observability, certificate management, service mesh, and container registry capabilities.
• Construct and maintain observability frameworks encompassing logs, metrics, traces, alerting, SLOs, and error budgets.
• Strengthen platform security in accordance with secure and regulated environment standards, addressing network policy, SELinux, image provenance, secret management, and auditing.
• Automate repetitive operational tasks through tools such as Ansible, Terraform, Helm, Kustomize, Go, Python, or similar technologies.
• Spearhead incident response efforts, support blameless post-mortems, and advocate for systemic solutions.
• Collaborate with networking and security teams on platform integration, segmentation, load balancing, and accreditation documentation.
• Produce and maintain comprehensive technical documentation, runbooks, design notes, and operational guidance.
• Mentor fellow engineers and serve as a senior technical authority in cloud and Kubernetes operations.
• Participate in an on-call rotation, with appropriate compensation.
• Extensive experience managing production Kubernetes environments, not merely deploying into them.
• Strong foundational knowledge of Linux, including systemd, networking, storage, and performance troubleshooting.
• Familiarity with at least one Kubernetes distribution such as OKD, OpenShift, vanilla Kubernetes, Rancher, EKS, AKS, or GKE.
• Experience with infrastructure as code and automation tools, such as Ansible, Terraform, Helm, or Kustomize.
• Proficiency with GitOps tools like Argo CD or Flux in production settings.
• Proven experience in building or managing CI/CD pipelines for platform or application components.
• Strong expertise in observability, covering logs, metrics, and traces, with tools like Prometheus, Grafana, Elastic Stack, OpenTelemetry, or similar.
• Experience with identity and access management technologies, including OIDC, SAML, SCIM, or Keycloak.
• Background in virtualization or infrastructure platforms such as KVM, libvirt, or VMware.
• Proficiency in scripting or tooling with Go, Python, shell scripting, or similar languages.
• Excellent troubleshooting, problem-solving, and analytical abilities.
• Experience within secure, regulated, or enterprise-scale environments.
• Strong communication skills, including the ability to produce clear documentation, runbooks, post-mortems, and technical guidance.
• Eligibility to obtain UK SC clearance.
• Specific experience with OpenShift or OKD, including operators, MachineConfig, or SCCs. (Desirable)
• Familiarity with service mesh technologies such as Istio or Linkerd. (Desirable)
• Knowledge of policy engines like OPA, Gatekeeper, or Kyverno. (Desirable)
• Experience in cloud-native application deployment using Helm, Terraform, Kustomize, or similar tools. (Desirable)
• GitOps and CI/CD experience managing complete application and component lifecycles. (Desirable)
• Background in storage solutions like Ceph, Longhorn, OpenShift Data Foundation, or equivalent. (Desirable)
• Networking experience with technologies such as BGP, VXLAN, Palo Alto, or Juniper. (Desirable)
• Familiarity with software supply chain security, including SBOMs, image signing, admission control, or tools like Sigstore. (Desirable)
• Experience in operating AI, ML, or GPU-enabled platforms. (Desirable)
• CKA, CKAD, CKS, Red Hat certifications, or equivalent qualifications. (Desirable)
• Active or recent UK SC clearance. (Desirable)
• Recognized contributions to the Kubernetes open-source community. (Desirable)
• Private Medical
• Health Cash Plan
• 4x Life Assurance
• Inclusive Culture: Enjoy an inclusive culture and environment.
• Holiday: Generous holiday allowance.
• Learning: Access to continuous learning and development opportunities.
• Bonus Potential: Bonus potential based on performance and business-related factors.
• Discounts: Discounts on a wide range of products and services.
• Pension: Pension scheme contributions.
• EV Car Scheme
• Regular Pay Reviews
• More Benefits: Explore additional benefits on our career site.