
Principal DevOps Engineer
Posted 11 hours ago

This is a fully remote position, open to applicants in the United Kingdom.
• Serve as the technical leader for DevOps/Platform/Release engineering: establish direction, standards, and best practices.
• Architect and oversee the end-to-end delivery process: infrastructure provisioning, configuration management, CI/CD, release processes, and operations.
• Design and support Windows-based high availability solutions, taking full ownership of Windows clustering (failover/HA patterns, maintenance, upgrades, troubleshooting).
• Lead automation and standardization initiatives for Linux platforms (configuration, patching, hardening, performance tuning).
• Manage the Infrastructure as Code strategy utilizing Terraform (modules, environments, state, governance).
• Direct the automation strategy with Ansible (reusable roles, inventories, secure secrets handling, idempotency).
• Construct and standardize deployments using Octopus Deploy, GitHub, and Ansible (templates, shared steps, release promotion, rollback).
• Design and enhance CI/CD pipelines (artifact versioning, approvals, promotion strategies, policy-as-code where applicable).
• Set observability standards using VictoriaMetrics/Prometheus (metrics strategy, alerting, SLO/SLA monitoring, dashboards).
• Provide leadership in production environments: incident response, RCA/postmortems, reliability improvements, and capacity planning.
• Mentor engineers, review designs/code, and elevate overall engineering quality across teams.
• Create and maintain architecture documentation, runbooks, and platform roadmaps.
• Bachelor's degree in Computer Science or a related field.
• 7+ years (or equivalent) experience in DevOps / SRE / Infrastructure Engineering, including leadership in complex environments.
• Expert-level proficiency in designing and managing Windows Server HA and clustering (Failover Clustering and related components).
• Strong background in Linux administration and automation (systemd, networking, storage, performance).
• Advanced knowledge of Terraform and Ansible (architecture, reusable components, secure operations).
• Significant experience in deployment/release engineering with Octopus Deploy and GitHub (release governance, environment promotion, rollback).
• Expertise in monitoring/observability with VictoriaMetrics and/or Prometheus (alerting strategy, metrics design, operational readiness).
• Production experience with Redis, RabbitMQ, Nginx (HA, tuning, troubleshooting).
• Solid understanding of networking and security fundamentals (TLS, DNS, load balancing, firewalling, least privilege).
• Proven capability to lead cross-team initiatives, make architectural decisions, and communicate effectively.
• Familiarity with Kubernetes and container ecosystems (Docker, Helm).
We hire, promote, and compensate employees based on their ability to perform their job responsibilities, without regard to race, color, creed, religion, sex, gender, marital status, national origin, ancestry, age, citizenship, physical or mental disability, sexual orientation, or any other basis protected by applicable law (collectively referred to in our Code of Conduct as “Protected Classes”). We do not tolerate employment discrimination in the workplace, and we are committed to making reasonable accommodations for identified disabilities or other limitations as required by all applicable laws. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.