This is a fully remote position, open to applicants in Poland.

• Define and spearhead the infrastructure and reliability strategy across the platform.

• Collaborate with engineering teams to design scalable and resilient systems.

• Enhance build, testing, and deployment processes to improve speed and stability.

• Establish and maintain best practices for CI/CD, monitoring, and observability.

• Lead incident response and drive continuous improvement after incidents.

• Automate workflows to minimize operational toil and mitigate risk.

• Mentor engineers and cultivate a culture of operational excellence.

• Make informed build-versus-buy decisions, weighing speed, quality, and sustainability.

• A minimum of 8 years of experience in Site Reliability Engineering or DevOps roles, with at least 2 years in a Principal or Lead capacity.

• Demonstrated experience modernizing infrastructure and leading scaling initiatives in high-growth environments.

• Strong expertise in Python programming.

• In-depth knowledge of cloud platforms and container orchestration services, such as AWS ECS and EKS.

• Extensive experience in designing and optimizing CI/CD pipelines using tools like GitHub Actions and Buildkite.

• Proficiency in infrastructure-as-code tools such as Terraform.

• Strong understanding of monitoring, observability, and performance optimization practices.

• Upper-Intermediate proficiency in spoken and written English.

• Healthcare coverage.

• Flexible work arrangements.

• Opportunities for professional development.

Principal Site Reliability Engineer
