
Platform Infrastructure Engineer – SRE Core
Posted 1 day ago

• Design, deploy, and maintain VM and Kubernetes infrastructure on GCP and AWS across numerous clusters that span development, staging, and production environments in various regions.
• Collaborate with colleagues on your team and across departments to ensure that your work addresses the problems we need to solve.
• Create and maintain Infrastructure as Code (IaC) using Terraform modules, managing resources through Spacelift or similar Terraform Automation and Collaboration Software (TACOS). Provision cloud infrastructure, including networking, compute, storage, and security components, primarily on GCP, with additional support for AWS.
• Implement and oversee workflows with advanced multi-layer configuration management.
• Develop and maintain comprehensive observability solutions using Grafana Cloud, Prometheus/Mimir, and OpenTelemetry (OTel) collectors. Design Grafana dashboards, set up alerting rules, and ensure visibility across all platform components.
• Administer certificate lifecycle, DNS automation, ingress controllers, and service mesh networking with Cilium.
• Collaborate with Engineering, Product, Compliance, and Security teams to develop resilient, scalable systems. Provide consultation on capacity planning, disaster recovery, and architectural choices for cloud-native applications.
• Identify and reduce toil through automation. Create scripts, develop tools, and construct CI/CD pipelines to enhance operational efficiency and minimize manual tasks.
• Engage in a 24x7 on-call rotation as part of a globally distributed team, responding to incidents and facilitating post-incident reviews.
• Bachelor's degree in Computer Science, a related technical field of study, or equivalent practical experience.
• Proficiency in popular programming and scripting languages, with a strong focus on Python, Bash, and Go.
• Understanding of network topologies, communication protocols (e.g., TCP/IP, HTTP/S, UDP, TLS), and enterprise-grade connectivity solutions.
• Expertise in Kubernetes, including cluster administration, RBAC, networking, workload management, and troubleshooting within production environments.
• Demonstrated experience with Terraform for infrastructure provisioning and management.
• Familiarity with Google Cloud Platform services, such as GKE, VPC networking, Cloud DNS, Artifact Registry, Secret Manager, IAM, Gemini Code Assist, and Workload Identity.
• Experience with GitOps methodologies and tools.
• Collaborative, inclusive, and enjoyable culture
• Opportunities to take initiative
• Support for new ideas
• Open communication