Remotery

Senior Cloud Platform Engineer

Posted 1 day ago

📋 Description

• Develop platform infrastructure – Design and build self-service tools that let product teams deploy services without infrastructure tickets or manual provisioning.

• Minimize operational toil – Identify repetitive manual tasks and build automation that eliminates them.

• Enhance visibility and observability – Establish monitoring, alerting, and dashboards that give teams confidence in the health of their services. Design systems that catch issues before users notice them and simplify debugging of production problems, whether it’s 3pm or 3am.

• Join the on-call rotation – Respond to infrastructure incidents as part of the rotation, with a focus on reducing incident frequency through better automation and resilience.

• Scale infrastructure – Plan capacity, improve performance, and ensure our platform handles growing traffic without degradation. You will tackle challenges such as reducing deployment times, optimizing resource utilization, and maintaining sub-100ms p99 latencies.

• Collaborate with various teams – Work closely with security, product engineering, and SRE teams to understand their needs and build solutions that work for everyone.


⛳️ Requirements

• 5+ years of experience with distributed systems and microservices in production settings.

• Strong expertise in AWS – Proficient with EC2, ECS/EKS, VPC networking, and IAM, and able to architect resilient systems across multiple availability zones.

• Proficient in Infrastructure as Code – Daily experience with Terraform or CloudFormation; you think in code rather than in graphical interfaces.

• Programming capabilities for automation – Skilled in writing Go, Python, or similar languages to create tools and automate processes.

• Production experience with Kubernetes multi-tenancy – You have deployed, scaled, and troubleshot containerized workloads in production clusters with multiple tenants.

• Expertise in observability – Practical experience with tools such as Prometheus, Grafana, Datadog, or similar. You understand what to monitor and how to set effective alerts.

• Incident response experience – You have participated in on-call duties, resolved outages, and authored postmortems that led to systemic improvements.

• Security-focused mindset – You adhere to least-privilege principles, ensure encryption both at rest and in transit, and consider threat models.


🏝️ Benefits

• Health, dental, 401k, and numerous other benefits.

• Generous paid time off.

• Equity grant.
