Remotery

Senior / Staff DevOps Engineer

Posted 2 hours ago

This is a fully remote position, open to applicants in the United States.

📋 Description

• Design, develop, and manage cloud infrastructure (preferably GCP) using infrastructure-as-code, focusing on repeatability, security, and cost-effectiveness.

• Take ownership of and continually enhance CI/CD pipelines. Automate integration and unit testing, provisioning, deployments, and rollbacks to ensure swift and secure delivery.

• Create and sustain observability across the platform, including monitoring, logging, tracing, alerting, and insightful dashboards that identify issues before they impact customers.

• Strengthen our security posture: secrets management, encryption in transit and at rest, IAM and least-privilege access, network segmentation, and vulnerability management across all infrastructure.

• Promote compliance readiness by collaborating with security and leadership to maintain, automate, and provide evidence for controls across frameworks such as SOC 2, ISO 27001, and GDPR (HIPAA is a plus), including audit support and ongoing control monitoring.

• Lead incident response efforts and participate in the on-call rotation; facilitate blameless postmortems, minimize mean-time-to-recovery, and transform lessons learned into durable solutions.

• Establish and maintain reliability objectives (SLOs/SLIs), conduct capacity planning, and optimize performance as we expand across different countries and industries.

• Utilize AI-powered tools (Claude Code, Cursor, GitHub Copilot, among others) to expedite infrastructure-as-code, automation, and internal tooling, while enhancing incident triage and response.

• Collaborate with engineering teams to enhance developer experience and deployment speed, eliminating friction and automating repetitive tasks.

• Foster a culture of operational excellence, reliability, security, and continuous improvement throughout the engineering organization.

• Set the technical roadmap for platform and infrastructure, mentoring engineers on best practices in DevOps, reliability, and security.

• Regularly assess and integrate emerging AI-powered tools and workflows to enhance infrastructure performance, quality, and security.


⛳️ Requirements

• Over 6 years of experience in DevOps, Site Reliability, Platform, or Infrastructure Engineering within a software engineering context.

• Deep expertise with a major cloud provider (GCP preferred) and a solid understanding of networking, security, and distributed systems.

• Extensive hands-on experience with infrastructure-as-code tools (Terraform, Pulumi, and/or CloudFormation) and configuration management.

• Proven production experience with containers and orchestration (Docker, Kubernetes, or ECS) as well as building robust CI/CD pipelines (GitHub Actions, CircleCI, or similar).

• Proficient in observability and monitoring tools (Datadog, Prometheus/Grafana, CloudWatch, or equivalent).

• Strong scripting and programming capabilities (Python, Go, Bash, or TypeScript/Node) for automation and tooling development.

• In-depth knowledge of cloud security best practices: IAM and least-privilege, secrets management, encryption, network security, and vulnerability management.

• Practical experience supporting compliance frameworks such as SOC 2, ISO 27001, GDPR, or HIPAA, including control implementation, evidence collection, audit preparedness, and compliance automation.

• Familiarity with AI-powered development tools like Claude Code, Cursor, GitHub Copilot, or similar; proven ability to accelerate infrastructure and automation tasks using these tools.

• Experience in leading incident response and participating in an on-call rotation for production systems.

• Exceptional written and verbal communication skills; capable of clearly documenting systems and creating actionable runbooks.

• Experience in a startup environment is mandatory.

• Experience in managing or working collaboratively with distributed teams is crucial.

• Knowledge of identity verification products or AI/ML-based solutions is a plus.


🏝️ Benefits

• Equity compensation

• Remote working environment

• Self-managed paid time off

• 11+ annual company holidays

• 401(k)

• Health Care Benefits: Medical, Vision, Dental

• Wellness benefits: EAP, LifeHealth Online, One Medical, Perkspot

• Parental leave
