This is a fully remote position, open to applicants in Brazil.

📋 Description

• Spearhead the design, execution, and continuous enhancement of reliable, scalable, high-performance, and secure production platforms and services.

• Collaborate closely with cross-functional teams to establish and maintain resilient infrastructure and deployment strategies.

• Offer technical guidance and mentorship to engineers throughout the organization, fostering robust engineering standards and operational best practices.

• Engage in a 24x7 on-call rotation to support critical services and ensure platform uptime.

• Promote standardization, automation, and documentation to enhance consistency, minimize operational burdens, and facilitate knowledge sharing.

• Contribute throughout the entire lifecycle of platform and service delivery, from design and construction to operation and optimization.

⛳️ Requirements

• Over 5 years of experience in DevOps, Site Reliability Engineering (SRE), platform engineering, or software engineering positions.

• Extensive experience with Kubernetes at scale, possessing a thorough understanding of containers and container orchestration.

• Practical experience with infrastructure as code tools such as Terraform, Ansible, or Puppet.

• Strong programming skills in at least one object-oriented language, complemented by effective scripting and automation skills.

• Strong grasp of security principles and best practices across infrastructure, platforms, and services.

• Significant hands-on experience with at least one major cloud platform, ideally with extensive exposure to AWS, GCP, or OCI.

• Solid experience with monitoring, alerting, and observability, using tools such as Prometheus or Grafana.

• Comprehensive understanding of networking fundamentals and distributed systems.

• Strong experience in Linux and/or Windows systems administration.

• Familiarity with software delivery automation, CI/CD pipelines, and secure software development lifecycle (SDLC) practices, including exposure to static and dynamic security testing.

• Good understanding of SRE concepts such as Service Level Indicators (SLIs), Service Level Objectives (SLOs), Service Level Agreements (SLAs), toil reduction, availability, and observability.

• Experience managing and scaling Elasticsearch in production environments is highly desirable.

🏝️ Benefits

• Health insurance

• Professional development opportunities

• Flexible work arrangements

Staff DevOps Engineer
