Remotery

Site Reliability Engineer

Posted 11 hours ago

This is a fully remote position, open to applicants in Poland.

📋 Description

• Ensure the dependability, scalability, and efficiency of Dropbox's infrastructure and services.

• Collaborate with interdisciplinary teams to establish and uphold best practices for monitoring, logging, and incident response.

• Build, deploy, and maintain automation and infrastructure-as-code tooling, specifically Terraform, Ansible, and GitHub Actions, along with custom code platforms.

• Use container orchestration platforms, including Kubernetes, Amazon ECS, and Red Hat OpenShift, to manage containers at scale.

• Oversee and enhance monitoring and logging pipelines utilizing tools such as Datadog and Cribl LogStream.

• Drive initiatives to improve service health and visibility for our stakeholders, who range from developers to business service owners to C-level executives.

• Create and maintain custom tools and automation scripts using Bash, Python, and other scripting languages.


⛳️ Requirements

• 5+ years of experience in site reliability engineering or similar engineering roles with practical coding experience.

• Strong understanding of AWS services such as EC2, S3, RDS, Route 53, and Lambda.

• In-depth knowledge of Linux administration, internals, filesystems, volume management, and specific distributions such as Ubuntu and RHEL, along with DNS and DHCP.

• Familiarity with monitoring and logging tools, including Datadog and logging pipeline tools like Vector or Cribl LogStream.

• Proven experience in leading one or more transformational programs related to metrics and observability.

• Proficient in scripting with a higher-level language (Python preferred).

• Experience in developing automation to address infrastructure-related tasks using tools such as Chef, Ansible, or Terraform.

• Knowledge of log analysis and the creation of metrics, alerts, and visuals from log data.

• Strong expertise in infrastructure-as-code tools, particularly Terraform.

• Solid proficiency in configuration management tools, specifically Ansible Automation Platform and Chef.

• Familiarity with containerization technologies, such as Docker, and container orchestration platforms like Kubernetes or Amazon ECS.

• Understanding of LDAP, REST APIs, and current authentication methods.

• Familiarity with GitHub and Git-based workflows.

• Knowledge of RDS databases and network security technologies, including WAF.

• Excellent problem-solving abilities and the capacity to thrive in a fast-paced, collaborative environment.

• Outstanding written and verbal communication skills.


🏝️ Benefits

• Health insurance.

• Retirement plans.

• Paid time off.

• Flexible work arrangements.

• Professional development.
