
Site Reliability Engineer
Posted 11 hours ago

This is a fully remote position, open to applicants in Poland.
Responsibilities
• Ensure the dependability, scalability, and efficiency of Dropbox's infrastructure and services.
• Collaborate with interdisciplinary teams to establish and uphold best practices for monitoring, logging, and incident response.
• Build, deploy, and maintain automation and infrastructure-as-code tooling, specifically Terraform, Ansible, and GitHub Actions, along with custom code platforms.
• Leverage container orchestration platforms, including Kubernetes, Amazon ECS, and Red Hat OpenShift, to manage containers on a large scale.
• Oversee and enhance monitoring and logging pipelines utilizing tools such as Datadog and Cribl LogStream.
• Drive initiatives that improve service health and visibility for our stakeholders, who range from developers to business service owners to C-level executives.
• Create and maintain custom tools and automation scripts using Bash, Python, and other scripting languages.
Requirements
• 5+ years of experience in site reliability engineering or similar engineering roles with practical coding experience.
• Strong understanding of AWS services, including EC2, S3, RDS, Route 53, Lambda, and others.
• In-depth knowledge of Linux administration, internals, filesystems, volume management, and specific distributions such as Ubuntu and RHEL, along with DNS and DHCP.
• Familiarity with monitoring and logging tools, including Datadog and logging pipeline tools like Vector or Cribl LogStream.
• Proven experience in leading one or more transformational programs related to metrics and observability.
• Proficient in scripting with a higher-level language (Python preferred).
• Experience in developing automation to address infrastructure-related tasks using tools such as Chef, Ansible, or Terraform.
• Knowledge of log analysis and the creation of metrics, alerts, and visualizations from log data.
• Strong expertise in infrastructure-as-code tools, particularly Terraform.
• Solid proficiency in configuration management tools, specifically Ansible Automation Platform and Chef.
• Familiarity with containerization technologies, such as Docker, and container orchestration platforms like Kubernetes or Amazon ECS.
• Understanding of LDAP, REST APIs, and current authentication methods.
• Familiarity with GitHub and Git-based workflows.
• Knowledge of RDS databases and network security technologies, including WAF.
• Excellent problem-solving abilities and the capacity to thrive in a fast-paced, collaborative environment.
• Outstanding written and verbal communication skills.
Benefits
• Health insurance.
• Retirement plans.
• Paid time off.
• Flexible work arrangements.
• Professional development.