Remotery

DBRE – Database Reliability Engineer

Posted May 22

This is a fully remote position, open to applicants in Brazil.

📋 Description

• Oversee, manage, and enhance database environments in both production and non-production settings utilizing Infrastructure as Code (IaC).

• Guarantee high availability, performance, scalability, and dependability of databases.

• Strategize and implement backup, restoration, replication, and disaster recovery plans.

• Conduct tuning, capacity planning, and troubleshooting for databases and data integrations.

• Address incidents, perform Root Cause Analysis (RCA), and apply preventive measures.

• Develop and maintain runbooks and standards documentation.

• Ensure effective, secure, and resilient connectivity between applications and databases hosted on AWS.

• Plan, manage, and optimize costs for AWS network resources (VPCs, Subnets, Route Tables, NAT Gateway, VPC Peering, VPC Endpoints, Transit Gateway, VPN, etc.).

• Actively engage in diagnosing latency, packet loss, timeouts, and connection issues.

• Work with IAM and resource access policies.

• Define and track SLIs and SLOs for databases and establish standards for monitoring and observability.

• Utilize tools such as Datadog, Prometheus, Grafana, and CloudWatch.

• Support reliability-oriented observability strategies.


⛳️ Requirements

• Prior experience as a DBRE in high-stakes production environments.

• Strong understanding of implementation, scaling, tuning, and disaster recovery for at least two of the following: PostgreSQL, Elasticsearch, Solr, MongoDB, Oracle, or Snowflake.

• Extensive experience with AWS, including networking, connectivity, monitoring, and observability services.

• Practical knowledge of Terraform (infrastructure as code for databases and networks).

• In-depth understanding of networking concepts: TCP/IP, DNS, latency, throughput, and timeouts.

• Scripting/automation proficiency in Shell, Python, or similar languages.

• Experience in database monitoring, metrics, and alerting.

• Background in SRE/DevOps environments with incident response experience in production.

• Strong analytical skills for complex troubleshooting involving databases, networks, and applications.

• Familiarity with EKS/Kubernetes for integrating applications and databases.


🏝️ Benefits

• Meal and Food Allowance.

• Gympass/TotalPass.

• Home-office allowance.

• Health Insurance and Dental Plan (dental optional).

• Childcare assistance (up to the child’s 6th birthday).

• Extended Maternity, Paternity, and Adoptive Leave (#allfamiliesmatter).

• Life Insurance.

• Birthday Day Off (one day off to be taken on the birthday or during the birthday month).

• Family Day (one day off for parents to be taken between May and August).

• Mental Break (one continuous week off in December to rest and recharge).
