
Senior Database Administrator
Posted Jun 23

This is a fully remote position, open to applicants in Florida.
• Manage and enhance multi-region CockroachDB clusters and PostgreSQL instances across production, staging, and development environments.
• Analyze contention issues, serialization retries (SQLSTATE 40001), and range hotspots impacting transaction paths for bets, wagers, settlements, and payouts.
• Diagnose leaseholder placement, monotonic-key write pressure, and cross-region latency that convert a single query into multiple network round trips.
• Trace slow queries back to their execution plans, analyze the plans for optimizer behavior, and link the behavior to the schema or application call paths responsible.
• Identify plan regressions following changes in statistics, schema, data growth, or releases, and develop safer rollout patterns for fixes.
• Evaluate schema-change and backfill risks before they affect large tables, considering lock behavior, retry pressure, capacity impact, and rollback paths.
• Plan for capacity, scaling events, and version upgrades in a manner that minimizes customer impact.
• Develop database observability using Prometheus, Grafana, Mimir, Loki, and Snowflake, including dashboards based on real queries, alerting on leading indicators, and tracking SLOs.
• Correlate database symptoms with application behavior and business events, ensuring alerts target the root causes.
• Participate in the on-call rotation for the data layer, lead incident analysis when the database contributes to failures, and guide blameless postmortems towards lasting solutions.
• Equip the system to detect degradation before it escalates into incidents.
• Create Go and Python tools to simplify incident understanding, including log collectors, explain-plan analyzers, migration checks, capacity models, and runbook generators.
• Automate provisioning, configuration, backup, and recovery processes using Terraform and other infrastructure-as-code tools.
• Collaborate with platform engineering on CI/CD processes, deployment safety, and change management for services that interact with databases.
• Use AI coding tools such as Claude Code, Codex, and similar agents in daily work to investigate incidents, generate probes, draft runbooks, and develop tooling.
• Ensure that the harness around these tools is grounded in logs, traces, metrics, and source code, and validate its outputs against production facts before deployment.
• Implement and review access controls, authentication, authorization, and encryption for data both in transit and at rest.
• Address data-protection, audit-logging, and retention requirements for a regulated gaming platform.
• Establish a database center of excellence, including documentation, standards, and reusable patterns for other teams to leverage.
• Mentor engineers in database fundamentals, distributed-systems behavior, and operational best practices.
• Study CockroachDB and PostgreSQL internals when a problem requires a code-level solution, sharing insights with the team.
• 5+ years of experience managing production relational databases in roles such as DBA, Database Reliability Engineer, or Data Platform Engineer.
• Extensive experience with PostgreSQL, CockroachDB, or another relational database management system, demonstrating fluency in the tradeoffs related to isolation, consensus, locality, and query planning.
• Deep understanding of SQL engine internals, including MVCC, the Volcano iterator execution model, and cost-based optimizer frameworks such as Cascades, with the ability to apply this knowledge to indexing strategies and execution plan evaluation.
• Familiarity with distributed-systems principles including consensus (Raft and Paxos), distributed transactions, consistency models, and failure modes.
• Proficient in creating tools and automation using Python or Go, with the ability to debug and propose fixes for business-critical code written in Java or a similar language.
• Experience with infrastructure-as-code practices and a monitoring and observability stack such as Grafana or Datadog.
• UNIX/Linux system administration skills, shell scripting knowledge, comfort with cloud platforms (AWS preferred, Azure or GCP accepted), and readiness to participate in an on-call rotation.
• Strong technical communication skills, both written and verbal, as collaboration across teams is essential.
• Competitive compensation and benefits package.
• Flexible vacation policy.
• Hybrid and remote working options available.
• Startup culture supported by a stable global brand.
• Opportunity to develop products enjoyed by millions as part of a dedicated team.
• Team gatherings held across the US and Europe.
• Employer-sponsored training and attendance at conferences.
• Chance to work in an AI-first environment with access to essential tools for success.