
Senior Database Site Reliability Engineer
Posted May 6

• Design, implement, and maintain high-availability, high-throughput, data- and compute-intensive PostgreSQL database systems that support a continuously growing 24x7 SaaS platform.
• Define and enhance database service reliability through effective monitoring/alerting, SLO-oriented metrics, and ensuring operational readiness.
• Engage in and facilitate incident response, root cause analysis, and post-incident corrective measures for database-related production occurrences.
• Collaborate with other technical leaders to confirm that all newly introduced systems are both supportable and maintainable by development and operations teams.
• Provide advanced technical guidance and support to various technology teams across the organization.
• Offer on-call support for production issues and other responsibilities as necessary.
• Adhere to HIPAA security policies within the database framework.
• Ensure that all solutions and operational tasks comply with the security and operational policies set forth by the organization.
• Lead the continuous enhancement of our Datadog database observability by creating actionable dashboards, alerts, and service-level views using an observability stack (e.g., Prometheus, Grafana, New Relic, or similar). Familiarity with pganalyze or Percona is advantageous.
• Automate system maintenance tasks using Bash, PowerShell, Python, or Ansible. Manage infrastructure as code (IaC) by writing Ansible playbooks; some experience with Terraform is a plus.
• Experience in writing and designing ETL pipelines using Python is a plus.
• Understand and maintain various PostgreSQL ecosystem components such as PgBouncer, pgBackRest, HAProxy, and repmgr.
• Possess excellent communication and interpersonal skills.
• Bachelor’s degree in Information Systems, Engineering, or equivalent experience.
• 7-10+ years of engineering experience in Database Engineering, Systems Engineering, DevOps, or SRE.
• Familiarity with cloud-based compute, storage, and containerization solutions (preferably Azure & Kubernetes).
• Proficiency in operating PostgreSQL within a Linux environment is a plus.
• Expertise in observability/monitoring platforms (e.g., Prometheus/Grafana, New Relic, Datadog, or similar); experience with Datadog is a plus.
• Proven experience in Agile/DevOps settings and managing production services with ITSM practices as applicable.
• Employer-sponsored health, dental, vision, life, and disability insurance.
• Retirement plan with company contributions.
• Annual company profit sharing.
• Budget for personal development and training.
• Open and collaborative work environment.
• Comprehensive 2-week onboarding plan.
• In-depth mentorship program.
Arctiq