
Senior Databricks Engineer
Posted Jun 19

This is a fully remote position, open to applicants in the United States.
• Ingestion & Transformation: Design and enhance high-volume ETL/ELT pipelines utilizing Delta Live Tables (DLT) and PySpark, ensuring data integrity throughout the Bronze, Silver, and Gold layers.
• Workflow Orchestration: Create and sustain advanced pipelines with Databricks Workflows or Airflow, emphasizing modularity, reusability, and automated error management.
• Streaming & Real-time Integration: Establish real-time data flows using Structured Streaming and Kafka/Event Hubs to provide immediate data accessibility for downstream use.
• Data Security & Privacy: Implement data anonymization and detailed access controls to comply with global regulations (GDPR/CCPA/HIPAA).
• DataOps & DevOps: Apply CI/CD practices using Databricks Asset Bundles (DABs), Terraform, and Git to automate environment consistency and deployment processes.
• Open Table Formats: Oversee and enhance Delta Lake storage, leveraging advanced features such as Liquid Clustering, Z-Ordering, and Change Data Feed (CDF).
• Compute Engine Optimization: Enhance cost efficiency and performance by fine-tuning Spark configurations, utilizing the Photon engine, and managing Serverless SQL Warehouses.
• Observability & Monitoring: Incorporate thorough monitoring and alert systems (e.g., Databricks System Tables, Grafana, or Splunk) to quickly detect bottlenecks and resolve production challenges.
• 6+ years of hands-on, progressive experience in Data Engineering, with a minimum of 5 years specifically focused on the Databricks platform.
• Architectural Understanding: Strong knowledge of Medallion Architecture, Data Vault 2.0 or Dimensional Modeling, and contemporary Lakehouse design patterns.
• Scale Expertise: Demonstrated success in developing and managing large-scale data infrastructure (Petabyte-scale) within cloud-native environments.
• Industry Experience: Background in the Insurance or Financial Services sector is preferred, particularly in claims, policy, or risk data.
• Technical Toolset:
  • Cloud Environment: Azure (preferred), AWS.
  • Databricks Stack: Unity Catalog, Delta Live Tables, Databricks SQL, MLflow.
  • Core Languages: Expert-level SQL, Python, and PySpark.
  • Supporting Tools: dbt (Databricks adapter), Git, and orchestration tools.
• Work From Home