This is a fully remote position, open to applicants in the United States.

📋 Description

• Ingestion & Transformation: Design and enhance high-volume ETL/ELT pipelines utilizing Delta Live Tables (DLT) and PySpark, ensuring data integrity throughout the Bronze, Silver, and Gold layers.

• Workflow Orchestration: Build and maintain advanced pipelines with Databricks Workflows or Airflow, emphasizing modularity, reusability, and automated error handling.

• Streaming & Real-time Integration: Establish real-time data flows using Structured Streaming and Kafka/Event Hubs to provide immediate data accessibility for downstream use.

• Data Security & Privacy: Implement data anonymization and detailed access controls to comply with global regulations (GDPR/CCPA/HIPAA).

• DataOps & DevOps: Apply CI/CD practices using Databricks Asset Bundles (DABs), Terraform, and Git to automate environment consistency and deployment processes.

• Open Table Formats: Oversee and enhance Delta Lake storage, leveraging advanced features such as Liquid Clustering, Z-Ordering, and Change Data Feed (CDF).

• Compute Engine Optimization: Enhance cost efficiency and performance by fine-tuning Spark configurations, utilizing the Photon engine, and managing Serverless SQL Warehouses.

• Observability & Monitoring: Set up thorough monitoring and alerting (e.g., Databricks System Tables, Grafana, or Splunk) to quickly detect bottlenecks and resolve production issues.

⛳️ Requirements

• 6+ years of hands-on, progressive experience in Data Engineering, with a minimum of 5 years specifically focused on the Databricks platform.

• Architectural Understanding: Strong knowledge of Medallion Architecture, Data Vault 2.0 or Dimensional Modeling, and contemporary Lakehouse design patterns.

• Scale Expertise: Demonstrated success in developing and managing large-scale (petabyte-scale) data infrastructure within cloud-native environments.

• Industry Experience: Background in the Insurance or Financial Services sector is preferred, particularly in claims, policy, or risk data.

• Technical Toolset:

• Cloud Environment: Azure (preferred), AWS.

• Databricks Stack: Unity Catalog, Delta Live Tables, Databricks SQL, MLflow.

• Core Languages: Expert-level SQL, Python, and PySpark.

• Supporting Tools: dbt (Databricks adapter), Git, and orchestration tools.

🏝️ Benefits

• Work From Home

Senior Databricks Engineer
