This is a fully remote position, open to applicants in India.

📋 Description

• Design, construct, and maintain Databricks data pipelines (ETL/ELT) for data ingestion, transformation, and orchestration using Spark/Delta Lake/Databricks Workflows.

• Operationalize machine learning models by creating inference pipelines that utilize models developed by data scientists (batch or real-time), ensuring alignment between training and inference environments.

• Guarantee data reliability, quality, and observability through effective validation, monitoring, alerting, and automated recovery strategies.

• Work collaboratively with data scientists to transition models to production, oversee model deployment lifecycles, and enhance inference efficiency and cost-effectiveness.

• Apply best-practice DevOps/MLOps methodologies such as CI/CD for pipelines, model version control, environment promotion, and infrastructure-as-code.

• Enhance performance and reduce costs across compute clusters, jobs, and storage layers.

• Establish and manage the enterprise data catalog in Unity Catalog, covering schema design, table ownership, lineage, governance, and documentation.

• Work across the various aspects of Databricks infrastructure.

• Develop BI dashboards and visualizations.

• Use coding agents while adhering to best practices (e.g., spec-driven development).

⛳️ Requirements

• Over 8 years of experience.

• Experience with the Databricks platform.

• Proficient in Python development for data processing and ETL pipelines.

• Knowledge of Unity Catalog.

• Familiarity with AWS services (S3, IAM, VPC, and ideally Glue/Lambda).

• Understanding of data lake/lakehouse architecture patterns.

• Experience in building dashboards.

• RESTful API design and development experience (Flask, FastAPI, or similar).

• Knowledge of authentication/authorization patterns (OAuth, API keys, IAM roles).

• Skills in query optimization and performance tuning.

• Experience with PySpark optimization.

• Background in ML/AI pipeline development.

• Familiarity with Databricks AI/BI.

🏝️ Benefits

• Competitive salary and performance-based incentives.

• Comprehensive health and wellness benefits.

• Opportunities for professional development and growth.

• Flexible work arrangements and a supportive company culture.

Fullstack Data Engineer
