
AI Data Engineer
Posted 18 hours ago

Responsibilities
• Develop and maintain scalable data pipelines using Spark, Databricks, and cloud technologies.
• Create data models tailored for analytics, machine learning, and artificial intelligence applications.
• Foster the adoption of AI tools and autonomous workflows within the data engineering team.
• Discover and apply methods to enhance engineering efficiency through AI.
• Prototype and scale AI-assisted development practices.
• Serve as a primary resource for AI experimentation and knowledge dissemination.
• Aid in establishing best practices and contribute to an AI-centric community or guild.
• Construct pipelines that support machine learning models, large language model (LLM) applications, and AI workflows.
• Ensure high standards of data quality, observability, and reliability.
• Collaborate with Product, Data Science, ML/AI, and DevOps teams.
Requirements
• Minimum of 3 years of professional experience in data engineering.
• Strong expertise in SQL and Python, with a focus on development and optimization.
• Practical experience with Spark/PySpark; familiarity with Databricks is advantageous.
• Experience with cloud data platforms, preferably Azure (including ADF, Synapse, ADLS, Event Hub).
• Comprehensive understanding of ETL/ELT processes, data modeling, and data warehousing concepts.
• Familiarity with orchestration tools such as Airflow and ADF.
• Knowledge of reliability, performance, and production-ready systems.
• Practical experience utilizing AI coding tools (e.g., Copilot, Cursor, Claude Code) in real-world workflows.
• Proven track record of delivering at least one project with AI-assisted development.
• Capability to structure tasks for AI tools and critically assess their outputs.
• Upper Intermediate proficiency in English for effective communication.
Will be a plus
• Experience in configuring AI development environments (agents, integrations, workflows).
• Familiarity with LLMs, embeddings, and RAG architectures.
• Experience with vector databases (e.g., pgvector, FAISS).
• Knowledge of AI/agent frameworks (e.g., LangChain, LlamaIndex).
• Experience with dbt, Kafka, and business intelligence tools.
• Familiarity with data quality tools (e.g., Great Expectations, Soda).
• Multi-cloud experience (AWS/GCP).
• Interest in advanced topics such as evaluation, reranking, drift detection, and synthetic data.
• Contributions to AI/data tooling or open source projects.
• Employees have the option to work remotely.