
AI Data Engineer
Posted 18 hours ago

Responsibilities
• Develop and maintain scalable data pipelines using Spark, Databricks, and cloud technologies.
• Create data models tailored for analytics, machine learning, and artificial intelligence applications.
• Foster the adoption of AI tools and autonomous workflows within the data engineering team.
• Discover and apply methods to enhance engineering efficiency through AI.
• Prototype and scale AI-assisted development practices.
• Serve as a primary resource for AI experimentation and knowledge dissemination.
• Aid in establishing best practices and contribute to an AI-centric community or guild.
• Construct pipelines that support machine learning models, large language model (LLM) applications, and AI workflows.
• Ensure high standards of data quality, observability, and reliability.
• Collaborate with Product, Data Science, ML/AI, and DevOps teams.
Requirements
• Minimum of 3 years of professional experience in data engineering.
• Strong expertise in SQL and Python, with a focus on development and optimization.
• Practical experience with Spark/PySpark; familiarity with Databricks is advantageous.
• Experience with cloud data platforms, preferably Azure (including ADF, Synapse, ADLS, Event Hub).
• Comprehensive understanding of ETL/ELT processes, data modeling, and data warehousing concepts.
• Familiarity with orchestration tools such as Airflow and ADF.
• Knowledge of reliability, performance, and production-ready systems.
• Practical experience utilizing AI coding tools (e.g., Copilot, Cursor, Claude Code) in real-world workflows.
• Proven track record of delivering at least one project with AI-assisted development.
• Capability to structure tasks for AI tools and critically assess their outputs.
• Upper Intermediate proficiency in English for effective communication.
Will be a plus
• Experience in configuring AI development environments (agents, integrations, workflows).
• Familiarity with LLMs, embeddings, and RAG architectures.
• Experience with vector databases (e.g., pgvector, FAISS).
• Knowledge of AI/agent frameworks (e.g., LangChain, LlamaIndex).
• Experience with dbt, Kafka, and business intelligence tools.
• Familiarity with data quality tools (e.g., Great Expectations, Soda).
• Multi-cloud experience (AWS/GCP).
• Interest in advanced topics such as evaluation, reranking, drift detection, and synthetic data.
• Contributions to AI/data tooling or open source projects.
• Employees have the option to work remotely.