
Lead Data Engineer – GenAI, LLM Applications
Posted May 22

This is a fully remote position, open to applicants in India.
• Design, build, and maintain scalable software architectures and data pipelines that integrate with both analytical and operational systems.
• Write clean, reusable, well-tested Python code using frameworks such as Flask and related libraries.
• Use AI-assisted development tools and frameworks, such as GitHub Copilot and LangChain, to design, build, and integrate LLM-powered solutions such as retrieval-augmented generation (RAG) pipelines, intelligent agents, and automated workflows on AWS Bedrock or comparable services.
• Write and optimize complex SQL across Oracle, MS SQL Server, PostgreSQL, and Snowflake, including procedures, functions, views, analytical functions, and dynamic SQL.
• Design and execute ETL pipelines using Snowflake and relevant data processing technologies.
• Implement scheduling and orchestration using Apache Airflow or a similar workflow orchestration framework.
• Establish and uphold data quality frameworks, versioning, and governance practices to ensure data reliability, integrity, and compliance.
• Diagnose production issues and drive continuous improvement in software quality, performance, and reliability.
• Deploy, manage, and support solutions on AWS, including storage, compute, and pipeline services.
• Create source-to-target mappings and assist with data and code migration initiatives.
• Collaborate with stakeholders to gather requirements, translate business needs into technical solutions, and write clear, well-structured documentation.
• Work alongside product managers, analysts, and cross-functional teams to deliver data-driven insights and reporting using tools like Plotly and Power BI.
• Bachelor’s degree or higher in Computer Science, Information Technology, or a related technical discipline.
• Over 5 years of professional experience in software engineering, data engineering, or data-centric development roles.
• Strong expertise in Python, including frameworks and libraries such as Django or Flask, pandas, NumPy, Plotly, and ag-Grid.
• Proficient in SQL with Oracle, MS SQL Server, PostgreSQL, and/or Snowflake.
• Demonstrated experience writing complex SQL, including analytical and window functions, subqueries, all join types, DML/DDL/TCL statements, CASE expressions, and performance tuning.
• Familiarity with cloud platforms, preferably AWS (S3, EC2, Secrets Manager, Bedrock, Lambda).
• Experience with AI-assisted development tools and frameworks like GitHub Copilot and LangChain for creating LLM-powered applications and workflows.
• Proficient with Git-based version control systems and CI/CD pipelines.
• Understanding of data modeling concepts for both structured and unstructured data.
• Strong analytical thinking, problem-solving skills, and effective communication abilities.
• Willingness to engage in all phases of the SDLC, including requirements gathering, design, development, deployment, and production support.
• Preferred qualifications: exposure to the clinical trial lifecycle or clinical data management; data visualization tools (Plotly, Power BI); front-end technologies (HTML5, CSS3, JavaScript); collaboration tools (Jira, Confluence, Microsoft Teams); and practical experience in data analysis or cleansing using programming languages, SQL, and Excel.
• Competitive compensation in line with local market standards
• Comprehensive health and wellness benefits
• Paid time off and company holidays
• Opportunities for professional growth, learning, and career advancement
• Flexibility to work from Bangalore or remotely within India while collaborating with global teams