This is a fully remote position, open to applicants in India.

📋 Description

• Design, build, and maintain scalable software architectures and data pipelines that integrate with both analytical and operational systems.

• Produce clean, reusable, and thoroughly tested Python code utilizing frameworks such as Flask and associated libraries.

• Use AI-assisted development tools and frameworks, including GitHub Copilot and LangChain, to design, build, and integrate LLM-powered solutions such as retrieval-augmented generation (RAG) pipelines, intelligent agents, and automated workflows on AWS Bedrock or comparable services.

• Develop and enhance complex SQL queries across Oracle, MS SQL Server, PostgreSQL, and Snowflake, including procedures, functions, views, analytical functions, and dynamic SQL.

• Design and execute ETL pipelines using Snowflake and relevant data processing technologies.

• Implement scheduling and orchestration using Apache Airflow or analogous workflow orchestration frameworks.

• Establish and uphold data quality frameworks, versioning, and governance practices to ensure data reliability, integrity, and compliance.

• Diagnose production issues and drive continuous improvement in software quality, performance, and reliability.

• Deploy, manage, and provide support for solutions on AWS, encompassing storage, compute, and pipeline services.

• Construct source-to-target mappings and assist with data and code migration initiatives.

• Collaborate with stakeholders to collect requirements, convert business needs into technical solutions, and create clear, well-structured documentation.

• Work alongside product managers, analysts, and cross-functional teams to deliver data-driven insights and reporting using tools like Plotly and Power BI.

⛳️ Requirements

• Bachelor’s degree or higher in Computer Science, Information Technology, or a related technical discipline.

• More than 5 years of professional experience in software engineering, data engineering, or data-centric development roles.

• Strong expertise in Python, including frameworks and libraries such as Django or Flask, pandas, NumPy, Plotly, and ag-Grid.

• Proficient in SQL with Oracle, MS SQL Server, PostgreSQL, and/or Snowflake.

• Demonstrated experience in writing complex SQL, including analytical and window functions, subqueries, all types of joins, DML/DDL/TCL statements, CASE expressions, and performance tuning.

• Familiarity with cloud platforms, preferably AWS (S3, EC2, Secrets Manager, Bedrock, Lambda).

• Experience with AI-assisted development tools and frameworks like GitHub Copilot and LangChain for creating LLM-powered applications and workflows.

• Proficient with Git-based version control systems and CI/CD pipelines.

• Understanding of data modeling concepts for both structured and unstructured data.

• Strong analytical thinking, problem-solving skills, and effective communication abilities.

• Willingness to engage in all phases of the SDLC, including requirements gathering, design, development, deployment, and production support.

• Preferred: exposure to the clinical trial lifecycle or clinical data management; experience with data visualization tools (Plotly, Power BI), front-end technologies (HTML5, CSS3, JavaScript), and collaboration tools (Jira, Confluence, Microsoft Teams); and practical experience in data analysis or cleansing using programming languages, SQL, and Excel.

🏝️ Benefits

• Competitive compensation in line with local market standards

• Comprehensive health and wellness benefits

• Paid time off and company holidays

• Opportunities for professional growth, learning, and career advancement

• Flexibility to work from Bangalore or remotely within India while collaborating with global teams

Lead Data Engineer – GenAI, LLM Applications
