This is a fully remote position, open to applicants in the United States.

📋 Description

• Design, create, and sustain robust ETL/ELT pipelines to ingest, transform, and distribute data across enterprise platforms.

• Construct scalable frameworks for data ingestion of structured and semi-structured data, including XBRL filings and financial datasets.

• Implement data transformation logic to facilitate analytics, reporting, and regulatory requirements.

• Ensure data pipelines are dependable, high-performing, and scalable in cloud environments.

• Utilize AI-assisted development tools to expedite the development, testing, and optimization of pipelines.

• Develop and manage data solutions using AWS services and related tooling (e.g., S3, Glue, Lambda, Redshift, and Airflow DAGs).

• Implement and optimize Apache Iceberg tables for large-scale, ACID-compliant data lakes.

• Support lakehouse architectures that integrate data lakes and data warehouses.

• Optimize data storage and retrieval strategies for enhanced performance and cost efficiency.

• Enable data platforms that support AI/ML workloads and downstream generative AI applications.

• Design and implement CI/CD pipelines for data pipelines, infrastructure, and analytics code.

• Automate build, test, and deployment processes for ETL pipelines and data platform components.

• Apply DataOps best practices, including version control, automated testing, environment promotion, and rollback strategies.

• Ensure reproducibility, reliability, and governance of data pipeline deployments across environments.

• Integrate AI-driven testing and monitoring tools to enhance pipeline quality and minimize operational risks.

• Design and implement materialized views and other performance optimization techniques to enhance query efficiency.

• Develop pipelines for ingesting, parsing, and normalizing XBRL data.

• Apply context engineering principles to ensure data is enriched with useful metadata, lineage, and business context.

• Collaborate with data architects, analysts, and business stakeholders to understand data needs and deliver effective solutions.

• Participate in Agile teams to deliver data capabilities and enhancements iteratively.

⛳️ Requirements

• Bachelor’s degree in Computer Science, Engineering, Data Science, or a related discipline.

• Over 5 years of experience in data engineering, ETL development, or data platform engineering.

• Strong practical experience with ETL/ELT tools and frameworks, AWS data services (S3, Glue, Lambda, Redshift, etc.), and Apache Iceberg along with modern data lake architectures.

• Experience in designing and implementing CI/CD pipelines for data platforms and ETL workflows.

• Proven proficiency in using AI tools and AI-assisted development workflows (e.g., LLM copilots, automated code generation, pipeline optimization tools).

• Experience in processing XBRL or complex financial/regulatory datasets.

• Proficiency in SQL and Python.

• Experience with implementing materialized views and techniques for query optimization.

• Understanding of data modeling concepts and metadata management.

• Familiarity with data governance, data quality practices, and data readiness for AI/ML applications.

• Ability to thrive in Agile, DevOps-oriented environments.

• U.S. Citizenship is required; must be able to obtain and maintain a federal clearance.

🏝️ Benefits

• Health insurance

• Paid time off

• Professional development

Data Engineer
