This is a fully remote position, open to applicants in the United States.

📋 Description

• Design, automate, and maintain production-quality data infrastructure on AWS (e.g., S3, EMR, Glue, Lambda, Redshift) using Terraform or CDK, prioritizing high availability, security, and consistent environments throughout the SDLC.

• Incorporate Claude Code and various LLM-based agents into the engineering workflow to expedite infrastructure provisioning, refactoring, and the creation of technical documentation, seamlessly integrating AI into daily development activities.

• Build, maintain, and improve CI/CD pipelines that test, deploy, and monitor dbt models and AWS Glue/Spark jobs, ensuring dependable, repeatable delivery of governed data assets.

• Establish agentic operations for DataOps: configure AI agents to triage pipeline failures and perform root-cause analysis, identify cost-optimization opportunities, and proactively detect schema drift or data quality regressions.

• Develop scalable, well-governed data pipelines and tables using Apache Iceberg, Airflow (MWAA), and Redshift, with a focus on simplicity, reusability, and clear ownership of data products.

• Implement security and compliance best practices in a regulated insurance setting, encompassing IAM automation, encryption, audit-ready logging, and adherence to enterprise RBAC/MFA standards.

• Collaborate with Product Strategy, PDO, and data science teams to ensure data platforms and features can effectively support AI-intensive products like the Agentic AI Platform, Claim Summary, and Underwriting Assistant at scale.

⛳️ Requirements

• Over 5 years of experience in Data Engineering, Data Operations, or Platform Engineering focused on building and managing cloud data infrastructure.

• In-depth expertise with AWS (e.g., S3, EMR, Glue, Lambda, Redshift) and infrastructure-as-code (Terraform preferred; CDK a plus), including the design of secure, resilient architectures.

• Significant experience with dbt in production environments (modeling, testing, documentation, deployment) and modern table formats like Apache Iceberg for large-scale analytics.

• Advanced SQL capabilities (performance tuning, complex joins, and window functions) and solid Python proficiency for automation, orchestration, and data engineering tasks.

• Practical experience with Apache Spark for large-scale batch or streaming workloads, preferably on AWS EMR or Glue.

• Demonstrated success in building or maintaining CI/CD pipelines (Git-based workflows, automated testing, deployment, and monitoring) for data and analytics workloads.

• Strong systems thinking and data modeling abilities (e.g., Kimball, Data Vault), along with familiarity with integrating RDBMS data via CDC patterns.

• Excellent collaborative communication skills and the ability to work effectively with product, security, and business stakeholders in a distributed setting.

🏝️ Benefits

• Flexible work environment

• Health and Wellness benefits

• Paid time off programs including volunteer time off

• Market-competitive pay and incentive programs

• Continual development and internal career growth opportunities

Data Engineer – DataOps, Infrastructure Focus

