
Data Engineer – DataOps, Infrastructure Focus
Posted Jun 20

This is a fully remote position, open to applicants in the United States.
• Design, automate, and sustain production-quality data infrastructure on AWS (e.g., S3, EMR, Glue, Lambda, Redshift) utilizing Terraform or CDK, prioritizing high availability, security, and consistent environments throughout the SDLC.
• Incorporate Claude Code and various LLM-based agents into the engineering workflow to expedite infrastructure provisioning, refactoring, and the creation of technical documentation, seamlessly integrating AI into daily development activities.
• Create, develop, and enhance CI/CD pipelines that test, deploy, and monitor dbt models and AWS Glue/Spark jobs, ensuring dependable, repeatable delivery of governed data assets.
• Establish agentic operations for DataOps—configuring AI agents to triage and conduct root-cause analysis of pipeline failures, identify cost-optimization opportunities, and proactively recognize schema drift or data quality regressions.
• Develop scalable, well-governed data pipelines and tables using Apache Iceberg, Airflow (MWAA), and Redshift, with a focus on simplicity, reusability, and clear ownership of data products.
• Implement security and compliance best practices in a regulated insurance setting, encompassing IAM automation, encryption, audit-ready logging, and adherence to enterprise RBAC/MFA standards.
• Collaborate with Product Strategy, PDO, and data science teams to ensure data platforms and features can effectively support AI-intensive products like the Agentic AI Platform, Claim Summary, and Underwriting Assistant at scale.
• Over 5 years of experience in Data Engineering, Data Operations, or Platform Engineering focused on building and managing cloud data infrastructure.
• In-depth expertise with AWS (e.g., S3, EMR, Glue, Lambda, Redshift) and infrastructure-as-code (Terraform preferred; CDK a plus), including the design of secure, resilient architectures.
• Significant experience with dbt in production environments (modeling, testing, documentation, deployment) and modern table formats like Apache Iceberg for large-scale analytics.
• Advanced SQL capabilities (performance tuning, complex joins, and window functions) and solid Python proficiency for automation, orchestration, and data engineering tasks.
• Practical experience with Apache Spark for large-scale batch or streaming workloads, preferably on AWS EMR or Glue.
• Demonstrated success in building or maintaining CI/CD pipelines (Git-based workflows, automated testing, deployment, and monitoring) for data and analytics workloads.
• Strong systems thinking and data modeling abilities (e.g., Kimball, Data Vault), along with familiarity with integrating RDBMS data via CDC patterns.
• Excellent collaborative communication skills and the ability to work effectively with product, security, and business stakeholders in a distributed setting.
• Flexible work environment
• Health and Wellness benefits
• Paid time off programs including volunteer time off
• Market-competitive pay and incentive programs
• Continual development and internal career growth opportunities