
Data Lead – Architecture, Ingestion, Platform Enablement
Posted Jun 19

This is a fully remote position, open to applicants in Virginia.
• Convert RFQ scope and OEB objectives into a comprehensive solution blueprint that encompasses ingestion, storage, access, and analytics for DB/DC/Health benefits data.
• Create a secure, cloud-based Data Hub architecture (utilizing AWS, Databricks, Redshift, Glue, and cataloging) that is optimized for performance, scalability, data quality, and resilience.
• Develop logical and physical data models for structured and semi-structured sources from third-party recordkeepers and internal systems, ensuring testability for IV&V.
• Establish standards for file formats, schemas, metadata, data versioning, and reconciliation patterns to support analytics, IV&V, and audit preparedness.
• Design and prototype ingestion pipelines, transformation jobs, and storage layouts that align with architectural and IV&V requirements.
• Provide guidance on query optimization, clustering/partitioning, workload management, cost controls, and monitoring for DB/DC/Health datasets within Redshift and Databricks.
• Assist in troubleshooting data quality or performance issues identified during IV&V and testing phases.
• Collaborate with Governance, Security, and IV&V leads to ensure successful delivery within the June–November timeframe and support the goal of a November 2026 go-live.
• Permanent Residency or US Citizenship is required.
• A Bachelor's degree in Computer Science, Information Systems, Business IT Management, or equivalent practical experience is needed.
• A minimum of 8 years of experience in data architecture or data engineering, including at least 3 years in roles focused on cloud data platforms.
• Hands-on experience with AWS data services, including S3, Glue, Glue Data Catalog, Redshift, and Athena, or equivalent.
• Databricks experience is mandatory, with a strong preference for familiarity with Unity Catalog, Delta Lake, and ingestion pipeline design.
• Expertise in designing RBAC and ABAC security models for sensitive or regulated data environments is essential.
• Strong skills in data modeling (both logical and physical), schema design, and metadata management.
• Proven experience in ingesting structured and semi-structured data from third-party or external recordkeeper systems.
• Capability to produce architecture documentation and standards that can be assessed and validated by an independent IV&V team.
• Medical coverage.
• Dental coverage.
• 401(k) matching program.
• Flexible spending options.
• Up to 27 days of paid time off per year.
• Opportunities for remote work.
• Parental leave.
• Wellness benefits.