Remotery

Mid-Level Data Engineer

Posted 2 days ago

This is a fully remote position, open to applicants in the United States.

📋 Description

• Collaborate with senior engineers to create new ETL pipelines and data ingestion processes utilizing AWS Glue (Spark-based, PySpark), MWAA (Airflow), Lambda, and SNS.

• Incorporate the agency's ETL Common Library into Glue jobs to standardize orchestration, manage error handling, record metadata, and send SNS notifications for all successful and failed job events.

• Ingest structured and semi-structured datasets (CSV, XML, JSON, Avro, pipe-delimited) into S3 landing, raw, and curated zones using Apache Iceberg tables.

• Set up static ETL metadata in the centralized PostgreSQL metadata store; ensure that dynamic metadata captures job status and timestamps for all crucial execution steps.

• Oversee assigned production jobs and engage in operations support rotations.

• Ensure that ETL Load Reports are updated in real time and ETL Gap Reports are refreshed weekly.

• Create and maintain materialized views and semantic layer objects in Trino and Athena to improve query performance and keep business logic consistent.

• Generate and keep up-to-date required documentation for each assigned dataset: Business Requirements, ETL Design Documents, Data Models, Data Dictionaries, Mapping Documents, Deployment Documents, O&M Guides, and ETL Test Plans.

• Develop unit and integration tests to meet a minimum code coverage threshold of 90%; conduct security scans at least once per sprint.

• Deploy ETL resources using CloudFormation templates via the agency's CI/CD pipeline.

• Assist in the transition of ETL jobs from other agency teams and participate in disaster recovery exercises.


⛳️ Requirements

• US Citizenship is mandatory.

• A Bachelor's Degree is required.

• 3-5 years of relevant experience is required.

• Practical experience with Python (PEP 8), PySpark, and SQL for ETL pipeline development.

• Familiarity with AWS services, including Glue, S3, MWAA (Airflow), Lambda, SNS, and SQS.

• Knowledge of the Apache Iceberg table format, the Parquet and ORC file formats, and S3 data lake zone concepts.

• Experience with PostgreSQL and basic knowledge of Redshift or Oracle.

• Understanding of Trino or Athena for query and semantic layer development.

• Experience with CloudFormation, GitHub branching workflows, and CI/CD-integrated deployments.

• Ability to create comprehensive ETL documentation, including data models (in Mermaid format) and data dictionaries.

• Understanding of ETL metadata concepts, including static and dynamic metadata, load reports, and gap reports.

• Experience in agile development settings with sprint-based delivery.

• Familiarity with IV&V and/or User Acceptance Testing (UAT) processes in a federal or technical program environment.

• Experience with automated testing frameworks; capability to write unit and integration tests that meet defined code coverage thresholds.

• Knowledge of FISMA, NIST 800-53, and OWASP ASVS Level 2 is a plus.

• Availability to work from 8 am to 5 pm Eastern Time, regardless of home location.

• An active federal public trust suitability determination or the ability to obtain one is required.


🏝️ Benefits

• Flexible work arrangements.

• Continuous learning opportunities.

• Professional development support.

• Special incentives for team members residing in qualified HUBZones.
