
Data Engineer - Python, PySpark, AWS Glue, Amazon Athena, SQL, Apache Airflow
Posted May 23

This is a fully remote position, open to applicants in Pakistan.
• Develop, enhance, and scale data pipelines and infrastructure utilizing Python, TypeScript, Apache Airflow, PySpark, AWS Glue, and Snowflake.
• Create, operationalize, and oversee data ingestion and transformation workflows, including DAGs, alert mechanisms, retries, SLAs, lineage, and cost management.
• Partner with platform and AI/ML teams to automate data ingestion, validation, and real-time computing workflows; aim towards establishing a feature store.
• Incorporate pipeline health and metrics into engineering dashboards to ensure comprehensive visibility and observability.
• Model data and execute efficient, scalable transformations within Snowflake and PostgreSQL.
• Develop reusable frameworks and connectors to standardize the internal processes of data publishing and consumption.
• Over 4 years of hands-on experience in production data engineering.
• Extensive, practical knowledge of Apache Airflow, AWS Glue, PySpark, and Python-based data pipelines.
• Strong SQL proficiency and experience managing PostgreSQL in active environments.
• Comprehensive understanding of cloud-native data workflows (preferably AWS) and pipeline observability (including metrics, logging, tracing, and alerting).
• Demonstrated experience managing data pipelines from start to finish: design, implementation, testing, deployment, monitoring, and iteration.
• Fully remote position.
• Compensation is paid in USD.
• Working hours align with the EST time zone (9 AM to 6 PM EST) or the PT time zone.