
Data Engineer
Posted 4 hours ago

Responsibilities
• Design, build, and maintain scalable AWS data platforms that support batch and streaming pipelines, analytics, and AI/ML workloads, in line with AWS Well-Architected best practices.
• Develop and manage data ingestion, transformation, and enrichment pipelines from both internal systems and external APIs, accommodating structured, semi-structured, unstructured, and graph data.
• Implement data normalization workflows to ensure consistent schemas, high data quality, and reliable downstream analytics, BI, and ML applications.
• Create and enforce data governance policies that include cataloging, lineage, access control, and auditability.
• Build and maintain knowledge graphs that model relationships among core business entities, enabling advanced analytics and inference.
• Identify data gaps, inconsistencies, and missing relationships, drawing on strong analytical and inference skills.
• Integrate data from enterprise systems such as CRM and ERP platforms (Salesforce, HubSpot, SAP, NetSuite, Dynamics 365, Workday).
• Design secure data access layers for analytics, BI, ML, and downstream applications.
• Implement monitoring, observability, and data quality checks to ensure the freshness, completeness, and health of data pipelines (a minimal sketch of such checks follows this list).
• Optimize data architectures for performance and cost using partitioning, indexing, compression, and storage tiering strategies.
• Develop internal tools, dashboards, and standardized scaffolding to improve visibility, maintainability, and onboarding.
• Collaborate with cross-functional teams to deliver impactful data solutions while sharing best practices, documentation, and technical guidance.
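
To illustrate the kind of monitoring and data quality checks referenced above, here is a minimal Python sketch that validates the freshness, completeness, and uniqueness of a batch before it is published. The column names (`customer_id`, `event_ts`) and the thresholds are hypothetical, not part of the posting.

```python
from datetime import datetime, timedelta, timezone

import pandas as pd

# Hypothetical thresholds; in practice these would come from pipeline config.
MAX_STALENESS = timedelta(hours=6)   # freshness: newest record must be recent
MAX_NULL_RATIO = 0.01                # completeness: at most 1% nulls per key column
KEY_COLUMNS = ["customer_id", "event_ts"]


def check_batch(df: pd.DataFrame) -> list[str]:
    """Return a list of data quality violations for a batch (empty means healthy)."""
    problems = []

    # Freshness: the newest event timestamp should fall within the staleness window.
    newest = pd.to_datetime(df["event_ts"], utc=True).max()
    if datetime.now(timezone.utc) - newest > MAX_STALENESS:
        problems.append(f"stale batch: newest event is {newest.isoformat()}")

    # Completeness: key columns should be almost fully populated.
    for col in KEY_COLUMNS:
        null_ratio = df[col].isna().mean()
        if null_ratio > MAX_NULL_RATIO:
            problems.append(f"{col}: {null_ratio:.1%} nulls exceeds threshold")

    # Uniqueness: duplicate keys often point at a non-idempotent upstream step.
    if df.duplicated(subset=KEY_COLUMNS).any():
        problems.append("duplicate rows for key columns")

    return problems
```

A pipeline step would typically run `check_batch` on each batch and fail (or quarantine the data) when the returned list is non-empty, emitting the violations as metrics.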

Requirements
• Extensive experience in designing and operating AWS data platforms, including S3, Glue, Lake Formation, Athena, Redshift, EMR, Kinesis/MSK, DynamoDB, OpenSearch, and Neptune.
• Strong Python programming skills for data engineering, with an emphasis on modular, testable, and maintainable code.
• Strong grasp of distributed data systems, including batch and streaming pipelines, fault tolerance, idempotency, and event-driven architectures.
• Experience with data warehouse and lakehouse architectures, ETL/ELT pipelines, and analytical query engines.
• Practical experience with Spark, Hadoop, Hive, or Flink.
• Strong data modeling skills across normalized, denormalized, and graph-based models, with safe schema evolution practices.
• Advanced SQL skills for analytics and data engineering, including window functions, CTEs, and query optimization (see the illustrative query after this list).
• Experience in integrating external APIs and enterprise systems, particularly CRM and ERP platforms.
• Familiarity with data governance, security, and compliance, including encryption, access control, and audit logging practices.
• Experience implementing monitoring, observability, and data quality checks using CloudWatch and CloudTrail.
• Proficiency in Infrastructure as Code using CloudFormation or Terraform.
• Strong end-to-end ownership mindset, prioritizing scalability, reliability, and long-term maintainability.
• Professional-level English communication skills, capable of articulating data architectures and trade-offs to both technical and non-technical audiences.
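
As a small, self-contained illustration of the window-function and CTE skills listed above, the sketch below uses Python's built-in sqlite3 module (SQLite supports window functions from version 3.25) to deduplicate records down to the latest row per key. The orders table and its columns are invented for the example; on Redshift or Athena the SQL pattern is the same.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id TEXT, order_ts TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('c1', '2024-01-01', 10.0),
        ('c1', '2024-03-01', 25.0),
        ('c2', '2024-02-15', 40.0);
""")

# CTE + ROW_NUMBER() window function: keep only the latest order per customer,
# a common deduplication step in ELT pipelines.
query = """
    WITH ranked AS (
        SELECT customer_id, order_ts, amount,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id
                   ORDER BY order_ts DESC
               ) AS rn
        FROM orders
    )
    SELECT customer_id, order_ts, amount
    FROM ranked
    WHERE rn = 1;
"""
for row in conn.execute(query):
    print(row)  # ('c1', '2024-03-01', 25.0) and ('c2', '2024-02-15', 40.0)
```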

What We Offer
• Remote-First Flexibility: Enjoy work-life balance in a remote-first setting that lets you work from anywhere.
• Innovative Culture: We adopt a startup mindset, fostering creativity, agility, and growth.
• Career Development: Avahi is dedicated to your professional growth, providing mentorship and opportunities for career advancement.
• Purpose-Driven Mission: Join us in making a positive impact. Avahi is committed to promoting diversity, supporting women in tech, and encouraging sustainable practices.
• Global Collaboration: Collaborate with a diverse, skilled team, exchanging insights and working together to create innovative solutions that truly matter.