
Data Engineer – Mid-level
Posted Jun 3

This is a fully remote position, open to applicants in Brazil.
• Manage and enhance the Data Lake: engage with data across its Raw, Processed, and Refined zones, which includes deduplication processes, cataloging, and optimizing storage (Parquet, Iceberg).
• Operate and oversee streaming pipelines using Kafka: establish topics, connectors, ACLs, and credentials while monitoring consumer lag for real-time streaming.
• Contribute to the development of the Data Platform API: design and maintain modules for our platform that abstract ingestion processes for users and developers (which could involve files, CDC, or data streaming), data processing on Big Data platforms (Spark / Redshift), and pipeline orchestration (Airflow).
• Investigate and troubleshoot incidents: analyze failures in pipelines, dataset loads, data duplication, and cluster errors, and assist in resolving incidents.
• Utilize and contribute to Infrastructure as Code (IaC): implement Terraform and manage AWS resources under team guidance.
• Engage in modernization and integration projects with AI tools: leverage LLMs to build agents and develop tools that incorporate AI into our systems (MCP Servers).
• Maintain documentation and monitoring: contribute to the technical documentation of our tools and architecture, and keep our observability tools up to date.
• Strong experience (3+ years) in Data Engineering or related fields.
• Expertise in Python for creating pipelines, automation scripts, and integrations.
• Basic knowledge of software architecture.
• Practical experience with advanced SQL.
• Familiarity with Apache Airflow.
• Experience with AWS services: S3, Redshift, EMR (Spark).
• Understanding of Apache Kafka: topics, producers/consumers, and connectors (Debezium, S3 Sink).
• Proficient with Git and CI/CD workflows (Bitbucket Pipelines or similar).
• Knowledge of Data Lake architectures (Lakehouse, Medallion Architecture).
• Strong communication skills and the ability to work independently within an agile team.
Nice to have:
• Experience with Apache Spark (PySpark, SparkSQL).
• Familiarity with Terraform or other Infrastructure as Code tools.
• Knowledge of C# / .NET.
• Experience with Debezium for Change Data Capture (CDC).
• Familiarity with modern table formats (Apache Iceberg, Hudi, Delta).
• Understanding of Grafana for monitoring and operational dashboards.
• Experience with OpsGenie/JSM for incident and alert management.
• Familiarity with Redshift.
• Proficient in technical English for reading documentation and communicating with LATAM teams.
• Bradesco National Network Health Plan - extended to dependents, with no per-dependent payroll deduction;
• Optional Bradesco Dental Plan;
• Flexible meal/food allowance (VR/VA) - maintained during vacation;
• Profit-sharing (PLR);
• Wellhub;
• Birthday day off;
• Home office allowance;
• Commuter allowance (VT) as needed - legal deductions apply;
• Life insurance;
• Free access to all our products - AppsClub, Discount Club, TrueCaller, BTFit, and Busuu;
• Access to internal training via digital platforms;
• Peer recognition program for employees - Bemobucks.