This is a fully remote position, open to applicants in Brazil.

📋 Description

• Manage and enhance the Data Lake: work with data across the Raw, Processed, and Refined zones, including deduplication, cataloging, and storage optimization (Parquet, Iceberg).

• Operate and oversee streaming pipelines using Kafka: set up topics, connectors, ACLs, and credentials, and monitor consumer lag in real-time streams.

• Contribute to the development of the Data Platform API: design and maintain platform modules that abstract, for users and developers, data ingestion (files, CDC, or streaming), data processing on Big Data platforms (Spark / Redshift), and pipeline orchestration (Airflow).

• Investigate and troubleshoot incidents: analyze failures in pipelines, dataset loads, data duplication, and cluster errors, and assist in resolving incidents.

• Use and contribute to Infrastructure as Code (IaC): write Terraform and manage AWS resources under team guidance.

• Engage in modernization and integration projects with AI tools: leverage LLMs to build agents and develop tools that incorporate AI into our systems (MCP Servers).

• Maintain documentation and monitoring: contribute to the technical documentation of our tools and architecture, and keep our observability tools maintained.

⛳️ Requirements

• Strong experience (3+ years) in Data Engineering or related fields.

• Expertise in Python for creating pipelines, automation scripts, and integrations.

• Basic knowledge of software architecture.

• Practical experience with advanced SQL.

• Familiarity with Apache Airflow.

• Experience with AWS services: S3, Redshift, EMR (Spark).

• Understanding of Apache Kafka: topics, producers/consumers, and connectors (Debezium, S3 Sink).

• Proficiency with Git and CI/CD workflows (Bitbucket Pipelines or similar).

• Knowledge of Data Lake architectures (Lakehouse, Medallion Architecture).

• Strong communication skills and the ability to work independently within an agile team.

Nice to have:

• Experience with Apache Spark (PySpark, SparkSQL).

• Familiarity with Terraform or other Infrastructure as Code tools.

• Knowledge of C# / .NET.

• Experience with Debezium for Change Data Capture (CDC).

• Familiarity with modern table formats (Apache Iceberg, Hudi, Delta).

• Understanding of Grafana for monitoring and operational dashboards.

• Experience with OpsGenie/JSM for incident and alert management.

• Familiarity with Redshift.

• Proficiency in technical English for reading documentation and communicating with LATAM teams.

🏝️ Benefits

• Bradesco National Network Health Plan - extended to dependents, with no payroll deduction per dependent;

• Optional Bradesco Dental Plan;

• Flexible meal/food allowance (VR/VA) - maintained during vacation;

• Profit-sharing (PLR);

• Wellhub;

• Birthday day off;

• Home office allowance;

• Commuter allowance (VT) as needed - legal deductions apply;

• Life insurance;

• Free access to all our products - AppsClub, Discount Club, TrueCaller, BTFit, and Busuu;

• Access to internal training via digital platforms;

• Internal peer recognition program - Bemobucks.

Data Engineer – Mid-level
