This is a fully remote position, open to applicants in Brazil.

📋 Description

• Design, develop, and maintain scalable data pipelines that facilitate the ingestion, transformation, and delivery of data into centralized feature stores, model-training workflows, and real-time inference services.

• Construct and enhance workflows for extracting, storing, and retrieving semantic representations of unstructured data to support advanced search and retrieval functionalities.

• Architect and implement streamlined analytics and dashboarding solutions that provide a natural language query experience along with AI-driven insights.

• Define and execute processes for managing prompt engineering techniques, orchestration workflows, and model fine-tuning routines that enhance conversational interfaces.

• Oversee vector data stores and develop effective indexing strategies to support retrieval-augmented generation (RAG) workflows.

• Collaborate with data stakeholders to collect requirements for language-model projects and convert these into scalable solutions.

• Create and maintain detailed documentation for all data processes, workflows, and model deployment routines.

• Stay current with emerging techniques in data engineering, MLOps, and LLM operations.

⛳️ Requirements

• 8+ years of experience in Data Engineering, including 2+ years concentrating on MLOps.

• Strong English communication abilities.

• Effective oral and written communication skills for engaging with the BI team and user community.

• Proven experience in using Python for data engineering tasks, encompassing transformation, advanced data manipulation, and large-scale data processing.

• Deep understanding of vector databases and RAG architectures, and their role in driving semantic retrieval workflows.

• Proficiency in integrating open-source LLM frameworks into data engineering workflows for model training, customization, and scalable inference.

• Experience with cloud platforms such as AWS or Azure Machine Learning for managed LLM deployments.

• Practical experience with big data technologies including Apache Spark, Hadoop, and Kafka for distributed processing and real-time data ingestion.

• Experience in designing complex data pipelines that extract data from RDBMS, JSON, API, and flat file sources.

• Demonstrated expertise in SQL and PL/SQL programming, advanced knowledge of Business Intelligence and data warehousing methodologies, and hands-on experience with one or more relational database systems and cloud-based database services such as Snowflake or Redshift.

• Familiarity with software engineering principles and experience working within Unix/Linux/Windows operating systems, along with exposure to Agile methodologies.

• Proficiency with version control systems, including managing code repositories, branching, merging, and collaborating in a distributed development environment.

• Interest in business operations and a comprehensive understanding of how robust BI systems enhance corporate profitability by empowering data-driven decision-making and strategic insights.

🏝️ Benefits

• Please attach your CV in English.

• The interview process will be conducted in English.

• Only accepting applicants from LATAM.

Senior Data/ML Engineer
