
Senior Data/ML Engineer
Posted 6 days ago

This is a fully remote position, open to applicants in Brazil.
• Design, develop, and maintain scalable data pipelines that facilitate the ingestion, transformation, and delivery of data into centralized feature stores, model-training workflows, and real-time inference services.
• Construct and enhance workflows for extracting, storing, and retrieving semantic representations of unstructured data to support advanced search and retrieval functionalities.
• Architect and implement streamlined analytics and dashboarding solutions that provide a natural language query experience along with AI-driven insights.
• Define and execute processes for managing prompt engineering techniques, orchestration workflows, and model fine-tuning routines that enhance conversational interfaces.
• Oversee vector data stores and develop effective indexing strategies to support retrieval-augmented generation (RAG) workflows.
• Collaborate with data stakeholders to collect requirements for language-model projects and convert these into scalable solutions.
• Create and maintain detailed documentation for all data processes, workflows, and model deployment routines.
• Stay current with emerging techniques in data engineering, MLOps, and LLM operations, and be open to learning them.
• 8+ years of experience in Data Engineering, including 2+ years concentrating on MLOps.
• Strong English communication abilities.
• Effective oral and written communication skills for engaging with the BI team and user community.
• Proven experience in using Python for data engineering tasks, encompassing transformation, advanced data manipulation, and large-scale data processing.
• Deep understanding of vector databases and RAG architectures, and their role in driving semantic retrieval workflows.
• Proficiency in integrating open-source LLM frameworks into data engineering workflows for model training, customization, and scalable inference.
• Experience with cloud platforms such as AWS or Azure Machine Learning for managed LLM deployments.
• Practical experience with big data technologies including Apache Spark, Hadoop, and Kafka for distributed processing and real-time data ingestion.
• Experience in designing complex data pipelines that extract data from RDBMS, JSON, API, and flat file sources.
• Demonstrated expertise in SQL and PL/SQL programming, advanced knowledge of Business Intelligence and data warehousing methodologies, and hands-on experience with one or more relational database systems and cloud-based database services such as Snowflake or Redshift.
• Familiarity with software engineering principles, experience working on Unix, Linux, and Windows operating systems, and exposure to Agile methodologies.
• Proficiency with version control systems, including managing code repositories, branching, merging, and collaborating in a distributed development environment.
• Interest in business operations and a comprehensive understanding of how robust BI systems enhance corporate profitability by empowering data-driven decision-making and strategic insights.
• Please attach CV in English.
• The interview process will be conducted in English.
• Only accepting applicants from LATAM.