Remotery

Mid-Level Data Engineer

Posted 5 days ago

This is a fully remote position, open to applicants anywhere in the world.

📋 Description

• Pipeline Development: Design, build, test, and maintain scalable data pipelines (both batch and streaming) and ETL/ELT processes.

• AI Infrastructure: Build and operate data pipelines that support the machine learning lifecycle, handling both structured and unstructured data.

• Quality and Governance: Maintain data quality, integrity, and security by implementing governance and data curation practices for use in predictive models and large language models (LLMs).

• Performance Optimization: Monitor data flow performance and tune complex queries to reduce cost and processing time.

• Collaboration with AI Teams: Work closely with Data Scientists and Machine Learning Engineers to understand their requirements and enable data use at scale.


⛳️ Requirements

• 2–4 years of demonstrated experience as a Data Engineer.

• Proficient in SQL (modeling, optimization, and processing) and Python (data manipulation using Pandas, PySpark, etc.).

• Practical experience with cloud platforms (AWS, GCP, or Azure) and Data Warehouse services (BigQuery, Redshift, or Snowflake).

• Hands-on experience structuring unstructured data (text, PDFs, images) and integrating with vector databases (such as Pinecone, Milvus, Chroma, pgvector, or Weaviate) to support semantic search and RAG (Retrieval-Augmented Generation) systems.

• Familiarity with workflow orchestrators (preferably Apache Airflow).

• Understanding of relational and NoSQL databases.

• Experience working with APIs and integrating various systems.

• Knowledge of natural language processing (NLP) concepts and embeddings.

• Assertive Communication: Capable of interacting with both business and technical teams, clearly explaining technological limitations and opportunities to non-technical stakeholders.

• Critical Thinking and Business Awareness: Focused on identifying root causes of structural issues and prioritizing tasks that provide the highest value and cost efficiency for the company.

• Proactivity/Autonomy and Ownership: Takes responsibility for pipelines, anticipates failures, proactively suggests enhancements, and documents architectural decisions.

• Collaborative Spirit: Empathetic towards the needs of data consumers and willing to share knowledge with the team.

• Adaptability: Resilient in managing scope changes, new data sources, or technology advancements while maintaining a focus on delivery.


🏝️ Benefits

• Care for your health: Medical plan, Dental plan, Telemedicine, and Life Insurance.

• Customizable multi-benefit program (Flash).

• Rest is essential: Paid time off.

• Celebrate your day: Day off on your birthday!

• We offer Gympass to promote a healthy routine.

• Autonomy and flexibility.

• Workplace exercise and Quality of Life initiatives.

• Training and development program, Academia X.

• Start your self-awareness journey: Profiler and behavioral mapping.

