Remotery

Data Engineer

Posted Jun 20

This is a fully remote position, open to applicants in Germany.

📋 Description

• Design, construct, and enhance comprehensive ETL pipelines for legal data across various jurisdictions, encompassing tasks such as cleaning, transformation, chunking, validation, embedding, and ingestion into vector databases.

• Engage extensively with XML-based legal data feeds: parse, validate, normalize, and convert XML structures into scalable internal schemas and unified document formats.

• Create and sustain data models and storage schemas that accommodate continuously updated datasets, ensuring consistency, scalability, and accuracy across diverse datasets and substantial volumes of data.

• Oversee the data handover and integration process from multiple internal and external data providers, including official sources, APIs, and web scraping pipelines, ensuring dependability and timely updates.

• Execute and continually enhance metadata enrichment strategies to optimize searchability, ranking quality, and relevance of legal information within vector databases.

• Establish and maintain a high-performance search and retrieval infrastructure that enables agent-based systems to invoke search functions and efficiently retrieve the most pertinent legal information.

• Collaborate with product, AI, and legal domain specialists to deliver high-quality, dependable data solutions.

• Take complete ownership of the data integration for one jurisdiction from start to finish.


⛳️ Requirements

• A minimum of 2 years of professional experience in data engineering, with involvement in successfully deployed projects.

• Proficient in Python, with experience in designing robust data pipelines.

• Experience in constructing and maintaining reliable ETL and RAG pipelines, along with a solid understanding of data modeling, quality, filtering, validation, and consistency.

• Familiarity with containerization (Docker), CI/CD pipelines, and version control systems (Git).

• Strong understanding of data structures, algorithms, system design principles, and software engineering best practices.

• Expertise in working with graph databases and familiarity with developing and deploying NLP models are an advantage.

• Proficiency in English at the C2 level.


🏝️ Benefits

• Remote: 100% remote work available (with a German residence); other countries can be considered upon request.

• Working hours: Flexible working hours.

• Vacation: 26 days plus December 24th & 31st off, and an additional vacation day for each year of employment (capped at 30 days).

• Discounts: e.g., Urban Sports Club Membership, dependent on location.

• Equipment: Laptop (Lenovo or Mac), along with a €1,000 net home office setup budget (disbursed with your first salary).
