This is a fully remote position, open to applicants in France.

📋 Description

• Cultivate and enhance the open-source data and infrastructure community by initiating projects, collaborating with data-centric organizations, and organizing events or challenges. Engage with communities such as Apache Parquet, Open Table Formats, and data engineering forums to advocate for best practices and Hugging Face tools.

• Advocate for the Hugging Face Hub as the premier platform for data storage, versioning, and collaboration, curating and highlighting datasets, benchmarks, and tools such as Xet.

• Illustrate use cases such as efficient updates to large datasets, Parquet editing, and deduplication to show the Hub's value for data workflows.

• Develop demonstrations, benchmarks, and tools (for instance, Colab notebooks) that exemplify best practices in data storage and versioning while experimenting with Xet, Parquet, and other formats.

• Create high-quality tutorials, blog posts, and videos that simplify complex topics for broader accessibility.

• Provide insights on optimizing storage, dataset versioning, and deduplication to empower developers.

• Engage actively in online communities (Discord, GitHub, forums) to showcase contributions, respond to inquiries, and encourage collaboration.

• Ensure that datasets and tools released on the Hub are thoroughly documented with clear examples, benchmarks, and use cases.

⛳️ Requirements

• Over 3 years of experience in developer relations or developer advocacy, preferably within data engineering, infrastructure, or ML tools and platforms.

• A well-established public presence as a technical voice, with a history of regularly publishing content related to data, infrastructure, or ML, and a demonstrable, engaged audience on LinkedIn and X (Twitter).

• A portfolio of developer-oriented content including tutorials, blog posts, videos, demos, benchmarks, or conference presentations.

• Practical experience in building and engaging open-source or developer communities on platforms such as Discord, GitHub, and forums.

• Proficiency in Python.

• Hands-on experience with data libraries like pandas, pyarrow, and huggingface/datasets.

• Practical knowledge of storage systems and formats, including Parquet, Open Table Formats, and S3.

• Familiarity with dataset versioning, deduplication, and compression techniques.

• Ability to clearly articulate complex technical concepts through written content, demonstrations, or presentations.

• Proficiency in both written and spoken English.

🏝️ Benefits

• Comprehensive health, dental, and vision benefits for employees and their dependents.

• Parental leave.

• Flexible paid time off policies.

• Flexible working hours.

• Remote work options available.

• Reimbursement for relevant conferences, training, and educational pursuits.

• Company equity included as part of the compensation package.

Data Engineer – Infrastructure Advocate
