Data/Infrastructure Advocate Engineer - EMEA Remote

Key Facts

Remote From: EMEA
Category: Data Engineer
Full time
Mid-level (2-5 years)
English

Other Skills

  • Communication
  • Collaboration
  • Problem Solving

Requirements:

  • 3+ years in developer relations or developer advocacy
  • Strong Python skills
  • Hands-on experience with data libraries such as pandas and pyarrow
  • Practical experience with storage systems and formats

Roles & Responsibilities

  • Grow and nurture the open-source data/infra community
  • Promote the Hugging Face Hub for data storage and collaboration
  • Create demos, benchmarks, and tools for data storage best practices
  • Produce high-quality tutorials, blog posts, and videos

Job description

At Hugging Face, we're on a journey to democratize good AI. We're building the fastest-growing platform for AI builders, with over 5 million users and 100k organizations who have shared more than 1M models, 300k datasets, and 300k apps. Our open-source libraries have more than 400k stars on GitHub.

About the Role

As our first Data/Infrastructure Advocate Engineer, you'll bridge the gap between cutting-edge data infrastructure and the global community of data engineers, researchers, and developers. You'll champion Xet storage on the Hugging Face Hub, helping users efficiently store, version, and collaborate on large-scale datasets. This role is for someone who thrives at the intersection of technical depth (storage, Parquet, deduplication) and community advocacy, helping define the future of open data workflows.

You'll collaborate with teams like Datasets, Hub, and Infrastructure to shape how developers interact with data on our platform, and inspire a community to build better, faster, and more scalable data pipelines.

Your main missions

  • Grow and nurture the open-source data/infra community: launch initiatives, collaborate with data-focused groups, and organize events or challenges. Engage with communities like Apache Parquet, Open Table Formats, and data engineering forums to promote best practices and Hugging Face tools.
  • Promote the Hugging Face Hub as the go-to platform for data storage, versioning, and collaboration, curating and showcasing datasets, benchmarks, and tools like Xet.
  • Highlight use cases like efficient large-dataset updates, Parquet editing, and deduplication to demonstrate the Hub's value for data workflows.
  • Create demos, benchmarks, and tools (for example Colab notebooks) that illustrate best practices for data storage and versioning, and experiment with Xet, Parquet, and other formats.
  • Produce high-quality tutorials, blog posts, and videos that make complex topics accessible.
  • Share insights on storage optimization, dataset versioning, and deduplication to empower developers.
  • Actively participate in online communities (Discord, GitHub, forums) to highlight contributions, answer questions, and foster collaboration.
  • Make sure datasets and tools released on the Hub are well-documented, with clear examples, benchmarks, and use cases.

About You

You're already an active voice in the data and ML community. You build in public, you publish, and people follow your work on LinkedIn and X.

You're a hands-on builder who loves experimenting with data tools, storage optimization, and dataset versioning. You can take a complex topic like deduplication, compression, or Parquet editing and make it click for other developers through writing, demos, or talks. You're passionate about open source and knowledge sharing, and you thrive in fast-moving environments.

What you'll need

  • 3+ years in developer relations or developer advocacy, ideally for data engineering, infrastructure, or ML tools and platforms
  • An established public presence as a technical voice, with a track record of regularly publishing data/infra/ML content and a demonstrable, engaged audience on LinkedIn and X (Twitter)
  • A portfolio of developer-facing content you can point to: tutorials, blog posts, videos, demos, benchmarks, or conference talks
  • Hands-on experience building and engaging open-source or developer communities (Discord, GitHub, forums)
  • Strong Python skills
  • Hands-on experience with data libraries such as pandas, pyarrow, and huggingface/datasets
  • Practical experience with storage systems and formats: Parquet, Open Table Formats, and S3
  • Working knowledge of dataset versioning, deduplication, and compression
  • Ability to explain complex technical topics clearly through writing, demos, or talks
  • Fluent written and spoken English

Nice to have

  • Experience with the Hugging Face Hub and datasets ecosystem, or with Xet
  • Open-source maintainer or contributor experience
  • Familiarity with large-scale data pipelines and data engineering workflows
  • Experience producing notebooks (for example Colab) for tutorials and benchmarks

A note on fit

If you're interested in joining us but don't tick every box above, we still encourage you to apply. We're building a diverse team whose skills, experiences, and backgrounds complement one another, and we're happy to consider where you might make the biggest impact.

How to apply

At Hugging Face we believe great AI shouldn't require a massive cluster; we build for everyone, especially the GPU-poor. And because we genuinely read every application, here's a small way to show that you read this one too: start your cover letter with the words “GPU-poor and proud of it 🤗”. No trick, no catch: it just tells us a real person is on the other side.

More about Hugging Face

We are actively working to build a culture that values diversity, equity, and inclusivity. We are intentionally building a workplace where you feel respected and supported, regardless of who you are or where you come from. We believe this is foundational to building a great company and community, as well as the future of machine learning more broadly. Hugging Face is an equal opportunity employer, and we do not discriminate based on race, ethnicity, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or ability status.

We value development. You will work with some of the smartest people in our industry. We are an organization that has a bias for impact and is always challenging ourselves to grow continuously. We provide all employees with reimbursement for relevant conferences, training, and education.

We care about your well-being. We offer flexible working hours and remote options. We offer health, dental, and vision benefits for employees and their dependents. We also offer parental leave and flexible paid time off.

We support our employees wherever they are. While we have office spaces in NYC and Paris, we're very distributed, and all remote employees have the opportunity to visit our offices. If needed, we'll also outfit your workstation to ensure you succeed.

We want our teammates to be shareholders. All employees have company equity as part of their compensation package. If we succeed in becoming a category-defining platform in machine learning and artificial intelligence, everyone enjoys the upside.
