Offer summary

Qualifications:

Expertise in machine learning optimization

Experience with PyTorch and Kubernetes

Strong understanding of model deployment

Background in speech and audio processing

Key responsibilities:

Collaborate with multidisciplinary teams to optimize models for production

Optimize machine learning infrastructure for scaling

Job description

The Personalization team makes deciding what to play next easier and more enjoyable for every listener. From Discover Weekly to AI DJ, we’re behind some of Spotify’s most-loved features. We built them by understanding the world of music and podcasts better than anyone else. Join us and you’ll keep millions of users listening by making great recommendations – and providing valuable context – to each and every one of them.

Do you want to help Spotify invent new personalized sessions with generative voice AI to delight users? In this role, you’ll work with Spotify’s Text-to-Speech (TTS) team, Speak, to create generated voice audio that enriches users’ experience of music and podcast recommendations.

What You'll Do

Collaborate with a multidisciplinary team to optimize machine learning models for production use cases, ensuring they are highly efficient and scalable

Design and build efficient serving infrastructure for machine learning models that supports large-scale deployments across different regions

Optimize machine learning models in PyTorch or other libraries for real-time serving and production applications

Lead the effort to transition machine learning models from research and development into production, working closely with researchers and machine learning engineers

Build and maintain scalable Kubernetes clusters to manage and deploy machine learning models, ensuring reliability and performance

Implement and monitor logging and metrics, diagnose infrastructure issues, and contribute to an on-call schedule to maintain production stability

Influence technical design, architecture, and infrastructure decisions to support new and diverse machine learning architectures

Collaborate with stakeholders to drive initiatives related to serving and optimizing machine learning models at scale

Who You Are

You have a passion for speech, audio and/or generative machine learning

You have world-class expertise in optimizing machine learning models for production use cases, and extensive experience with machine learning frameworks like PyTorch

You are experienced in building efficient, scalable infrastructure to serve machine learning models, and managing Kubernetes clusters in multi-region setups

You have a strong understanding of how to bring machine learning models from research to production and are comfortable working with innovative, cutting-edge architectures

You are familiar with implementing logging and metrics and diagnosing production issues, and are willing to take part in an on-call schedule to maintain uptime and performance

You have a collaborative mindset and enjoy working closely with research scientists, machine learning engineers, and backend engineers to innovate and improve model deployment pipelines

You thrive in environments that require solving complex infrastructure challenges, including scaling and performance optimization

Experience with low-level machine learning libraries (e.g., Triton, CUDA) and performance optimization for custom components is a bonus

Where You'll Be

We offer you the flexibility to work where you work best! For this role, you can be based anywhere in the European region where we have a work location.

This team operates within the GMT/CET time zone for collaboration.

France is excluded due to on-call restrictions.

Required profile