ML Engineer with Audio Processing Expertise (Speech-to-Speech Focus) at Vosyn

Remote: Full Remote
Salary: 10 - 480K yearly
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • 5+ years of experience in machine learning development focused on audio generation and TTS.
  • Strong proficiency in Python and machine learning frameworks such as PyTorch.
  • Deep expertise in audio signal processing and TTS models.
  • Demonstrated ability to provide technical leadership.

Key responsibilities:

  • Provide expert-level advice on TTS model development.
  • Guide implementation of testing methodologies for TTS models.
  • Lead latency optimization initiatives for real-time speech conversion.
  • Mentor team on advanced deep learning models for audio processing.

Vosyn Scaleup http://www.vosyn.ai/
201 - 500 Employees

Job description

Job Title: ML Engineer with Audio Processing Expertise (Speech-to-Speech Focus)

Level: Senior SME

Department: Software Development

Status: Contract (10-15 hours/week)

Work location: Fully Remote

Compensation: Hourly ($250)

Company Overview: At Vosyn, we embrace the exciting, game-changing world of Artificial Intelligence, driving innovation and pioneering impactful projects across various industries. Our incubator, AI Venture Lab, nestled in the heart of Office146.com, is a crucible of entrepreneurial spirit, supported by intelligent processes and industry-leading best practices. We believe in fostering a culture of flexibility, continuous improvement, and solution-focused strategies. Here, every idea is welcomed and nurtured, and has the potential to scale to new heights. Currently, we're at the forefront of a significant IPO endeavor, a true unicorn in the making. We invite you to be part of our journey and leave your imprint on the future of AI. At Vosyn, you will have the opportunity to engage with a fast-growing global organization with diversity of thought, experience, and cultures.

About the Role: We are seeking an experienced ML Engineer SME to provide strategic guidance and technical leadership on key components of our end-to-end speech-to-speech (S2S) pipeline. As a senior project advisor, you will collaborate with the VosynCore team, identifying solutions to complex challenges, particularly in text-to-speech (TTS) model development and optimization. Your expertise will be crucial in driving project progress and ensuring our S2S pipeline meets or exceeds industry standards for quality and performance.

Key Responsibilities:

  • Provide expert-level advice and mentorship on the architecture, training, and production of text-to-speech (TTS) models
  • Guide the implementation of robust testing methodologies for TTS models using industry standards like MOS testing
  • Share expertise in distributed training, monitoring, and deployment of large-scale ML models on cloud platforms
  • Lead latency optimization initiatives in real-time systems for high-quality speech-to-speech conversion
  • Provide guidance on tuning TTS models for precise control over speech characteristics
  • Share in-depth knowledge of various TTS model architectures and waveform generation methods
  • Mentor the team on implementing advanced deep learning models for audio processing
  • Guide the development of transformer architectures for complex TTS model development

Required Qualifications:

  • 5+ years of proven experience in machine learning development focused on audio generation and TTS systems
  • Extensive expertise in audio signal processing, particularly for human voices
  • Deep experience with TTS models, including waveform generation and spectrogram-based methods
  • Proven expertise in tuning TTS models for duration control and speech characteristics
  • Strong proficiency in Python and machine learning frameworks such as PyTorch
  • Experience with advanced deep learning models like WaveNet and transformer-based architectures
  • Demonstrated experience in distributed training and deployment of ML models on cloud platforms
  • Strong understanding of evaluation metrics for TTS systems
  • Proven ability to provide technical leadership and actionable guidance
  • Excellent communication and mentoring skills

Preferred Qualifications:

  • Experience with real-time audio processing systems
  • Background in speech synthesis research
  • Knowledge of multiple languages and accents in TTS
  • Experience with ML model optimization techniques
  • Publications or patents in related fields

Additional Perks:

  • Be part of the invigorating journey of a pre-seed AI startup in stealth mode
  • Engage directly with senior management and strategic advisory board members
  • Gain valuable experience in the bleeding-edge AI space
  • Remote-first culture with flexible working arrangements

DEI and Workplace Safety: At Vosyn Inc., we are committed to fostering a diverse, equitable, and inclusive workplace where every employee feels valued and supported. We believe that diversity of thought, background, and experience enriches our company culture and enhances innovation. We are an equal-opportunity employer and encourage candidates from all walks of life to apply. As part of our commitment to creating a safe and healthy work environment, we prioritize workplace safety, adhering to all relevant regulations and promoting a culture of responsibility. We believe that a safe and inclusive workplace is essential for the well-being and success of our team members.

Recruitment Process:

  1. Initial Screening: Review of application and preliminary assessment
  2. Video Interview: In-depth discussion of ML expertise and role expectations
  3. Technical Panel Interview: Deep dive into audio processing experience and ML architecture approach
  4. Final Selection: Assessment of overall fit and alignment
  5. Offer and Onboarding: Equity compensation details and onboarding process

Join a dynamic global organization that champions diversity in thought, experience, and culture. Our team is composed of top experts from around the world. We invite you to leverage your expertise, mentor future leaders, and thrive with us in this exciting journey.

Apply Now: Vosyn Careers




Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s): English

Other Skills

  • Communication
  • Mentorship
