Senior Deep Learning Performance Engineer

extra holidays - fully flexible
Remote: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • 5+ years in deep learning model implementation
  • BSc, MS, or PhD in a technical field
  • Proficiency in Python and major frameworks
  • Strong problem-solving and analytical skills
  • Experience with deep learning compilers

Key responsibilities:

  • Profile, analyze, and optimize performance of deep learning models
  • Develop tooling for profiling DL workloads
  • Collaborate to provide performance insights and recommendations
  • Own development of methodologies for high-performance models
  • Conduct benchmarking on GPU clusters and pre-release hardware
NVIDIA XLarge http://www.nvidia.com/
10001 Employees

Job description

We are seeking senior engineers with a passion for performance analysis and optimization to join our team in advancing groundbreaking technologies for deep learning compilers and automated kernel generation. At NVIDIA, you will collaborate across the full hardware/software stack, from GPU architecture to deep learning frameworks, to push the boundaries of AI performance. This role provides an outstanding opportunity to influence both hardware and software roadmaps at a company at the forefront of the AI revolution. You will work alongside world-class engineers to implement innovative deep learning models and optimize end-to-end performance for NVIDIA’s DL software and hardware ecosystem. You'll have the chance to work on powerful, enterprise-grade GPU clusters delivering hundreds of PetaFLOPS, and gain access to unreleased hardware that is shaping the future of AI.

What you’ll be doing:

  • Profile, analyze, and optimize the performance of deep learning models and workloads on groundbreaking hardware and software platforms.

  • Develop tooling for profiling and microbenchmarking DL workloads that run compiled models, in order to uncover optimization opportunities.

  • Collaborate with teams across NVIDIA to provide performance insights and recommendations that improve the design and efficiency of DL frameworks and workloads.

  • Own the development and implementation of standard methodologies for compiling, testing, and deploying high-performance deep learning models.

  • Conduct performance benchmarking on enterprise-grade GPU clusters and pre-release hardware, driving improvements to NVIDIA’s DL software stack and hardware roadmap.

What we need to see:

  • 5+ years of experience in deep learning model implementation, software development, and performance optimization.

  • BSc, MS, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Mathematics, Physics, or a related technical field, or equivalent practical experience.

  • Proficiency in Python, with extensive hands-on experience using at least one major deep learning framework (e.g., PyTorch, TensorFlow, JAX).

  • Strong problem-solving and analytical skills, with a proven track record in debugging, performance tuning, and workload optimization.

  • Experience with deep learning compilers (e.g., PyTorch’s torch.compile, XLA, or similar technologies).

Ways to stand out from the crowd:

  • Experience running large-scale workloads on HPC clusters.

  • Knowledge of, and passion for, DevOps/MLOps practices for developing deep learning-based products.

  • Solid understanding of Linux environments and containerization technologies such as Docker.

  • Familiarity with GPU programming or parallel computing.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most hard-working and forward-thinking people in the world working for us. If you're creative and autonomous, we want to hear from you! We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

#deeplearning

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s): English

Other Skills

  • Collaboration
  • Problem Solving
  • Analytical Skills
