Key Facts

Remote From:

Canada, California (USA), Massachusetts (USA), New York (USA), Washington (USA), United States

Full time

Senior (5-10 years)

208 - 328K yearly

English

Hard Skills

GPU Optimization, Product Management, Electronic Hardware Design, Leadership, Digital Project Management, Agent-Based Model, Product Requirements Documents, Software Architecture, Software Release Life Cycle, Machine Learning, +4 more

Other Skills

• Decision Making
• Communication
• Teamwork
• Empathy

Requirements

12+ years of demonstrated experience in product management at a technology company, as a co-founder or in a related technical role at a startup, or equivalent experience.
Bachelor's degree in Computer Science or a related field (or equivalent experience).
Proven experience in AI inference, distributed systems, and GPU-accelerated computing.
Deep understanding of the LLM inference lifecycle (prefill vs. decode), KV cache mechanics, and distributed serving techniques such as disaggregated serving.

Roles & Responsibilities:

Drive the product strategy for Dynamo’s modular components, including the KV-aware Router, KV Block Manager (KVBM), and communication planes, and define the roadmap for high-scale LLM and Generative AI serving.
Define requirements for Inference Orchestration, including routing logic that minimizes redundant prefill and optimizes Time to First Token (TTFT) across substantial GPU clusters.
Define strategy for multi-tier KV cache offloading to enable long-context windows and high-concurrency serving without compromising user experience.
Collaborate with engineering on hardware-software co-design to maximize Dynamo performance on NVIDIA hardware; author product requirements documents (PRDs) and software application design docs (SADDs); build for ease of use, extensibility, and modularity; align roadmaps with TPMs and market trends.

Job description

NVIDIA is seeking a highly technical Product Manager to own the evolution of NVIDIA Dynamo, our flagship distributed inference framework. In this role, you will define the roadmap for high-scale LLM and Generative AI serving, bridging the gap between cutting-edge hardware (Vera Rubin, LPU, and NVLink) and software optimizations such as disaggregated serving, KV-aware routing, and intelligent KV cache management. We need a self-starter to continue growing the product portfolio and work with customers to incorporate model evaluation into end-to-end LLM workflows. We're looking for a rare blend of technical and product skills and a passion for groundbreaking technology. If this fits, we would love to learn more about you!

What you'll be doing:

Core Dynamo Architecture: Drive the product strategy for Dynamo’s modular components, including the KV-aware Router, KV Block Manager (KVBM), and communication planes.
Inference Orchestration: Define requirements for sophisticated routing logic that minimizes redundant prefill and optimizes Time to First Token (TTFT) across substantial GPU clusters.
Memory & KV Cache Management: Define strategy for multi-tier KV cache offloading to enable long-context windows and high-concurrency serving without compromising user experience.
Hardware-Software Co-Design: Collaborate with engineering to ensure Dynamo extracts maximum performance from NVIDIA hardware.
Agentic Inference: Develop agent-first capabilities (e.g., priority, output length, cache pinning) to support sophisticated, multi-turn reasoning.
Ecosystem Integration: Partner with open-source communities (e.g., vLLM, SGLang, TensorRT-LLM) and internal teams (NeMo Agent Toolkit).
Product Leadership: Author product requirements documents (PRDs) and software application design docs (SADDs). Build for ease of use, extensibility, and modularity. Work with TPMs to align roadmaps and respond to market trends.

What we need to see:

12+ years of demonstrated experience in product management at a technology company, as a co-founder or in a related technical role at a startup, or equivalent experience.
Bachelor's degree in Computer Science or a related field (or equivalent experience).
Proven experience in AI inference, distributed systems, and GPU-accelerated computing.
Deep understanding of the LLM inference lifecycle (prefill vs. decode), KV cache mechanics, and distributed serving techniques such as disaggregated serving.
Ability to translate low-level technical capabilities into high-level business value (reduced TCO, faster TTFT).
Teamwork and influencing skills to navigate a highly matrixed environment effectively. At NVIDIA, your entire company is on your team!
Empathy and deep care for your customers to build products people love.
Pragmatic, data-driven project management skills to balance software development lifecycle requirements, product release schedules, and customer needs, and to deliver quality software on schedule.

Ways to stand out from the crowd:

Proven track record working with agentic frameworks (LangChain, NeMo Agents) or building multi-turn, stateful AI applications.
Knowledge of trends around LLMs and Generative AI, Responsible AI, and MLOps.
Technical background and hands-on experience building AI (and LLM) solutions as an engineer. We expect you to have intuition for evaluating ML models and systems and to read relevant research papers to inform your product strategy and roadmap.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer you and your family at www.nvidiabenefits.com/

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 208,000 USD - 327,750 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until April 13, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
