Key Facts

Remote From:

Full time

Senior (5-10 years)

English

Hard Skills

GPU Optimization, Large Language Modeling, Multi-Agent Systems, Inference Engine, Debugging, C++ (Programming Language), Linux, Python (Programming Language), Proof Of Concept (POC) Development

Other Skills

• Collaboration
• Communication
• Teamwork
• Social Skills
• Problem Solving

Requirements

BS/MS/PhD in Computer Science, Electrical Engineering, AI/ML, or equivalent experience
5+ years of experience in deep learning, machine learning, or distributed AI systems
Strong programming and debugging experience in Python, C/C++, and Linux environments
Hands-on experience building LLM and generative AI applications

Roles & Responsibilities

Localize the future: Fine-tune LLMs to speak the authentic language of specific regions and industries
Develop and optimize training and inference workflows with partners and collaborate with internal NVIDIA development teams to improve our software stack
Build sophisticated agentic systems featuring multi-agent coordination, long-horizon reasoning, and advanced planning frameworks
Develop full-scale solutions, including domain-specific enterprise agents and high-performance retrieval pipelines (RAG) spanning various data sources

Job description

Join NVIDIA as a Solutions Architect to help LATAM build culturally nuanced LLMs and empower local developers to build and deploy next-generation agentic AI applications. Collaborate with premier startups, research labs, and ISVs to develop next-generation components of AI-native systems. By mastering NVIDIA’s core technologies (NIM, NeMo Framework, Dynamo, and NeMo Agent Toolkit), you will guide partners through the complexities of performance optimization and production-grade deployment. As a trusted advisor, you’ll transform raw LLM capabilities into high-performance, industry-focused enterprise agents. At NVIDIA, we work as a unified front. You will collaborate daily with our Account Managers, DevRel leads, and Marketing experts to turn bold AI visions into regional realities.

What you'll be doing:

Localize the future: Fine-tune LLMs to speak the authentic language of specific regions and industries.
Develop and optimize training and inference workflows with partners, and collaborate with internal NVIDIA development teams to improve our software stack.
Build sophisticated agentic systems featuring multi-agent coordination, long-horizon reasoning, and advanced planning frameworks.
Develop full-scale solutions, including domain-specific enterprise agents and high-performance retrieval pipelines (RAG) spanning various data sources.
Optimize inference performance by applying GPU-accelerated frameworks and the full NVIDIA AI infrastructure stack.
Build hands-on PoCs and reference architectures that serve as the blueprint for production-grade generative AI pipelines.
Partner with high-growth startups and Enterprise ISVs to embed NVIDIA’s software stack into their core platforms, slashing the time to market for production-grade AI.
Fuel partner innovation through hands-on developer enablement and thorough architectural reviews, turning sophisticated AI visions into production realities.
Scale global expertise by crafting reusable assets and documentation that help field teams deploy agentic AI at scale.

What we need to see:

BS/MS/PhD in Computer Science, Electrical Engineering, AI/ML, or equivalent experience.
5+ years of experience in deep learning, machine learning, or distributed AI systems.
Strong programming and debugging experience in Python, C/C++, and Linux environments.
Background in using deep learning libraries like PyTorch or TensorFlow.
Hands-on experience building LLM and generative AI applications.
Experience working with agentic or multi-agent AI systems using orchestration frameworks such as LangGraph, LlamaIndex, CrewAI, LangChain, or OpenAI Agents SDK.
Experience building tool-using AI agents that interact with APIs, databases, and enterprise systems.
Ability to rapidly prototype AI applications and build scalable GPU-accelerated architectures.
Excellent interpersonal skills and the ability to collaborate with engineering teams, partners, and executive stakeholders.

Ways to Stand Out from the Crowd:

Experience working with NVIDIA GPUs and AI software, such as NVIDIA NIM, NeMo Framework, NeMo Retriever, and NeMo Agent Toolkit.
Experience with LLM evaluation frameworks, benchmarking systems, and safety guardrails for agentic workflows.
Experience optimizing reasoning-focused LLMs through prompt engineering, quantization, or benchmarking.
Familiarity with Kubernetes/OpenShift, CI/CD automation, and cloud-native deployment patterns for AI systems.
Experience with parallel or distributed computing environments and AI workloads optimized for GPUs.