Key Facts

Full time

Senior (5-10 years)

English

Hard Skills

Storage Architecture Distributed Computing Throughput Performance Systems Analysis Inference Engine High Performance Computing Systems Architecture Computer Data Storage Operational Efficiency Memory Management

Other Skills

• Problem Solving

Roles & Responsibilities

• Build and optimize LLM serving and inference systems for production environments.
• Improve performance across GPU and CPU pathways.
• Design and scale systems that support RAG and retrieval-heavy AI workloads.
• Work on KV cache, memory, storage, and throughput bottlenecks.

Requirements:

• Meaningful experience building or optimizing production AI systems, not just experimenting with models.
• Deep hands-on experience close to the systems layer, such as improving workloads across GPU and CPU resources, reducing bottlenecks, or tuning infrastructure for throughput and latency.
• Strong understanding of how inference performance is shaped by the interaction between compute, memory, storage, and serving architecture.
• Proven ownership in areas like model serving, retrieval, caching, storage, or distributed performance, rather than purely application-layer AI work.

DDN Storage

Information Technology & Services

About DDN Storage

DDN is the world’s largest private data storage company and the leading provider of intelligent technology and infrastructure solutions for Enterprise At Scale, AI and analytics, HPC, government, and academia customers. Through its DDN and Tintri divisions, the company delivers AI and data management software and hardware solutions, as well as unified analytics frameworks, to solve complex business challenges for data-intensive, global organizations. DDN provides its enterprise customers with the most flexible, efficient, and reliable data storage solutions for on-premises and multi-cloud environments at any scale. Over the last two decades, DDN has established itself as the data management provider of choice for over 11,000 enterprise, government, and public-sector customers, including many of the world’s leading financial services firms, life science organizations, manufacturing and energy companies, research facilities, and web and cloud service providers.

Company type: Scaleup

Industry: Information Technology & Services

Founded: 2018

Company size: 501 - 1000

Job description

Overview:

Build the AI infrastructure layer that determines whether modern models actually work in production.

Most AI roles sit at the application layer. This one does not.

At DDN, we’re hiring an AI Engineer to work on the hard part of AI: the systems, storage, and performance infrastructure behind real-world model serving and inference. This is the role for engineers who care about what happens under load, at scale, and in production — not just in demos.

If your background sits at the intersection of AI infrastructure, distributed systems, and performance engineering, this is the kind of role where your depth will matter.

What you’ll do

• Build and optimize LLM serving and inference systems for production environments
• Improve performance across GPU and CPU pathways
• Work on KV cache, memory, storage, and throughput bottlenecks
• Design and scale systems that support RAG and retrieval-heavy AI workloads
• Contribute to infrastructure where storage architecture and systems efficiency materially affect AI performance
• Solve engineering problems at the intersection of AI, high-performance systems, and distributed infrastructure

What we’re looking for

• An engineer who has spent meaningful time building or optimizing production AI systems, not just experimenting with models
• Someone who understands how inference performance is shaped by the interaction between compute, memory, storage, and serving architecture
• Deep hands-on experience working close to the systems layer, for example improving how workloads run across GPU and CPU resources, reducing bottlenecks, or tuning infrastructure for better throughput and latency
• Evidence of real ownership in areas like model serving, retrieval, caching, storage, or distributed performance, rather than purely application-layer AI work
• The ability to move comfortably between architecture decisions and hands-on implementation, especially in environments where efficiency and scale matter
• A background that suggests you can operate in technically demanding environments, whether that comes from AI infrastructure, high-performance systems, storage platforms, or adjacent distributed systems work
• PhD preferred, but far less important than having built serious systems in the real world

Why this role is compelling

This is not a “prompt engineering” job.
This is not an “AI wrapper” job.
This is not a generic backend role with AI sprinkled on top.
This is a chance to work on the infrastructure that determines whether modern AI systems are fast, scalable, efficient, and commercially viable.
If you want to work on the real mechanics of AI performance — serving, retrieval, compute efficiency, memory behavior, storage architecture, and inference at scale — this is where that work happens.

Who will love this role

• Engineers who enjoy deep systems problems
• Builders who care about performance, scale, and architecture
• People who want to work where AI meets infrastructure
• Candidates who would rather solve hard technical bottlenecks than ship surface-level AI features

Who should not apply

This role is not for:

• Purely academic researchers without meaningful production ownership
• Generic software engineers without clear AI systems or inference depth
• Candidates focused mainly on prompt engineering or lightweight application integrations
• MLOps generalists who have not worked deeply on serving, storage, or performance-critical AI systems

Salary Range: $150,000 - $250,000

Why DDN

DDN has deep credibility in high-performance infrastructure, and this role sits in a part of the market where that foundation matters. If you want to build the systems serious AI depends on, rather than the layer that merely sits on top of them, this is a rare opportunity to do exactly that.

Apply if you want to build the infrastructure behind production AI — not just consume it.
