
Sr. AI Infrastructure Engineer

Requirements:

  • 5+ years of senior-level software engineering experience, with hands-on work on AI infrastructure and production systems
  • Strong debugging skills in complex, distributed systems and ability to identify root causes in ambiguous failures
  • Proficiency with Python (5+ years) and experience operating microservices, Docker, Kubernetes, and Helm (5+ years)
  • Experience with CI/CD pipelines (e.g., CircleCI) and observability, including metrics, logs, tracing, and profiling

Roles & Responsibilities

  • Maintain, develop, and stabilize AI infrastructure services
  • Architect and implement foundational services and shared libraries that scale the AI infrastructure ecosystem
  • Debug complex, non-obvious production issues across application, process, network, and memory layers; isolate and resolve failures
  • Improve observability, profiling, and tracing across services; operate systems running on Docker, Kubernetes, and Helm; support CI/CD pipelines and testing

Job description

Senior Software Engineer – AI Infrastructure

 

About Arango

At Arango, we believe the first generation of enterprise AI missed something essential: context. LLMs are powerful, but they lacked the context needed to deliver accurate answers.

Arango provides a trusted data foundation for the next wave of Enterprise AI with graph-based Contextual AI, transforming enterprise data into a System of Context that truly represents the business so that LLMs can deliver better outcomes with unlimited scale and cost efficiency.

The Arango AI Data Platform gives developers a single, integrated environment to build and scale AI-powered applications without the complexity of stitching together multiple databases and tools. At its core is a massively scalable multi-model database that unifies graph, vector, document, and key-value data with full-text, geospatial, and vector search — creating the System of Context, the bridge between enterprise data and LLMs.

We’re a global team based in California and Cologne, united by curiosity, collaboration, and a passion for helping developers, data engineers, and technology leaders innovate faster and smarter with AI. Trusted by NVIDIA, HPE, the London Stock Exchange, the U.S. Air Force, NIH, and Articul8, Arango powers enterprise AI with context, confidence, and scale. We are a proud member of the NVIDIA Inception Program and the AWS ISV Accelerate Program. 

If you’re excited about shaping the future of Contextual AI, come build with us.

 

Location

** Only candidates in Europe will be considered. **

 

About the Role

We are looking for a senior, hands-on software engineer to help maintain, stabilize, and debug our AI infrastructure. This role has high ownership and requires deep technical problem-solving skills in production environments. The most important qualification is not a specific language, but a strong understanding of how complex software systems actually behave in production.

 

Key Responsibilities:

  • Maintain, develop, and stabilize AI infrastructure services
  • Architect and implement foundational services and shared libraries that scale our entire AI infrastructure ecosystem
  • Debug complex, non-obvious production issues across application, process, network, and memory layers
  • Systematically analyze, isolate, and resolve failures in unclear scenarios
  • Improve observability, profiling, and tracing across services
  • Work with distributed microservices and internal platforms
  • Operate and improve systems running on Docker, Kubernetes, and Helm
  • Support CI/CD pipelines (CircleCI), testing, and security-relevant components
  • Work with MLflow, Triton, and distributed Python modules
 

Core Requirements:

  • Senior-level experience (5+ years)
  • Passion for working with cutting-edge technologies in a fast-moving AI environment, including LLM-based workflows and pipelines
  • Comfortable working independently in a fast-paced, evolving environment
  • Strong debugging skills in complex, distributed systems with the ability to identify root causes when failures are ambiguous
  • Solid understanding of how software behaves in production environments
  • Experience designing and operating microservices, distributed systems, and databases
  • Proven experience building and scaling high-availability services
     

Core Technology Stack

  • Python (primary language) - 5+ years
  • Docker, Kubernetes, Helm - 5+ years
  • CI/CD pipelines and testing frameworks (e.g., CircleCI)
  • Observability tools: metrics, logs, tracing, and profiling
  • Exposure to AI/ML infrastructure, including tools such as MLflow and Triton

Nice to Have:

  • Customer-facing experience, including proofs of concept (PoCs) or technical demos
  • Experience with AI/ML infrastructure and orchestration, e.g., MLflow and Triton
  • Cross-language debugging experience
  • Familiarity with Rust
  • Experience working with NoSQL, multi-model, or graph databases
  • Knowledge of Retrieval-Augmented Generation (RAG) and GraphRAG concepts
  • Understanding of graph algorithms and graph-based data modeling
 

Why Join ArangoDB

Our headquarters is in San Francisco (US) and we have an office in Cologne (Germany), but most of our diverse team works remotely worldwide. So, do you prefer your desk at home or do you want to join us at one of our locations? Your choice.

The Arango team comes from 5 continents and more than 20 countries. Diverse backgrounds enable us to see new solutions. We invite people from every culture, national origin, religion, sexual orientation, gender identity or expression, and of every age to apply to our positions. All employment decisions are based on business needs, job requirements, and individual qualifications. Arango is committed to a workplace free of discrimination and harassment based on any of these characteristics. We love this diversity and encourage everyone curious and visionary to join the multi-model movement.



 
