Job description

Description

NeuReality is seeking a Lead System Architect to join our system architecture team and help define NR-Nexus, our next-generation AI inference platform.

Responsibilities

Lead the software architecture and technical roadmap for NeuReality’s NR-Nexus.
Write system specifications for the NR-Nexus product.
Research AI infrastructure, SaaS platforms, model serving, and inference trends.
Work with engineering to translate technical capabilities into product value.
Work closely with engineering teams to optimize performance, scalability, and feature delivery.
Define performance goals and lead profiling, benchmarking, and optimization efforts for GenAI and distributed AI workloads.
Collaborate with customers, partners, and open-source communities to ensure ecosystem compatibility and adoption.
Mentor software engineers and provide technical leadership.

Requirements

7+ years of software engineering experience, including 3+ years in software architecture or technical leadership.
Strong experience with Kubernetes-based platforms and cloud-native architecture.
Deep understanding of GenAI/LLM infrastructure and distributed workloads.
Experience designing management software or SaaS platforms for production systems.
Strong background in distributed systems, microservices, APIs, and automation.
Hands-on experience with observability stacks, monitoring, logging, alerting, and SLA/SLO tracking.
Experience with CI/CD, deployment automation, upgrades, and rollback mechanisms.
Good understanding of security, authentication, authorization, and integration with customer data center environments.

Nice to have

Deep understanding of GenAI/LLM inference infrastructure, including model serving, scaling, batching, latency, throughput, and resource utilization.
Experience with production AI inference clusters using GPUs, AI accelerators, or other specialized compute infrastructure.
Understanding of how distributed inference systems operate, including scheduling, load balancing, autoscaling, failover, and cluster-level observability.
Experience with LLM serving frameworks such as vLLM, Triton Inference Server, TensorRT-LLM, or similar.
Familiarity with GPU/accelerator orchestration, device plugins, resource scheduling, and cluster capacity planning.
Familiarity with GPU communication technologies such as GPUDirect RDMA, NCCL, NVLink, or UALink.
Experience optimizing communication for distributed AI/ML workloads.
Knowledge of Prometheus, Grafana, OpenTelemetry, Helm, Argo CD, Istio, KServe, Kubeflow, or similar tools.
Experience deploying software in on-prem, edge, private cloud, or hybrid environments.

Lead SW Architect
