Offer summary

Qualifications:

5+ years in software engineering, with 3+ years architecting large-scale backend systems.

4+ years designing, deploying and monitoring AI/ML systems in production.

Deep expertise in large-language-model serving, MoE routing, or similar AI/ML technologies.

Hands-on experience with Kubernetes, Docker, CI/CD, and distributed data technologies.

Key responsibilities:

Design and implement end-to-end pipelines for model-powered applications.

Own versioning, lineage, and policy gating of models and tools.

Collaborate with product managers to translate customer needs into technical solutions.

Mentor a cross-functional team of engineers and data scientists.

Job description

As AI Expert, you will be the technical owner of everything model-powered inside the company:

Architecture – Design the end-to-end pipeline that ingests org context, routes to the right expert model, executes code in sandboxed containers, and feeds rich telemetry back into our continuous-learning loop.
Model Strategy – Decide when we fine-tune open-source Llama-3 vs. hot-swap to Bedrock or Vertex; benchmark MoE routers for latency and cost; champion vLLM/Triton for GPU efficiency.
MLOps at Scale – Own versioning, lineage, policy gating and roll-back of models and in-line tools. Ship deterministic, reproducible releases that DevSecOps trusts.
Tooling & Integrations – Work with backend and platform leads to expose new model endpoints through our Model Context Protocol (MCP) so agents can compose actions across GitHub, Jira, Terraform, Prometheus and more — without one-off plugins.
Thought Leadership – Partner with the CTO on the technical roadmap, publish internal RFCs, mentor engineers and evangelize best practices across the company and open-source community.

What You’ll Do

Craft cloud-native, micro-service architectures for training, fine-tuning and real-time inference (AWS/GCP/Azure, Kubernetes, JetStream).
Define SLOs for p95 agent latency, model success rate, and telemetry coverage; instrument with OTEL, Prometheus and custom reward models.
Drive our continuous-learning loop: reward modelling, ContextGraph enrichment, auto-tuning MoE routers.
Embed least-privilege IAM and OPA/ABAC policy checks into every stage of the model lifecycle.
Collaborate with product managers to translate customer pain into roadmap items and with design partners to validate solutions in production.
Mentor a cross-functional squad of backend engineers, ML engineers and data scientists.

What You’ll Bring

5+ years in software engineering, with 3+ years architecting large-scale backend systems (Python, Go, Java or similar).
4+ years designing, deploying and monitoring AI/ML systems in production.
Deep expertise in at least one of: large-language-model serving, MoE routing, RLHF, vector search, streaming inference.
Hands-on fluency with Kubernetes, Docker, CI/CD, IaC (Terraform/Helm) and distributed data technologies (Kafka, Spark, Arrow).
Proven MLOps track record (MLflow, Kubeflow, SageMaker, or similar) and a security-first mindset.
Ability to turn ambiguous business goals into a crisp, scalable architecture — and to communicate that vision to both executives and engineers.
Excellent English communication skills.

Nice-to-Haves

PhD or publications in ML/NLP/Systems.
Contributions to open-source LLM or MLOps projects.
Experience pushing real-time inference to the edge or FPGA/ASIC accelerators.
Prior leadership of cross-functional AI/ML teams in a fast-growing startup environment.

The Way We Work

We value clarity, ownership, and velocity. You’ll have direct access to the CTO, autonomy to choose the right tech, and a front-row seat as we redefine how enterprises move “from prompt to production.” If building the Kubernetes of AI-driven operations excites you, let’s talk.

Required profile