Role overview

Qualifications

Strong production experience in TypeScript or Go, with the ability to move between both languages; surface is TypeScript and core is Go.
Hands-on experience with AI coding agents and harnesses, familiarity with models such as Claude Code, Codex, OpenCode, Cursor, and the ability to diagnose why agents get stuck.
End-to-end feature ownership: design, build, ship, measure, and iterate features in a harness-focused environment.
Experience with long-horizon autonomous systems: spec-driven prompting, multi-session task execution, state persistence across sessions, and context management.

Responsibilities

Build and evolve the agent harness: ship hooks, permission gates, and context compaction; ensure constraints trace back to failures you diagnosed.
Own long-horizon execution: multi-session task completion via spec-driven prompting, worktree-per-slice git, Ralph loop recovery, and stuck detection; target improved completion rates.
Architect planner/executor/evaluator pipelines: planning with a reasoning model, execution with a fast one, evaluation with a third; no self-verification.
Manage agent memory and context: state persistence across sessions, context compaction, tool-call offloading; zero context rot on multi-day work.

Key facts

Remote from: Greece, Estonia, Latvia, Ukraine, Hungary, Bulgaria, Poland, Romania, Slovakia, Slovenia, Croatia
Full time
Senior (5-10 years)
Artificial Intelligence Engineer
English

Other skills

Communication
Time Management
Teamwork
Problem Solving

About the company

CAST AI

Developer Tools & DevOps Platforms

Increase your profit margin without additional work. CAST AI cuts your cloud bill in half, automates DevOps tasks, and prevents downtime in one Autonomous Kubernetes platform.

Company details

Company typeScaleup

IndustryDeveloper Tools & DevOps Platforms

Company size51 - 200

Links

Website LinkedIn See all jobs

Your match analysis

See how your profile stacks up against this role.

We compared the job requirements to your profile to show where you're strong and where you fall short.

Job description

Why Kimchi?

Kimchi is the AI platform inside CAST AI. We started by helping companies run LLMs on their own Kubernetes clusters - now we're building the execution layer where agents do real work.

Our Infrastructure today: multi-model inference (MiniMax, Kimi, GLM-5, Nemotron, DeepSeek) with intelligent routing, an OpenAI-compatible API, and deployment flexibility from our GPUs to your VPC. The inference layer is the foundation. What we're hiring for sits on top of it: coding agents, agent runtimes, orchestration systems, and the reliability engineering that makes them actually finish things.

Tech Stack: TypeScript, Go, Kubernetes, AWS/GCP/Azure, MCP, Prometheus/Grafana/Loki, GitLab CI, ArgoCD.

Why harness engineering matters here
OpenAI and Anthropic ship models. They also ship one harness each - the scaffolding that turns a raw model into something that can plan, execute, recover, and complete work. We ship a different kind of harness: one built for cost-conscious, long-horizon autonomy, running on inference infrastructure we control end-to-end.
A decent model with a great harness beats a great model with a bad harness. We've watched this play out. The gap between what today's models can do and what you see them doing is largely a harness gap - and that gap is where we operate.

What you'll build
The ratchet.
Every time our agent makes a mistake, we engineer a solution so it never makes that mistake again. That means hooks that enforce constraints the model "knows" but forgets: pre-commit lint checks, permission gates, context compaction before the window fills. Success is silent, failures are verbose.

Long-horizon execution.
Our harness is built around spec-driven autonomy: meta-prompting, fresh context per task, worktree-per-slice git strategy, automatic replanning, crash recovery, stuck detection. We're implementing Ralph loops - when the model tries to exit, we intercept and reinject the goal into a fresh context. The agent reads state from disk and continues. Multi-session, multi-day work, without context rot.

Planner/executor splits.
Planning with a reasoning model, executing with a fast one, evaluating with a third. Separating generation from evaluation beats self-verification because agents reliably skew positive when grading their own work.

The harness surface.
CLI, TUI, MCP integration, sandboxed execution, telemetry. Our AGENTS.md is short - every line traces to a specific thing that went wrong. TypeScript on the surface, Go where it matters.

Memory and context.
Moving agents off laptops, giving them state that survives across sessions, managing context so information lands where it's actionable. Compaction, tool-call offloading, progressive skill disclosure.

What makes this different (with receipts)
You've seen the pitch: "we route to the best model." Everyone says that. Here's what we actually have:

GPU infrastructure we own. Not just an API reseller. From GPU placement across clouds to the inference endpoint your agent calls - we control the cost curve.
A harness-first thesis. We treat agent failures as configuration problems, not model problems. When we moved from a stock harness to our own, completion rates on internal benchmarks improved by 40%+ on the same model.
Agents.md that earns every line. No brainstormed rules - every constraint in our system prompt traces to a real failure we saw and fixed.

Requirements:

You've used AI coding agents in anger. Not demos - real work. You have opinions about Claude Code, Codex, OpenCode, Cursor. You know what it feels like when an agent gets stuck and why.
Strong TypeScript or Go in production. Comfort moving between them. Our surface is TypeScript; our core is Go.
You think in harness terms. You read "the agent hallucinated" and your first instinct is to ask what context it was missing, what hook should have caught it, what constraint should exist.
You drive features end-to-end. Design → build → ship → measure → iterate. We don't have layers that absorb ambiguity for you.

Responsibilities:

Build and evolve the agent harness - ship hooks, permission gates, and context compaction. Every AGENTS.md constraint traces to a failure you personally diagnosed.
Own long-horizon execution - multi-session task completion via spec-driven prompting, worktree-per-slice git, Ralph loop recovery, and stuck detection. Completion rate is your metric.
Architect planner/executor/evaluator pipelines - planning with a reasoning model, execution with a fast one, evaluation with a third. No self-verification.
Manage agent memory and context - state persistence across sessions, context compaction, tool-call offloading. Zero context rot on multi-day work.
Own the harness surface - CLI, TUI, MCP integrations, sandboxed execution, telemetry. TypeScript on the surface, Go where it matters.

What success looks like (after 6 months):

You've shipped at least one major harness feature end-to-end: designed it, built it, measured it, iterated.
You've added constraints to our AGENTS.md based on failures you personally observed and diagnosed.
You've improved a measurable reliability metric - completion rate, context efficiency, or cost per successful task.
You've formed strong opinions about where our harness is load-bearing and where it's dead weight.

What’s in it for you?

Competitive salary (€6,500 - €9,000 gross, depending on the level of experience).
Enjoy a flexible, remote-first global environment.
Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology
Equity options.
Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
Spend 10% of your work time on personal projects or self-improvement.
Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
Annual hackathon to spark new ideas and strengthen team bonds.
Team-building budget and company events to connect with your colleagues.
Equipment budget to ensure you have everything you need.
Extra days off to help maintain a healthy work-life balance.

This is a location-specific opportunity. We are currently accepting applications from candidates residing in the following European countries: Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia, and Ukraine.

*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.

#LI-Remote

Apply once. Then go straight to the hiring manager.

After you apply, unlock the direct contact details of the people who actually make the call. A quick follow-up makes you 5x more likely to land an interview.

Marcus Rivera

Chief Revenue Officer

m.rivera@company.com

linkedin.com/in/marcusrivera

Unlocked after you apply

Artificial Intelligence Engineer Related jobs

Greece Artificial Intelligence Engineer

Staff AI Engineer

2 days ago

TELUS Digital

Full time

Generative Artificial IntelligenceSoftware EngineeringPython (Programming Language)Django (Web Framework)Java (Programming Language)

Senior Software Engineer, AI Native Product Engineering

1 day ago

Xero

Full time

Amazon Web ServicesKubernetesDistributed ComputingC# (Programming Language).NET

Principal Engineer, AI Systems

2 days ago

Block

Full time

Artificial IntelligenceIntelligent AgentMachine LearningSoftware EngineeringSystem Optimization

AI Research Engineer

2 days ago

TELUS Digital

Full time

Python (Programming Language)Machine LearningNatural Language Processing (NLP)Transformer DesignData Management

Senior Software Engineer, AI-Native Platform (Remote)

2 days ago

ezCater

Full time

Back End (Software Engineering)Software EngineeringDistributed ComputingDatabase ManagementSystem Administration

Other jobs at CAST AI

Technical Account Manager | USA

Today

CAST AI

Full time
Mid-level (2-5 years)

KubernetesCloud ComputingCKANAmazon Web ServicesMicrosoft Azure

Senior Software Engineer

30+ days ago

CAST AI

Full time
Senior (5-10 years)

gRPCAWS Cloud ServicesGo (Programming Language)KubernetesArgo CD

Growth Performance Engineer

14 days ago

CAST AI

Full time

Media StrategySystem OptimizationCampaign ManagementGoogle AdsLinkedIn Recruiter

Senior AI Engineer - Harness Engineering (Kimchi)

Role overview

Qualifications

Responsibilities

Key facts

Hard skills

Other skills

About the company

Company details

Links

Your match analysis

Job description

Why Kimchi?

Requirements:

Responsibilities:

What’s in it for you?

Apply once. Then go straight to the hiring manager.

Artificial Intelligence Engineer Related jobs

Staff AI Engineer

Senior Software Engineer, AI Native Product Engineering

Principal Engineer, AI Systems

AI Research Engineer

Senior Software Engineer, AI-Native Platform (Remote)

Other jobs at CAST AI

Technical Account Manager | USA

Senior Software Engineer

Growth Performance Engineer

Reach out to the hiring manager directly.