Key Facts

Remote From: Anywhere

Full time

Senior (5-10 years)

English

Other Skills

• Calmness Under Pressure
• Collaboration
• Communication
• Teamwork

Netomi

About Netomi

Netomi helps companies deliver higher quality customer experiences at scale with AI. If customer service is not a priority, you don't have to read any further. Warner Bros, WestJet, and HP trust Netomi's AI-first customer service platform to deliver eye-popping results while significantly reducing cost.

Netomi is building the first Relationship Operating System. Its three main benefits for companies are:

- Industry-leading resolution rates (automatically resolves 80% of routine customer service inquiries)
- Improved resolution time
- Increased customer satisfaction and support quality

People love our patented, no-code platform, which works across messaging, chat, email, and voice. The platform also understands 100+ languages. ¡Qué bueno!

Netomi is based in San Francisco and has offices in Toronto, New York, and India. Want to work from your home office? We have remote options as well! We can't hire fast enough. Join this incredible team today!

Company type: Scaleup

Founded: 2018

Company size: 51 - 200

Job description

About the Company:

Netomi is the leading agentic AI platform for enterprise customer experience. We work with the largest global brands, including Delta Air Lines, MetLife, MGM, and United, to enable agentic automation at scale across the entire customer journey. Our no-code platform delivers the fastest time to market, lowest total cost of ownership, and simple, scalable management of AI agents for any CX use case. Backed by WndrCo, Y Combinator, and Index Ventures, we help enterprises drive efficiency, lower costs, and deliver higher quality customer experiences.

Want to be part of the AI revolution and transform how the world’s largest global brands do business? Join us!

About the role

We are looking for a proactive Incident Manager to own end-to-end incident response across our AI and platform stack. You will ensure rapid detection, triage, communication, and resolution of incidents impacting customers and internal systems.

Responsibilities

Own the incident lifecycle: detection, triage, escalation, resolution, and postmortems

Act as the central command during major incidents (war rooms, stakeholder updates)

Define and enforce SLAs/SLOs, incident severity frameworks, and runbooks

Collaborate with Engineering, ML, and Integrations teams to resolve issues quickly

Monitor system health across integrations (agent desks, LLMs, ASR/TTS pipelines)

Drive root cause analysis (RCA) and preventive actions

Improve observability, alerting, and incident tooling

Maintain clear internal and customer-facing communication during incidents

Requirements

3–6 years in Incident Management / SRE / Production Support roles

Strong understanding of distributed systems, APIs, and cloud environments (AWS)

Experience with observability tools (e.g., Datadog)

Familiarity with AI/ML systems, especially LLM integrations and voice stacks (ASR/TTS), is a plus

Experience with monitoring/tracing tools such as Langfuse

Excellent communication and stakeholder management skills

Ability to stay calm under pressure and drive structured resolution

Nice to Have

Exposure to OpenAI or similar LLM platforms

Experience supporting customer-facing SaaS products

Automation mindset (runbooks, alert tuning, incident tooling)

Netomi is an equal opportunity employer committed to diversity in the workplace. We evaluate qualified applicants without regard to race, color, religion, sex, sexual orientation, disability, veteran status, and other protected characteristics.
