Lingaro

AI Ops Technical Leader

Requirements

  • Strong background in data engineering, AI/ML, or operations support technology with hands-on technical leadership in operations, IT, or service environments
  • Proven track record delivering production AI/ML/data solutions that reduce MTTR/MTTD, improve availability, and increase ticket deflection
  • Hands-on expertise with Python, Spark, Kafka, Airflow, cloud data platforms, PyTorch/TensorFlow, LLMs, and integrations with ServiceNow, PagerDuty, Splunk, Datadog, Moogsoft, BigPanda, Databricks, and Azure/ADF
  • Deep knowledge of AIOps practices including event correlation, anomaly detection, predictive analytics, automated actions, and GenAI for operations; plus experience designing or enhancing AIOps platforms

Roles & Responsibilities:

  • Lead and contribute to high-impact data and AI initiatives that improve operations support outcomes, including real-time incident enrichment, automated root-cause analysis, predictive alerting, ticket clustering and auto-triage, change risk scoring, knowledge mining, and intelligent runbooks
  • Design and deliver scalable AI-enabled features embedded into operations support platforms such as ServiceNow, Jira Service Management, monitoring/observability tools, and ITSM systems
  • Lead architecture, development, and continuous improvement of internal AIOps platforms and reusable components; integrate AIOps with ITSM, observability, ticketing, and automation frameworks; apply MLOps best practices for production environments
  • Serve as the principal AI technical authority for operations support transformation programs across service operations, NOC, support desks, infrastructure operations, and reliability engineering; drive high-value AI use cases and vendor evaluations

Job description

As AI Ops Technical Leader, you drive the intelligent transformation of operations support. This player-coach role combines hands-on technical delivery, team leadership, and AI architecture governance to achieve operational excellence. You apply deep technical expertise and strategic leadership to design, build, and evolve AI and data solutions that improve incident management, major incident response, problem management, change enablement, service desk support, observability, and overall operational resilience.
 
What You'll Be Doing:

Hands-on Data & AI Solutions for Operations Support
·       Lead and contribute to high-impact data and AI initiatives that improve operations support outcomes, including real-time incident enrichment, automated root‑cause analysis, predictive alerting, ticket clustering and auto-triage, change risk scoring, knowledge mining, and intelligent runbooks.
·       Design and deliver scalable AI-enabled features embedded into operations support platforms such as ServiceNow, Jira Service Management, monitoring/observability tools, and ITSM systems.
·       Ensure all solutions meet strict operational SLAs for reliability, low latency, auditability, explainability, and zero-downtime deployment.
·       Stay up to date with emerging AIOps tools, research, and trends, and apply them to enhance operations support.
 
AIOps Tools & Platform Leadership
·       Lead the architecture, development, and continuous improvement of internal AIOps platforms and reusable components supporting operations teams.
·       Integrate AIOps tools with ITSM systems, observability platforms (Prometheus, Grafana, ELK, Dynatrace, Splunk), ticketing systems, and automation frameworks.
·       Apply MLOps/AIOps best practices tailored to production environments: model monitoring, drift detection, automated rollback, performance checks, and cost optimization at scale.
 
AI Technical Leadership for Operations Support Initiatives
·       Serve as the principal AI technical authority for operations support transformation programs across service operations, NOC, support desks, infrastructure operations, and reliability engineering.
·       Lead technical discussions, architecture reviews, proof of concepts, vendor evaluations, and solution selection involving AI for operations.
·       Identify, prioritize, and drive high‑value AI use cases focused on reducing MTTR/MTTD, automating L1 triage, predicting major incidents, generating post‑mortems, optimizing shift handovers, and enabling proactive operations.
 
Team & People Leadership
·       Build, mentor, and lead a high-performing squad of AIOps specialists focused on measurable operations support improvements.
·       Foster a culture of experimentation, production‑first thinking, and commitment to operational impact—reduced toil, faster resolution, and higher availability.
·       Provide technical coaching, conduct design/code reviews, and guide career development with emphasis on operations and support domain expertise.
 
Stakeholder & Cross-Functional Collaboration
·       Work closely with operations support leaders, incident managers, service owners, reliability engineers, ITSM teams, infrastructure groups, and other stakeholders to align AI solutions with operational needs.
·       Collaborate deeply with DS&AI Competency teams to ensure high-quality, scalable, and sustainable AI delivery.
 
 
What We’re Looking For:
·       Strong background in data engineering, AI/ML, or operations support technology, including technical leadership in operations, IT, or service environments.
·       Proven track record delivering production AI/ML/data solutions that improve MTTR, MTTD, availability, and ticket deflection.
·       Hands-on expertise with Python, Spark, Kafka, Airflow, cloud data platforms, PyTorch/TensorFlow, LLMs, and integrations with tools like ServiceNow, PagerDuty, Splunk, Datadog, Moogsoft, BigPanda, Databricks, and Azure/ADF.
·       Deep knowledge of AIOps practices including event correlation, anomaly detection, predictive analytics, automated actions, and GenAI for operations.
·       Experience designing, building, or enhancing AIOps and internal tooling platforms.
·       Familiarity with ITIL processes (incident, problem, change, service request, knowledge management).
·       Experience with GenAI/LLM applications for operations such as copilots, auto-remediation, knowledge search, and alert/incident summarization.
·       Proven ability to scale AIOps in large operations or NOC environments while balancing hands-on work with strategy.
·       Strong communication skills, with the ability to translate complex AI concepts for operations teams and executives and to keep the focus on action and automation that reduce operational toil.
 
