Astreya

AI / ML Engineer II

Key Qualifications

  • 4+ years of experience in IT services, with at least 3 years in IT Service Automation (orchestration, scripting, and process assessment) and AI Engineering (developing and deploying AI agents and cutting-edge GenAI/ML solutions to address complex business challenges).
  • Proficiency in at least two scripting languages (PowerShell, Python, or Shell Script) and containerization technologies (Docker, Kubernetes) for deploying AI models.
  • Strong understanding of ITIL processes with practical experience in monitoring, incident management, change management, and maintenance.

Job description

Responsibilities:

  • Contribute to the roadmap for AI-driven automation, aligning it with strategic company and client goals.

  • Collaborate with engineering and data teams to design and architect scalable, robust, and innovative AI solutions (e.g., automated network diagnostics, bot recommendation systems, AI agents for NOC operations).

  • Act as the Solution Owner in an Agile/Scrum environment, managing the product backlog, writing detailed user stories, defining acceptance criteria, and prioritizing features.

  • Lead the evaluation, fine-tuning, and integration of open-source Large Language Models (LLMs), such as the Llama series.

  • Develop and own key components of the automation framework, including PowerShell and Python runbook executors.

  • Develop and train machine learning models for tasks such as anomaly detection, predictive maintenance, and capacity planning in IT environments.

  • Work with large datasets of IT operational data, performing data cleaning, feature engineering, and data analysis to improve model accuracy and performance.

  • Contribute to the development of internal automation tools and frameworks.

  • Deliver continuous service improvements by proactively identifying opportunities for process enhancements.

  • Develop AI/GenAI point solutions to meet business needs.

  • Troubleshoot and resolve issues related to automation systems and AI models.

  • Engage with customers, IT operators, Network Engineers, and internal stakeholders to gather requirements, validate solutions, and ensure product-market fit.

  • Serve as the Subject Matter Expert (SME) on AIOps, IT Process Automation (ITPA), and Runbook Automation (RBA), providing technical guidance and insights.

  • Document automation processes, code, and models clearly and concisely.

  • Actively contribute to team results and work towards achieving team goals and objectives.

  • Undertake designated skill/knowledge development within the organization, including training of next-level team members.

Required Skills & Competencies:

  • 4+ years of experience in IT services, with at least 3 years in:

    • IT Service Automation – Orchestration, Scripting, & Process Assessment.

    • AI Engineering – Developing and deploying AI Agents and cutting-edge GenAI & ML solutions to address complex business challenges.

  • Expertise in integrating various tools and building analytics/insights.

  • Proficiency in at least two scripting languages such as PowerShell, Python, or Shell Script.

  • Strong foundational knowledge across core IT domains, with a specific emphasis on Network & Network Services, including understanding network topologies, protocols, and common operational issues.

  • Demonstrable experience implementing solutions using state-of-the-art LLMs, with hands-on experience with both open-source models (e.g., Llama series) and proprietary models (e.g., GPT-4).

  • Proficiency with a major deep learning framework, preferably PyTorch, for model experimentation and fine-tuning.

  • Proficiency in a high-level programming language (Java, Python) and experience with containerization technologies (Docker, Kubernetes) for deploying AI models.

  • Solid understanding and practical experience with cloud platforms such as Azure or GCP, including knowledge of their networking services (e.g., VNets, VPCs).

  • Experience with AI/ML libraries and frameworks.

  • A strong understanding of ITIL processes in one or more service lifecycle or service capability modules.

  • Knowledge of one or more system administration activities, such as monitoring, service requests, incident management, change management, and maintenance.

  • Excellent communication skills with an ability to explain concepts and solutions clearly and concisely.

  • Experience in training and mentoring team members.

  • Excellent analytical and problem-solving skills.
