Senior MLOps Engineer

UNLIMITED HOLIDAYS - WORK FROM ANYWHERE - FULLY FLEXIBLE
Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

  • 4+ years of DevOps experience
  • 1+ year managing AI/ML cloud infrastructure
  • Experience with AWS, Python, Docker, and Kubernetes
  • Knowledge of CI/CD tools such as Jenkins and ArgoCD
  • Familiarity with logging and monitoring tools

Key responsibilities:

  • Design, build, and maintain scalable AI/ML infrastructure
  • Develop and automate CI/CD pipelines for deployment
  • Evaluate new tools to enhance MLOps processes
  • Manage Continuous Deployment using Kubernetes, ArgoCD, Jenkins
  • Monitor and troubleshoot AI/ML systems for performance
Tala
https://tala.co/
Fintech: Finance + Technology
Large (501 - 1000 Employees)
HQ: Los Angeles

Job description

Your missions

The Role
We are currently seeking a Senior Cloud Infrastructure Engineer with MLOps experience to design, implement, and maintain suitable infrastructure and deployment best practices for ML pipelines and models. You will bring expertise in machine learning, AI infrastructure, and automation, along with knowledge of AWS cloud infrastructure and DevOps practices.

What You'll Do
  • Design, build, and maintain scalable and robust infrastructure for AI/ML (Artificial Intelligence / Machine Learning) systems, including cloud-based environments, containerization, and orchestration platforms
  • Develop and implement CI/CD pipelines to automate the deployment, testing, and monitoring of AI/ML models and applications
  • Evaluate and integrate new tools, technologies, and frameworks to improve the efficiency and effectiveness of our MLOps processes
  • Design and manage continuous deployment using Kubernetes, ArgoCD, and Jenkins
  • Maintain the related container registry and model registry
  • Monitor infrastructure utilization and costs pertaining to model training, inference, and GPU utilization
  • Monitor and troubleshoot AI/ML systems to ensure high availability, performance, and reliability

What You'll Need
  • 4+ years of experience as a DevOps Engineer
  • 1+ year of experience managing AI/ML infrastructure in public cloud environments
  • In-depth hands-on experience with at least one public cloud platform, preferably AWS
  • Experience with Python or any other programming language
  • Experience with Docker and Kubernetes in production
  • Experience with Continuous Deployment tools such as Jenkins or ArgoCD
  • Experience with SaaS logging and monitoring tools such as Sumo Logic, Splunk, or Datadog
  • Proficiency in English

Required profile

Level of experience: Mid-level (2-5 years)
Industry: Fintech: Finance + Technology
Spoken language(s): English