Georgia IT, Inc.

Machine Learning Engineer/SRE - 100% Remote

Requirements

  • Azure infrastructure experience (VMs, storage, networking) to support AI model development and deployment
  • CI/CD pipeline experience with automation of model deployment processes
  • Containerization in the cloud (Docker and Kubernetes) for efficient deployment and scaling of ML models
  • Machine Learning expertise: building and optimizing models with various algorithms and frameworks; programming skills in Python (TensorFlow, PyTorch)

Roles & Responsibilities

  • Manage Azure Infrastructure: Configure, maintain, and optimize Azure infrastructure for AI model development and deployment, ensuring scalability and performance
  • Model Performance Monitoring: Implement and maintain monitoring systems to track model performance, proactively identifying and addressing issues
  • Incident Response: Collaborate with the SRE team to respond promptly to outages and incidents related to model operations, ensuring minimal downtime and rapid issue resolution

Job description


Role: Machine Learning Engineer/SRE
Location: Chicago, IL or 100% Remote
Duration: 12 Months
Rate: DOE

US citizens and green card holders are preferred. No third-party corp-to-corp.

Job Description:
We are seeking a highly skilled and motivated Machine Learning Engineer who possesses expertise in developing, deploying, and managing machine learning models. In this role, you will be an integral part of our AI Engineering and Site Reliability Engineering (SRE) teams, responsible for managing Azure infrastructure for AI model development and deployment, monitoring and reporting model performance, and responding to outages/incidents related to model operations.
Key Responsibilities:
  • Manage Azure Infrastructure: Configure, maintain, and optimize Azure infrastructure for AI model development and deployment, ensuring scalability and performance.
  • Model Performance Monitoring: Implement and maintain monitoring systems to track model performance, proactively identifying and addressing issues as they arise.
  • Incident Response: Collaborate with the SRE team to respond promptly to outages and incidents related to model operations, ensuring minimal downtime and rapid issue resolution.
Skills and Qualifications:
  • Azure Infrastructure Experience: Proficiency in managing Azure infrastructure components, including virtual machines, storage, and networking, to support AI model development and deployment.
  • CI/CD Pipeline Experience: Experience with Continuous Integration/Continuous Deployment (CI/CD) pipelines, including the automation of model deployment processes.
  • Containerization in the Cloud: Strong knowledge of containerization technologies in the cloud, such as Docker and Kubernetes, for efficient deployment and scaling of machine learning models.
  • Machine Learning Expertise: Proficient in building and optimizing machine learning models, with a deep understanding of various ML algorithms and frameworks.
  • Programming Skills: Proficiency in programming languages commonly used in machine learning, such as Python, and in libraries like TensorFlow and PyTorch.
  • Data Management: Experience in data preprocessing, feature engineering, and data pipeline development for machine learning.
  • Collaborative Team Player: Excellent communication skills and the ability to work collaboratively with cross-functional teams, including AI engineers and SREs.
  • Documentation: Effective documentation skills to maintain clear and organized records of models, infrastructure configurations, and incident responses.
Preferred Qualifications:
  • Experience with cloud-based machine learning platforms (e.g., Azure Machine Learning).
  • Experience with CI/CD tools for deploying ML services and applications on the Azure cloud platform.
  • Familiarity with DevOps practices and tools for automating infrastructure and deployments.
  • Knowledge of model versioning and model management tools.
  • Understanding of security best practices in AI model deployment.
  • Certifications in relevant areas, such as Azure certifications or machine learning certifications.


Job titles of people with these skills may vary, e.g. MLOps Lead, MLOps Solution/Delivery Architect, or Senior ML Engineer.
