DevOps Engineer / MLOps at Social Discovery Group

Requirements:

  • Linux
  • Docker and Kubernetes
  • CI/CD with GitHub and Infrastructure as Code (Terraform / Ansible / Helm)

Roles & Responsibilities

  • Support and develop ML/LLM infrastructure and inference services in development and production
  • Build fault-tolerant, scalable architectures for high-load environments, including GPU resource management (A100/H100) and the CUDA/NVIDIA stack
  • Configure and maintain CI/CD pipelines for ML and backend solutions and implement Infrastructure as Code (Terraform/Ansible/Helm)
  • Collaborate with Data Science teams and .NET backend developers to deploy services and models to production

Job description

Social Discovery Group (SDG) is the 3rd largest social discovery company in the world, uniting 60+ brands with 500 million users. We solve the problems of loneliness, isolation, and disconnection by transforming virtual intimacy into the new normal. Our portfolio includes online communication platforms focusing on AI, game mechanics, and video streaming: Dating.com, DateMyAge, Cupid Media, Dil Mil, Kiseki, and others.

SDG invests in IT startups worldwide. Our investments include OpenAI, Patreon, Flo, Clubhouse, Woebot, Flure, Astry, Coursera, Academia.edu, and many others.

We bring together a team of like-minded people and IT professionals specialising in the creation and development of globally impactful social discovery products. Our international team of 1200 professionals and digital nomads works all over the world.

Our teams of digital nomads work remotely from Cyprus, Malta, the USA, Armenia, Georgia, Kazakhstan, Montenegro, Poland, Latvia, Serbia, Spain, Portugal, UAE, Israel, Turkey, Thailand, Indonesia, Japan, Hong Kong, Australia and many other locations.

In August 2024, we achieved Great Place to Work US Certification™! This achievement reflects our core belief that a truly exceptional workplace is built on trust, pride, and camaraderie—not just great perks.

We are looking for a Senior DevOps Engineer / MLOps Engineer.

Your main tasks will be:

    • Support and development of ML/LLM infrastructure in dev and prod;
    • Deployment and maintenance of inference services for ML models;
    • Building a fault-tolerant and scalable infrastructure for high-load environments;
    • Configuring and maintaining CI/CD for ML and backend solutions;
    • Working with GPU infrastructure: efficient resource utilisation, GPU isolation, and partitioning (A100/H100);
    • Collaborating with the DS team and backend developers (.NET) to deploy services (including models) to production.

    What we expect from you:

    • Linux
    • Docker
    • Kubernetes
    • CI/CD (GitHub)
    • IaC (Terraform / Ansible / Helm)
    • Experience with GPU infrastructure and the CUDA / NVIDIA stack
    • Understanding of how ML models and LLMs work
    • Monitoring and logging: Prometheus, Grafana, ELK / OpenSearch, or similar tools
    • Experience with AWS
    • Understanding of networking, fault tolerance, and scaling

      - Experience with GPU partitioning / MIG (A100/H100) is a major plus
      - Experience integrating with a .NET backend is a plus
      - Working knowledge of Python is a plus

    What we offer:

    • REMOTE OPPORTUNITY to work full-time;
    • 28 calendar days of vacation per year;
    • 7 wellness days off per year, which can be used to deal with household issues or to rest and recover without taking sick leave;
    • Bonuses of up to $5,000 for referring successful applicants to positions in the company;
    • 50% of the cost of professional training, international conferences and meetings covered;
    • Corporate discount for English lessons;
    • Health benefits. If you are not eligible for corporate medical insurance, the company will pay up to $1,000 gross per employee per year, according to the paychecks. This can be spent on buying health insurance yourself or on doctors’ fees for yourself and close relatives (spouse, children);
    • Workplace organisation. The company provides all employees with an equipped workspace and all necessary equipment (desk, chair, Wi-Fi, etc.) in our offices or co-working locations. In other locations, the company reimburses workplace costs of up to $1,000 gross once every 3 years, according to the paychecks. This money can be spent on renting a co-working space or on equipping a home workplace (desk, chair, Internet, etc.) during those 3 years;
    • Internal gamified gratitude system: receive bonuses from colleagues and exchange them for our merchandise, team building activities, massage certificates, etc.

    Sounds good? Join us now!
