Distributed Training Engineer

Requirements

  • Experience training on clusters with ≥5,000 GPUs
  • Knowledge of 5D parallel LLM training
  • Familiarity with distributed training frameworks such as Megatron-LM, FSDP, DeepSpeed, and TorchTitan
  • Experience optimizing training throughput for large-scale Mixture-of-Experts models

Roles & Responsibilities

  • Optimize, operate, and develop large-scale distributed LLM training systems
  • Work closely with researchers to bring up, debug, and maintain mid-training and reinforcement learning workflows
  • Build tools and directly support frontier-scale experiments
  • Contribute to open-source large-scale LLM training frameworks

Job description

About Periodic Labs

We are an AI + physical sciences lab building state-of-the-art models to make novel scientific discoveries. We are well funded and growing rapidly. Team members are owners who identify and solve problems without boundaries or bureaucracy. We eagerly learn new tools and new science to push forward our mission.

About the role

You will optimize, operate, and develop large-scale distributed LLM training systems that power AI scientific research. You will work closely with researchers to bring up, debug, and maintain mid-training and reinforcement learning workflows. You will build tools and directly support frontier-scale experiments to make Periodic Labs the world's best AI + science lab for physicists, computational materials scientists, AI researchers, and engineers. You will contribute to open-source large-scale LLM training frameworks.

You might thrive in this role if you have experience with:

  • Training on clusters with ≥5,000 GPUs

  • 5D parallel LLM training

  • Distributed training frameworks such as Megatron-LM, FSDP, DeepSpeed, and TorchTitan

  • Optimizing training throughput for large-scale Mixture-of-Experts models