Senior Infrastructure Engineer - SRE & DevOps

Remote: Full Remote

Offer summary

Qualifications:

  • 5+ years of experience in building and maintaining cloud infrastructure at scale, preferably in AWS or GCP.
  • Proficient in Python, Bash, Terraform, and Kubernetes.
  • Experience with distributed ML training jobs and managing compute clusters with 1,000+ GPUs is ideal.
  • Hands-on experience with physical hardware and datacenter management is a plus.

Key responsibilities:

  • Maintain and grow multi-cloud compute infrastructure for ML model training and drug discovery.
  • Develop configuration and procedures for monitoring, resource allocation, and deployment automation.
  • Enhance the orchestration scheduling framework to improve execution throughput and compute utilization.
  • Collaborate with the infrastructure team to support ongoing research and development efforts.

Genesis Therapeutics Biotech: Biology + Technology SME http://www.genesistherapeutics.ai/
11 - 50 Employees

Job description

Genesis Therapeutics is building a world-class computational team to solve problems in drug discovery through machine learning, biophysical simulation, and computational chemistry. We are looking for a senior infrastructure engineer who is excited to help develop new medicines and play a critical role in building out our AI platform.


You Will:

  • Work on our infrastructure team to maintain and grow our multi-cloud compute infrastructure that supports our ML model training, computational chemistry research, and ongoing drug discovery efforts
  • Build out our configuration and procedures for monitoring, resource allocation, and deployment automation, as we continue to grow our autoscaling compute clusters to handle larger workloads
  • Work on our orchestration scheduling framework to increase our execution throughput, reliability, and compute utilization across heterogeneous pipelines


You Are:

  • An engineer with 5+ years of experience building and maintaining cloud infrastructure at scale, e.g. within AWS or GCP
  • Proficient with Python, Bash, Terraform, and Kubernetes
  • Ideally, experienced in building and maintaining compute clusters of 1,000+ GPUs running distributed ML training jobs
  • Nice to have: hands-on experience with physical hardware and datacenter management


What We Offer:

  • The opportunity to work on high-impact infrastructure that is used to accelerate the discovery of new medicines
  • A world-class, tight-knit team of good-hearted people across software, machine learning, computational chemistry, medicinal chemistry, and biology
  • Competitive salary and equity, plus medical, dental, and vision insurance and a 401(k) program

Required profile

Experience

Industry :
Biotech: Biology + Technology
Spoken language(s):
English

Other Skills

  • Teamwork
  • Problem Solving
