
Principal ML Engineer – Structural Biology

Remote: 
Full Remote

Offer summary

Qualifications:

  • PhD or equivalent experience in ML, computational biology, or structural biology.
  • Deep experience in building and training transformer-based models using frameworks like PyTorch.
  • Understanding of data challenges in structural biology and ability to design scalable workflows.
  • Experience with MLOps tools and infrastructure, including Docker and Kubernetes.

Key responsibilities:

  • Drive the technical approach for ML applications in structural biology.
  • Design and implement model extensions for protein structure prediction tasks.
  • Collaborate with customers to define data preprocessing and benchmarking strategies.
  • Build and maintain scalable, production-ready ML systems for drug discovery.

Apheris Computer Hardware & Networking Startup https://www.apheris.com/
11 - 50 Employees

Job description

About the role
At Apheris, we power federated data networks in life sciences to address the data bottleneck in training highly performant ML models. Publicly available molecular datasets are insufficient to train high-quality ML models that meet industry requirements. Our product addresses this by hosting networks where biopharma organizations collaboratively train higher-quality models on their combined data. The Apheris product is a set of drug discovery applications enriched with the proprietary data of network participants. Our federated computing infrastructure, with built-in governance and privacy controls, ensures that data IP and ownership always stay with the data custodians.

As we are doubling down on structural biology use cases as a focus area within our drug discovery work, we are looking for a Principal ML Engineer to lead the technical direction for our structural biology models. This is a hands-on, high-impact role focused on advancing the state of the art in applying foundational models to structural biology problems. You’ll work closely with our leadership team and will serve as the technical authority on ML modeling, architecture, and experimentation in this domain. While this is not a people management role, you will guide and mentor other engineers and researchers on a content level.

You should bring deep expertise in training and deploying transformer-based models for protein structure prediction and related tasks. You must also understand the application of these models in drug discovery workflows and have a track record of setting strategy, breaking down complex technical problems, and delivering impactful ML systems.

If you want to be part of a mission-driven team building cutting-edge AI systems for life sciences – and you know what it takes to move from foundational models to domain-specific impact – this role is for you. 
What you will do
  • Drive the technical approach for ML applications in structural biology, particularly around fine-tuning and extending foundational models like OpenFold and ESMFold.
  • Design and implement model extensions for specific tasks such as protein complex and binding affinity prediction, including data distillation, benchmarking, and evaluation pipelines.
  • Work with our customers and potentially academic partners to define data preprocessing, selection, and benchmarking strategies for novel training tasks involving protein structures, complexes, and multimodal biological data.
  • Collaborate directly with our customers and partners to identify their data integration needs and develop tailored strategies for leveraging their data within a secure, federated network.
  • Build and maintain scalable, production-ready ML systems including training, inference, and deployment pipelines.
  • Collaborate cross-functionally to ensure models address real-world drug discovery needs.
  • Mentor and guide team members on a content level, supporting the planning and breakdown of complex structural biology modeling projects.
  • Influence strategic decisions on model architecture, data infrastructure, and model deployment.
  • Contribute to publications or open-source projects where relevant.


What we expect from you

  • By month 3: Develop a deep technical understanding of the Apheris product and how it maps to the structural biology use cases we are currently working on. Take ownership of a structural biology modeling stream. Build relationships with product and engineering leadership. Start a roadmap and experiment plan for adapting a pretrained structural biology model to one high-value use case.
  • By month 6: Deliver the first working model extension (e.g. binding affinity head), with a documented benchmarking framework and reproducible pipeline. Work with our customers and collaborating partners to understand their data landscape, then deliver and document reproducible data pipelines that make their data usable with that model.
  • By month 12: Lead multiple ML efforts in structural biology and demonstrate measurable progress in model performance and real-world impact. Mentor colleagues and set strategic direction for the domain.
You should apply if
  • You have a PhD (or equivalent experience) in ML, computational biology, or structural biology, and a track record of applying ML to real-world protein structure or drug discovery problems.
  • You have deep experience building and training transformer-based models (e.g. AlphaFold, ESMFold, OpenFold) using PyTorch, PyTorch Lightning, or similar frameworks.
  • You understand the data challenges of structural biology and can design scalable preprocessing, training, and evaluation workflows.
  • You’ve delivered ML systems at scale, including CI/CD, model versioning, and GPU-based distributed training.
  • You are comfortable working with modern MLOps tools and infrastructure, including Docker, Kubernetes, cloud platforms, and orchestration tools.
  • You’re comfortable navigating complex technical landscapes and can break down and drive execution on ambitious modeling plans.
  • You understand how structural biology models are used in the drug discovery lifecycle and can align your work to practical use cases.
Bonus points if
  • You have experience in federated learning, privacy-preserving ML, or secure model training.
  • You’ve published in top-tier ML or biology journals/conferences (e.g., NeurIPS, ICML, Nature Methods, Bioinformatics).
  • You’ve contributed to open-source ML or bioinformatics tooling.
  • You have experience guiding technical direction in a fast-paced, research-oriented environment.
What we offer you
  • Industry-competitive compensation, incl. early-stage virtual share options
  • Remote-first working – work where you work best, whether from home or a co-working space near you
  • Great suite of benefits, including a wellbeing budget, mental health benefits, a work-from-home budget, a co-working stipend and a learning and development budget
  • Regular team lunches and social events
  • Generous holiday allowance
  • Quarterly All Hands meet-up at our Berlin HQ or a different European location
  • A fun, diverse team of mission-driven individuals with a drive to see AI and ML used for good
  • Plenty of room to grow personally and professionally and shape your own role
About Apheris
Apheris powers federated life sciences data networks, addressing the critical challenge of accessing proprietary data locked in silos due to IP and privacy concerns. Publicly available datasets are insufficient to train high-quality ML models that meet industry requirements. Our product addresses this by enabling life sciences organizations to collaboratively train higher quality models on complementary data from multiple parties. We are now doubling down on two key areas of interest: structural biology and ADMET. 
Logistics
Our interview process is split into three phases:
  1. Initial Screening: If your application matches our requirements, we invite you to an initial video call to explore the fit. In this 30-45 minute interview, you will get to know us and the role. The interviewer will ask about your relevant experience and skills, and will answer any questions you have about the company and the role.
  2. Deep Dive: In this phase, a domain expert from our team will assess the skills and knowledge required for the role by asking about your relevant experience and how you would approach specific scenarios typical of the position.
  3. Final Interview: Finally, we invite you for up to three hours of targeted sessions with our founders, talking about our culture and meeting future co-workers on the ground.

Required profile

Experience

Industry :
Computer Hardware & Networking
Spoken language(s):
English

Other Skills

  • Mentorship
  • Strategic Thinking
  • Collaboration
  • Problem Solving
