Machine Learning Engineer

Remote: Full Remote

Offer summary

Qualifications:

  • Expertise in machine learning, particularly Vision-Language Models and chemical machine learning.
  • Strong background in document parsing and data extraction techniques.
  • Experience with model development, fine-tuning, and deployment in production environments.
  • Familiarity with NLP, computer vision, and chemistry-aware machine learning advancements.

Key responsibilities:

  • Lead the development and deployment of machine learning models for chemical data extraction.
  • Build and innovate data preprocessing and post-processing pipelines to enhance data capture rates.
  • Collaborate with cross-functional teams to define data requirements and validate extracted information.
  • Monitor model performance and iterate on tools to improve efficiency and reduce manual intervention.

Terray Therapeutics Scaleup http://www.terraytx.com/
51 - 200 Employees

Job description

Company Overview: Terray Therapeutics is a venture-backed biotechnology company led by pioneers and long-time leaders in artificial intelligence, synthetic chemistry, automation, and nanotechnology. We’re generating chemical data purpose-built to propel drug discovery into the information age — and we’re doing it on a larger scale and faster than has ever before been possible.

Our closed-loop system generates precise chemical datasets at unrivaled scale that work seamlessly with AI to systematically map biochemical interactions between small molecules and causes of disease. Iterative cycles of virtual molecular design and experimentation power AI and machine learning models, which in turn guide the next cycle of design. With a chemistry engine that measures billions of interactions daily and becomes increasingly precise with every cycle, we can answer an unprecedented array of questions, deriving insights that enable us to predictably create drugs for patients in need.

Position Summary: Terray Therapeutics is seeking a Machine Learning Engineer to lead the development of cutting-edge models and tools for extracting chemical data from patents, scientific publications, and other documents. This role combines expertise in machine learning (Vision-Language Models), chemical machine learning, and document parsing to improve the accuracy and efficiency of chemical data curation. You will collaborate with chemists, data engineers, machine learning engineers, and curation professionals to build scalable solutions that accelerate research and decision-making in the chemical sciences, and contribute to Terray’s industry-leading datasets for drug development.

The core responsibilities of this position are:

  • ML Model Development & Deployment
    • Extend existing VLM pipelines for data extraction.
    • Stay on top of the latest VLMs, vision tokenizers, and reasoning models, and use whatever works to increase the accuracy and throughput of our ingestion pipeline.
    • Develop assessment models which use Terray's internal data to maintain the quality and consistency of mined datasets.
    • Fine-tune models to solve edge cases (shorthands, abstract associations between structures and tables, etc.).
    • Eventually transition into working on machine reasoning over the extracted datasets, once the extraction pipeline has fully matured.

  • Data Pipeline Innovation
    • Build robust preprocessing pipelines to handle diverse PDF formats (e.g., scanned and text-based).
    • Develop post-processing rules and validation frameworks to improve data capture rates and reduce errors.
    • Implement active learning workflows to iteratively refine models using feedback from curation teams.
    • Success is measured by the information gain from datasets augmented with the results of your extraction models, and by the time taken to ingest data on mission-directed subjects.

  • Team Leadership & Collaboration
    • Manage and mentor curation professionals who operationalize ML tools, ensuring alignment with project goals.
    • Collaborate with domain experts to define data requirements and validate extracted chemical information.
    • Work with software engineers to integrate models into production workflows (e.g., APIs, cloud services).

  • Research & Improvement
    • Stay current with advances in NLP, computer vision, and chemistry-aware ML.
    • Propose and test novel approaches to handle edge cases (e.g., handwritten text and legacy patents).

  • Performance Monitoring
    • Track model metrics (precision, recall) and curation team efficiency.
    • Iterate on tools to reduce manual intervention and accelerate data delivery.

Experience and Qualifications: Part of Terray’s success is nurtured by a hands-on work environment where everyone is accountable, vested in a vision of excellence, and actively taking part in the success of the business. Terray supports a positive work environment where employees can feel engaged, recognized, and empowered to be creative.

Required Qualifications

  • Tangible, solid evidence you can succeed in this role.

Compensation Details: $147,000 - $226,800 annually, depending on knowledge; participation in the Company's option plan; 3% retirement safe harbor contribution; fully-paid medical, dental, vision, life, and disability benefits; flexible work hours; professional development; and access to state-of-the-art tools and datasets.

Required profile

Spoken language(s): English

Other Skills

  • Active Learning
  • Mentorship
  • Team Leadership
  • Collaboration
  • Problem Solving
