Offer summary
Qualifications:
- Domain knowledge in LLM evaluation and data curation.
- Extensive experience in designing and implementing LLM benchmarking.
Key responsibilities:
- Own, develop, and optimize LLM evaluation processes and methodologies.
- Generate synthetic data, curate labels, and deliver robust production code.
- Revamp benchmarking methods to assess LLMs for safety and effectiveness.
- Co-author research papers, patents, and presentations with the team.