This role is for one of our clients.
Compensation: $90–$110 per hour
We are building a benchmark dataset to evaluate AI models on professional document understanding and instruction following within the Engineering & Built Environment domain.
Tasks consist of complex, multi-step requests grounded in real-world workspace files (technical drawings, project specifications, engineering reports), web search, and code execution — each paired with a clearly defined ground truth output and an objective evaluation rubric. You will be responsible for authoring tasks that test an AI's ability to interpret engineering documentation, follow multi-step instructions, and produce precise, well-structured outputs.
Requirements
We expect a minimum commitment of 15–20 hours per week.
Ideal candidates have 3+ years of hands-on experience in one or more of the following sub-domains:
We consider all qualified applicants without regard to legally protected characteristics and provide reasonable accommodations upon request.
