AI Benchmark Engineer - Native Language Specialist | Hausa

Key Facts

Fixed term
Senior (5-10 years)
Hindi, Japanese, Czech, Arabic, German, Korean, Spanish, Turkish, English

Other Skills

  • Quality Assurance
  • Quality Control
  • Communication
  • Analytical Skills
  • Quality Driven
  • Teamwork
  • Detail Oriented
  • Problem Solving

Requirements

  • 5+ years of industry software engineering experience
  • Native Hausa speaker with high English proficiency
  • Strong proficiency in Python, standard shell scripting, and data processing
  • Extensive experience with Terminal/CLI-based development workflows and familiarity with coding agents

Roles & Responsibilities

  • Task engineering and evaluation of coding agents to design high-signal benchmarks
  • Asset creation: build realistic task environments in the native language with assets remaining in the target language
  • Prompting and failure analysis in the native language to identify weaknesses and edge cases
  • Implementation verification and calibration: develop reference implementations, deterministic verifier scripts, and calibrate task difficulty across model tiers using Terminal-Bench configurations

Job description

We are building a rigorous, verifiable evaluation suite of Terminal-Bench tasks designed to test the limits of large language models on multilingual software challenges. Our goal is to measure multilingual robustness across prompt language effects, non-English data processing, and complex locale/encoding edge cases in terminal workflows.

We are seeking experienced native-speaking software engineers to design, build, and validate these benchmarks. You will create high-signal, high-quality tasks that genuinely test a model's ability to handle multilingual environments without relying on English translation crutches.

Note: this is a remote, freelance opportunity.

Target Languages: Spanish, German, Czech, Turkish, Arabic (Egyptian), Korean, Japanese, Hausa, Hindi, Marathi.

Key Responsibilities

- Task Engineering: Design tasks for coding agents and evaluate agent performance on them to produce high-signal benchmarks.

- Asset Creation: Build realistic task environments using datasets and files in your native language. Crucially, these assets must remain in the target language to genuinely measure multilingual handling.

- Prompting & Translation: Write prompts in your native language and analyze failures to identify the weaknesses and edge cases where the AI breaks down.

- Implementation & Verification: Support the development of robust solutions (reference implementations) and write highly reliable, deterministic verifier scripts (using rubric-based judging only when strictly necessary).

- Calibration & Execution: Analyze execution logs and calibrate task difficulty (Easy to Very Hard) using standard Terminal-Bench run configurations against various model tiers (Haiku, Sonnet, Opus).

- Quality Assurance: Participate in a rigorous, 4-layer human quality control process (creation, human review, calibration review, and audit) alongside automated LLM-based checks to ensure fairness, grammatical accuracy, and benchmark integrity.
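
To illustrate what a deterministic verifier script could look like, here is a minimal Python sketch. It assumes the task's expected output is a UTF-8 text file; the function names and the comparison rules are hypothetical and are not taken from Terminal-Bench itself. The check uses no clock, randomness, network access, or locale-dependent defaults, so the same inputs always give the same verdict.

```python
import unicodedata
from pathlib import Path


def canonical_lines(text: str) -> list[str]:
    """Return the text as NFC-normalized lines without trailing whitespace."""
    text = unicodedata.normalize("NFC", text)
    # splitlines() treats \n, \r\n, and \r alike, so line-ending style is ignored.
    lines = [line.rstrip() for line in text.splitlines()]
    # Ignore blank lines at the very end of the output.
    while lines and lines[-1] == "":
        lines.pop()
    return lines


def verify(actual: str, expected: str) -> bool:
    """Pass only if both texts match line by line after canonicalization."""
    return canonical_lines(actual) == canonical_lines(expected)


def verify_files(actual_path: str, expected_path: str) -> bool:
    """Read both files as UTF-8 explicitly instead of relying on the system locale."""
    actual = Path(actual_path).read_text(encoding="utf-8")
    expected = Path(expected_path).read_text(encoding="utf-8")
    return verify(actual, expected)
```

For example, `verify("ɗaya\r\nbiyu \nuku\n\n", "ɗaya\nbiyu\nuku")` passes because only line endings and trailing whitespace differ, while any change to the Hausa words themselves fails the check.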

Required Qualifications

- Experience: 5+ years of industry experience in software engineering.

- Background: Proven track record at leading technology companies and/or graduation from top-tier engineering universities.

- Language: Native or near-native fluency in the target language (Hausa for this role), with a deep understanding of its grammar, register, and phrasing rules, plus high English proficiency.

- Technical Stack: Strong proficiency in Python, standard shell scripting, and data processing.

- Workflow: Extensive experience with Terminal/CLI-based development workflows and a working familiarity with coding agents.

- Domain Expertise: Deep technical understanding of multilingual text processing pitfalls, including:

  - Encoding/decoding robustness and Unicode normalization.

  - Locale-dependent conventions (collation, casing, non-Gregorian dates).

  - Text I/O, toolchain interoperability, and safe string operations.

  - (For specific languages) Bidirectional/RTL handling, font fallbacks, and rendering/typography in UI or artifacts.
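
As a small illustration of two of the pitfalls listed above (Unicode normalization and locale-dependent casing), the following Python sketch uses only the standard library. The example strings and the helper name `same_text` are chosen for illustration and are not part of the role's actual task set.

```python
import unicodedata

# Two visually identical spellings of the Spanish word "año".
composed = "a\u00f1o"      # 'ñ' as one code point (NFC form)
decomposed = "an\u0303o"   # 'n' followed by U+0303 COMBINING TILDE (NFD form)


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A raw comparison sees different code point sequences.
raw_equal = composed == decomposed
# After normalization the two spellings compare equal.
normalized_equal = same_text(composed, decomposed)

# str.lower() applies locale-independent Unicode rules. Turkish expects
# "I" to lowercase to dotless "ı" (U+0131), but Python returns "i".
lowered_ascii_i = "I".lower()
# Turkish dotted capital "İ" (U+0130) lowercases to "i" plus
# U+0307 COMBINING DOT ABOVE, which is two code points, not one.
lowered_dotted_i = "\u0130".lower()
```

Here `raw_equal` is `False` and `normalized_equal` is `True`, which is why a task or verifier that compares text byte for byte can fail on input that looks correct to a human reader.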

If interested, please submit your application, including an up-to-date copy of your CV in English.

AI is changing how the world communicates — and LILT is leading that transformation.

LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join a global community that thrives on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to Enterprises, Governments, and AI Developers around the world.

Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community—all through a streamlined application process tailored to your expertise.

Information collected and processed as part of your application process, including any job applications you choose to submit, is subject to LILT's Privacy Policy at https://lilt.com/legal/privacy.

At LILT, we are committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis. These tools are designed to support human decision-making and help us identify qualified candidates efficiently and objectively. All final hiring decisions are made by people. If you have any concerns, require accommodations, or would like to opt-out of the use of AI in our hiring process, please let us know at recruiting@lilt.com.

LILT is an equal opportunity employer. We extend equal opportunity to all individuals without regard to an individual’s race, religion, color, national origin, ancestry, sex, sexual orientation, gender identity, age, physical or mental disability, medical condition, genetic characteristics, veteran or marital status, pregnancy, or any other classification protected by applicable local, state or federal laws. We are committed to the principles of fair employment and the elimination of all discriminatory practices.
