Senior AI Data Pipeline Engineer (Autonomous Driving)

Job description

We are looking for the best

About Us

42dot is a mobility AI company committed to solving mobility challenges with software and AI. As the Global Software Center of Hyundai Motor Group, 42dot pioneers the future of mobility by advancing the development of software-defined vehicles.

We develop safety-first, user-centric software-defined vehicle technologies that stay up to date through continuous updates, much like smartphones. By advancing software and AI technology, 42dot envisions a world where everything is connected and moves autonomously through a self-managing urban transportation operating system.

Our AI Data Pipeline Engineers build the core data processing pipelines and prepare the datasets that cutting-edge autonomous driving algorithms depend on. We develop distributed, scalable data pipelines for large-scale datasets (millions of scenes), as well as high-performance data-serving SDKs for ML model training and evaluation. The data pipelines we deliver substantially improve the efficiency of the ML model development lifecycle, including training, evaluation, deployment, and monitoring in the cloud environment.

Responsibilities

  • Develop high-scale, reliable data extraction pipelines that extract millions of raw data items from the data collection fleet and convert them into high-value scene data

  • Develop data labeling pipelines that run auto-labeling inference for autonomous driving algorithms

  • Develop an advanced autonomous driving data SDK, covering scene data search, dataset preparation, dataset loading, etc.

  • Build the data lakehouse for autonomous driving scene datasets, including sensor data, calibration data, and annotation data

  • Investigate performance bottlenecks throughout the data processing pipelines, from data processing latency and data search latency to Test Procedure (TP) coverage.

  • Bootstrap and maintain infrastructure for data platform components: data processing pipelines, databases, the data lakehouse, and data serving.

  • Collaborate with cross-functional teams, including the ML algorithm, ML application, and Cloud Infra teams, to align data pipelines with the overall autonomous driving system architecture.

Qualifications

  • Bachelor's degree or higher in Computer Science, Engineering, Robotics, or a similar technical field.

  • Minimum of 7 years of experience in Data Engineering, DataOps or ML Platform roles

  • Proficient in Python and solid experience in Python SDK development

  • Solid hands-on experience orchestrating data pipeline jobs with Databricks Workflows or Apache Airflow, and integrating data pipelines with machine learning models

  • Solid working experience with databases (e.g., MongoDB, PostgreSQL)

  • Extensive experience with data technologies and architectures such as Data Warehouse (e.g., Hive) or Lakehouse (e.g., Delta Lake)

  • Experience with Apache Spark or other big data computing engines

  • Excellent leadership and communication skills, with a demonstrated ability to lead technical projects

Preferred Qualifications

  • Experience with autonomous vehicle sensor data (e.g., LiDAR, camera, radar)

  • Experience with the ML model training lifecycle (e.g., data preparation, model training / validation / deployment)

  • Understanding of modern AI frameworks (e.g., PyTorch, TensorFlow)

  • Understanding of data governance principles and data privacy regulations, and experience implementing security measures to protect data

Interview Process

  • Application Review - Coding Test - 1st interview - 2nd interview - Offer Negotiation - Hiring

  • The screening procedures may vary depending on the position, schedule, or other circumstances.

    You will be individually notified of the screening schedule and results via the email address provided in your application.

Compensation

  • $133,000 - $254,000

Additional Information

  • In accordance with fair hiring practices, do not include any personal information unrelated to your job qualifications (e.g., Social Security Number, family relations, marital status, age, photo, physical condition, place of birth, etc.) in your resume.

  • All documents must be submitted in PDF format and under 30MB in size.

  • If you experience issues uploading your resume, please send it along with the job posting URL to recruit@42dot.ai.

  • We strongly encourage applications from U.S. veterans and candidates eligible for employment preference under applicable laws.

  • Qualified individuals with disabilities are encouraged to apply and will receive consideration under the Americans with Disabilities Act (ADA).

  • 42dot does not accept unsolicited resumes and will not pay fees for any such submissions.

Equal Opportunity Statement

  • 42dot is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees, regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, or veteran status.

※ Please review the following information before applying.
