Data Engineer (Vision)

Remote: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

Bachelor's or Master's in Computer Science or a related STEM field; 3-6+ years of experience in data engineering; strong proficiency in Python or a systems programming language; experience with video processing libraries and cloud infrastructure.

Key responsibilities:

  • Build scalable pipelines for video data ingestion and processing
  • Optimize video storage, retrieval, and transformation workflows

Reality Defender Computer Hardware & Networking Startup https://realitydefender.com/
11 - 50 Employees

Job description

About Reality Defender

Reality Defender provides accurate, multi-modal AI-generated media detection solutions to enable enterprises and governments to identify and prevent fraud, disinformation, and harmful deepfakes in real time. A Y Combinator graduate, Comcast NBCUniversal LIFT Labs alumnus, and backed by DCVC, Reality Defender is the first company to pioneer multi-modal and multi-model detection of AI-generated media. Our web app and platform-agnostic API, built by our research-forward team, ensure that our customers can swiftly and securely mitigate fraud and cybersecurity risks in real time with a frictionless, robust solution.

YouTube: Reality Defender Wins RSA Most Innovative Startup

Why we stand out:

  • Our best-in-class accuracy is derived from our sole, research-backed mission and use of multiple models per modality

  • We can detect AI-generated fraud and disinformation in near- or real time across all modalities including audio, video, image, and text.

  • Our platform is designed for ease of use, featuring a versatile API that integrates seamlessly with any system, an intuitive drag-and-drop web application for quick ad hoc analysis, and platform-agnostic real-time audio detection tailored for call center deployments.

  • We’re privacy first, ensuring the strongest standards of compliance and keeping customer data away from the training of our detection models.

Role Overview

We are looking for a Data Engineer (Video Data) to enhance video ingestion, processing, and augmentation workflows to support our deepfake detection models and benchmarking efforts. The ideal candidate will have expertise in video processing pipelines, data engineering at scale, and experience working with large, diverse video datasets. You’ll work on building scalable, high-performance workflows for acquiring, storing, and transforming large-scale video datasets. You will collaborate with machine learning engineers and researchers to optimize data workflows, ensuring high-quality input for AI-driven fraud detection and disinformation mitigation.

Key Responsibilities
  • Video Ingestion & Processing: Build scalable pipelines for batch and streaming video data, automating preprocessing, transcoding, and augmentation.

  • Infrastructure Optimization: Optimize video storage, retrieval, and transformation workflows, including compression, metadata extraction, and format conversions.

  • Data Sourcing & Augmentation: Expand video dataset coverage through API integrations, web scraping, and social media ingestion.

  • Collaboration & Research Support: Partner with ML teams to enhance training datasets, support benchmarking efforts, and contribute to synthetic media generation.

Basic Skills & Experience
  • A Bachelor’s or Master’s degree in Computer Science, Data Engineering, Physics, Mathematics, Electrical Engineering, or a related STEM field

  • 3-6+ years of experience in data engineering or another developer role with a focus on video data.

  • Strong software development experience in Python or a systems programming language, with experience using video processing libraries (e.g. libav, OpenCV).

  • Experience with video dataset augmentation and processing

  • Experience building systems on cloud infrastructure

  • Experience with SQL and NoSQL databases

Preferred Skills
  • Exposure to deepfake detection techniques and synthetic media processing.

  • Familiarity with machine learning concepts and how video data is used in AI/ML pipelines.

  • Experience building distributed systems for data streaming and processing at scale.

  • Experience with GPU acceleration for video processing.

  • Hands-on experience with social media scraping and API-based video collection

  • Familiarity with the statistical foundations of bias and balanced dataset construction

  • Experience with data lake or lakehouse technology, such as Databricks

  • Knowledge of multi-modal AI models

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Computer Hardware & Networking
Spoken language(s): English

Other Skills

  • Collaboration
