This is a remote position.
We are looking for a highly skilled Data Scientist to design and deploy end-to-end production pipelines for multimodal data synthesis. You will focus on building sophisticated VAE architectures, advanced video processing modules, and scalable synthetic data generation systems using a phased, Agile delivery approach.
Build standalone, testable modules for video metadata extraction, critical frame selection, and automated scene analysis.
Lead the design of VAE-based systems for complex data imputation and synthesis.
Execute a three-phase delivery model: 1) Extraction & Architecture, 2) Synthesis & Imputation, and 3) Cross-Component Optimization.
Implement rigorous integration testing and quality metrics to ensure the fidelity of synthetic outputs.
Deep expertise in VAE architecture design, training, and latent space manipulation for high-dimensional data synthesis and imputation.
Proven experience with 3D CNNs, scene change detection (inter-frame histogram comparison), and motion analysis (optical flow, peak detection).
Ability to generate and fuse synthetic data across tabular, text, audio, and video formats using statistical modeling and Gaussian copulas.
Advanced Python skills with a focus on modular design, production-level pipelines, and ffmpeg integration for audio/video handling.
Familiarity with specialized techniques such as MDH/CDS/DSM exclusion and multimodal merger integration.
