Offer summary

Qualifications:

At least 3 years of experience in data engineering with expertise in Databricks, AI/ML, and big data tools.

Proficiency in Python or Scala for ETL development.

Strong understanding of Apache Spark, Delta Lake, and Databricks SQL.

Familiarity with cloud platforms such as AWS, Azure, or GCP.

Key responsibilities:

Design, build, and maintain scalable data pipelines using Databricks and Apache Spark.

Integrate data from various sources into data lakes or data warehouses.

Implement and manage Delta Lake architecture for reliable, versioned data storage.

Collaborate with data analysts, scientists, and stakeholders to meet data needs.

Job description

Job Title: Data Engineer – Databricks
Company: V4C.ai
Type: Full-time

About V4C.ai
V4C.ai is a premier IT services consultancy and a proud partner of Databricks, the Data Intelligence Platform, driving strategic business transformation.
We partner with organizations to accelerate their journey towards AI-driven success by offering a comprehensive suite of Dataiku and generative AI services. Our expertise in implementation, optimization, and enablement empowers clients to harness the full potential of their data, unlocking significant competitive advantages and fostering innovation.

Key Responsibilities

Design, build, and maintain scalable data pipelines using Databricks and Apache Spark
Integrate data from various sources into data lakes or data warehouses
Implement and manage Delta Lake architecture for reliable, versioned data storage
Ensure data quality, performance, and reliability through testing and monitoring
Collaborate with data analysts, scientists, and stakeholders to meet data needs
Automate workflows and manage job scheduling within Databricks
Maintain clear and thorough documentation of data workflows and architecture
Work on Databricks-based AI/ML solutions, including machine learning pipelines, in collaboration with data science teams

Requirements

Experience: 3+ years in data engineering with strong exposure to Databricks, AI/ML, and big data tools
Technical Skills:

Proficient in Python or Scala for ETL development
Strong understanding of Apache Spark, Delta Lake, and Databricks SQL
Familiar with REST APIs, including Databricks REST API

Cloud Platforms: Experience with AWS, Azure, or GCP
Data Modeling: Familiarity with data lakehouse concepts and dimensional modeling
Version Control & CI/CD: Comfortable using Git and pipeline automation tools
Soft Skills: Strong problem-solving abilities, attention to detail, and teamwork

Nice to Have

Certifications: Databricks Certified Data Engineer Associate/Professional
Workflow Tools: Experience with Airflow or Databricks Workflows
Monitoring: Familiarity with Datadog, Prometheus, or similar tools
ML Pipelines: Experience with MLflow or integration of machine learning models into production pipelines