Key Facts

Remote from: Egypt

Category: Data Engineer

Full time

Mid-level (2-5 years)

English

Hard Skills

Databricks, Azure Data Factory, Python (Programming Language), Microsoft Azure, Azure Synapse Analytics, Data Quality Assessment, Data Modeling, Cluster Development, Data Lakes, Apache Spark

Other Skills

• Teamwork
• Troubleshooting (Problem Solving)
• Problem Solving
• Willingness To Learn

Requirements

2+ years of professional experience in data engineering or closely related roles.
Strong Python skills for data processing, transformation, and automation, with experience using Pandas and PySpark.
Hands-on experience with Databricks (notebook development, clusters, job orchestration) and Azure Data Factory for building data pipelines.
Solid SQL skills (query writing, optimization, performance tuning) and familiarity with Delta Lake, data governance, and cloud security concepts.

Roles & Responsibilities

Design, develop, and maintain scalable data pipelines and ETL/ELT workflows to support business intelligence and analytics use cases.
Build and optimize data ingestion processes using Azure Data Factory and Databricks, ensuring data quality and consistency across all layers of the data platform.
Transform and process large datasets using PySpark and Python, applying best practices for performance and maintainability.
Collaborate with data architects and senior engineers to implement and maintain data models aligned with organizational standards.

DeepSource GmbH

About DeepSource GmbH

Welcome to DeepSource, your premier partner for comprehensive IT and AI services. At DeepSource, we excel in connecting businesses with top-tier multilingual talent, specializing in Information Technology and Artificial Intelligence. Our expertise lies in sourcing highly skilled professionals and providing tailored talent solutions that enable companies to accelerate growth, drive innovation, and embrace digital transformation. As your trusted partner, DeepSource empowers you to find the right talent and harness AI-driven solutions, ensuring your business remains competitive and agile in today’s dynamic digital landscape.

Company type: Startup

Founded: 2018

Company size: 51 - 200


Job description

We are looking for a motivated and technically solid L1 Data Engineer to join our growing Data & Analytics team. In this role, you will be responsible for designing, building, and maintaining the data architecture and infrastructure that supports our organization's data strategy. You will work hands-on to develop, test, and deploy reliable data solutions, ensuring pipelines are scalable, efficient, and aligned with business requirements.

This is an ideal opportunity for a data professional who is eager to deepen their expertise in cloud-native data platforms, particularly within the Microsoft Azure and Databricks ecosystem, and who thrives in a collaborative, fast-paced environment.

KEY RESPONSIBILITIES

• Design, develop, and maintain scalable data pipelines and ETL/ELT workflows to support business intelligence and analytics use cases.

• Build and optimize data ingestion processes using Azure Data Factory and Databricks, ensuring data quality and consistency across all layers of the data platform.

• Transform and process large datasets using PySpark and Python, applying best practices for performance and maintainability.

• Write and optimize complex SQL queries to support analytical reporting and data validation requirements.

• Collaborate with data architects and senior engineers to implement and maintain data models aligned with organizational standards.

• Monitor, troubleshoot, and resolve pipeline failures and data quality issues, applying root-cause analysis to prevent recurrence.

• Contribute to documentation of data pipelines, data dictionaries, and engineering standards.

• Support the team in exploring and evaluating new tools and approaches to continuously improve the data infrastructure.

REQUIREMENTS

• 2+ years of professional experience in a Data Engineering or closely related role.
• Strong proficiency in Python for data processing, transformation, and automation tasks.
• Hands-on experience with Pandas for data manipulation and PySpark for distributed data processing.
• Practical experience with Databricks, including notebook development, clusters, and job orchestration.
• Experience building and managing data pipelines with Azure Data Factory.
• Working knowledge of Azure Synapse Analytics, particularly Spark pool integration.
• Solid SQL skills, including query writing, optimization, and performance tuning.
• Familiarity with data engineering principles including incremental loading, data lake architecture, and Delta Lake.
• Understanding of data governance and security concepts within a cloud data platform.

NICE TO HAVE

• Experience with SQL Server migration projects, including schema conversion and data movement.
• Exposure to Terraform for Azure infrastructure provisioning and management.
• Familiarity with CI/CD practices applied to data engineering workflows.
• Experience with Delta Sharing or Lakehouse Federation concepts.

CERTIFICATION REQUIREMENT

Candidates are expected to hold or be actively working toward the Databricks Certified Data Engineer Associate certification. This certification validates foundational knowledge across the following domains:
• Databricks Lakehouse Platform architecture and capabilities
• ETL and ELT workflows using Spark SQL and PySpark
• Incremental data processing and structured streaming
• Production pipeline development and orchestration
• Data governance and security within the Databricks environment