Offer summary

Qualifications:

Advanced knowledge of Apache Spark, Delta Lake, and Data Engineering on Azure Databricks.

Strong programming skills in Python and SQL for complex data pipeline development.

Extensive experience (over 6 years) in designing and maintaining large-scale data pipelines and cloud-based data platforms.

Proficiency with Azure Data Services such as ADLS Gen2, Synapse, and Data Factory, along with cloud computing concepts.

Key responsibilities:

Advising clients in architectural discussions and workshops to develop technical solutions on Cloud.

Analyzing requirements and translating them into technical architecture designs.

Implementing proofs of concept and prototypes, and building MVPs for innovative cloud data solutions.

Participating in technical project work, including implementation and production support tasks.

Job description

Location: remote

Project start: ASAP

Project end: December 31, 2025

Workload: Full-time

Tasks and qualifications:
• Advise in architectural discussions and workshops with customers, and understand their business and technical requirements to create the desired technical architectures and solutions on Cloud around data engineering, data lakes, data lakehouses, BI and ML/AI.

• Perform requirements scoping exercises with project and use case stakeholders.

• Translate requirements into desired technical solution design.

• Carry out PoCs, build prototypes and MVPs for new innovative solutions, and perform technology scouting on cloud (Azure) and Big Data technologies.

• Participate in hands-on technical project work, including actual project implementation tasks and some production support tasks (e.g. monitoring) as required.

• Evaluate and implement platform cost optimizations.

• Create and maintain technical documentation for the use cases, solutions and data platform.

• Analyze best practices and emerging concepts in cloud-based technologies, with special focus on the Data and Analytics cloud ecosystem.

Absolute Musts:
• In-depth knowledge of Apache Spark and experience in optimizing and performance tuning of Apache Spark data processing jobs

• In-depth knowledge of Delta Lake

• In-depth knowledge of Data Engineering on (Azure) Databricks

• Strong hands-on experience in Python programming and in writing complex SQL queries

• Strong hands-on experience in building complex data pipelines in Azure Data Factory

• Strong hands-on experience in the following Azure Data Services: ADLS Gen2, Synapse Serverless

• Experience in architecting and building enterprise-grade data platforms on Cloud and developing Big Data solution architectures, preferably on Azure, incl.:

o Gathering requirements and mapping those to technical architectures

o Awareness of best practices for selecting a component mix of Cloud services (e.g. ADF, Databricks, Synapse, etc.)

o Ability to assess pros and cons of architecture variations (e.g. Databricks vs. Snowflake vs. MS Fabric, Synapse vs. MS Fabric Lakehouse, Databricks vs. open source Spark, …).

Additionally required:
• Multi-year experience (>6 years) working in a data engineer role, incl.:

• Expertise in designing, building and maintaining large-scale data pipelines as well as processing (transforming, aggregating, wrangling) data.

• Proficiency with streaming technologies like Kafka, Spark Structured Streaming or equivalent cloud services

• Good know-how of cloud computing concepts and of the Azure cloud platform, including networking, security and monitoring aspects

• Hands-on experience in using and applying IaC, CI/CD and DevOps practices in real Data Analytics projects, preferably using Azure DevOps and Terraform.

• Knowledge of Microsoft Fabric is an added advantage.