Key Facts

Remote From:

India

Category: Cloud Architect

Fixed term

Senior (5-10 years)

English

Hard Skills

Apache Spark, Data Engineering, Python (Programming Language), Data Streaming, SQL (Programming Language), IDoc (Intermediate Document), Data Management, Azure Data Factory, CI/CD, DevOps, +3 more

Requirements

Minimum 6 years of experience as a data engineer.
Expertise in Apache Spark, Delta Lake, and Azure Databricks.
Proficiency in Python programming and complex SQL queries.
Strong knowledge of Azure Data Services like ADLS Gen2, Synapse, and Data Factory.

Roles & Responsibilities:

Design and implement cloud-based data architectures and solutions.
Perform requirements analysis and translate them into technical designs.
Develop prototypes, PoCs, and MVPs for innovative data solutions.
Participate in technical project work and provide production support.

Job description

Location: remote

Project start: ASAP

Project end: December 31, 2025

Workload: Full-time

Tasks and qualifications:
• Advise in architectural discussions and workshops with customers; understand their business and technical requirements to create the desired technical architectures and solutions on cloud around data engineering, data lakes, data lakehouses, BI and ML/AI.

• Perform requirements scoping exercises with project and use case stakeholders.

• Translate requirements into desired technical solution design.

• Carry out PoCs, build prototypes and MVPs for new innovative solutions, and perform technology scouting on cloud (Azure) and big data technologies.

• Participate in hands-on technical project work, including actual project implementation tasks and some production support tasks (e.g. monitoring) as required.

• Evaluate and implement platform cost optimizations.

• Create and maintain technical documentation for the use cases, solutions and data platform.

• Analyze best practices and emerging concepts in cloud-based technologies, with special focus on the data and analytics cloud ecosystem.

Absolute Musts:
• In-depth knowledge of Apache Spark and experience in optimizing and performance tuning of Apache Spark data processing jobs

• In-depth knowledge of Delta Lake

• In-depth knowledge of data engineering on (Azure) Databricks

• Strong hands-on experience in Python programming and in writing complex SQL queries

• Strong hands-on experience in building complex data pipelines in Azure Data Factory

• Strong hands-on experience in the following Azure Data Services: ADLS Gen2, Synapse Serverless

• Experience in architecting and building enterprise-grade data platforms on cloud and developing big data solution architectures, preferably on Azure, incl.:

o Gathering requirements and mapping those to technical architectures

o Awareness of best practices for selecting a component mix of cloud services (e.g. ADF, Databricks, Synapse, etc.)

o Ability to assess pros and cons of architecture variations (e.g. Databricks vs. Snowflake vs. MS Fabric, Synapse vs. MS Fabric Lakehouse, Databricks vs. open source Spark, …).

Additionally required:
• Multi-year experience (>6 years) working in a data engineer role, incl.:

• Expertise in designing, building and maintaining large-scale data pipelines as well as processing (transforming, aggregating, wrangling) data.

• Proficient with streaming technologies like Kafka, Spark Structured Streaming or equivalent cloud services

• Good know-how of cloud computing concepts and of the Azure cloud platform, including networking, security and monitoring aspects

• Hands-on experience in using and applying IaC, CI/CD and DevOps practices in real data analytics projects, preferably using Azure DevOps and Terraform.

• Knowledge of Microsoft Fabric will be an added advantage.