Middle/Senior Data Engineer

Remote: Full Remote

Offer summary

Qualifications:

  • Deep technical skills in AWS Glue and hands-on experience with Python and SQL.
  • Experience with Terraform or other Infrastructure-as-Code tools is mandatory.
  • Good understanding of AWS services like S3, SNS, and Lambda.
  • Upper-Intermediate English proficiency is required.

Key responsibilities:

  • Building and maintaining end-to-end ETL pipelines using AWS Glue and PySpark.
  • Integrating data sets using AWS services and authoring ETL processes with Python.
  • Monitoring ETL processes using CloudWatch events and validating data with Athena.
  • Providing production support and enhancements for ETL development.

Bonapolia (Scaleup), 11-50 employees
https://www.bonapolia.com/

Job description

We are looking for a Middle+/Senior Data Engineer (ETL, Python, PySpark):

  • Tech Level: Middle, Senior
  • Language Proficiency: Upper-Intermediate
  • FTE: 1
  • Employment type: Full time
  • Candidate Location: Poland
  • Working Time Zone: CET. The team is distributed across Poland (CET), Prague, India (IST), and the US (EST). CET is preferred for overlap, but flexibility is possible given the global setup.
  • Start: ASAP
  • Planned Work Duration: 12


Technology Stack: Python, SQL, AWS, PySpark, Snowflake (must), GitHub Actions (must), Terraform (optional); Airflow, Datadog, or Dynatrace are a plus.

Customer Description:

Our Client is a leading global management consulting firm.

Numerous enterprise customers across industries rely on our Client's platform and services.

Project Description:

This project is part of a data initiative within the firm’s secure technology ecosystem.

The focus is on building and maintaining robust data pipelines that collect and process data from multiple enterprise systems such as Jira, GitHub, AWS, ServiceNow, and other cloud infrastructure platforms.

The objective is to enable leadership to gain actionable insights aligned with strategic outcomes, and to support product and service teams in targeting the right user groups and measuring the effectiveness of various GenAI productivity initiatives.

Project Phase: ongoing

Project Team: 10+

Soft Skills:

  • Problem-solving approach to work
  • Ability to clarify requirements with the customer
  • Willingness to pair with other engineers when solving complex issues
  • Good communication skills

Hard Skills / Need to Have:

  • Deep technical skills in AWS Glue (Crawler, Data Catalog)
  • Hands-on experience with Python
  • SQL experience
  • Experience with Terraform or other Infrastructure-as-Code (IaC) tools is mandatory
  • CI/CD with GitHub Actions
  • dbt modelling
  • Good understanding of AWS services like S3, SNS, Secrets Manager, Athena, and Lambda (a short sketch follows this list)
  • Additionally, familiarity with any of the following is highly desirable: Jira, GitHub, Snowflake.
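For illustration only: a minimal Python sketch of how these AWS services commonly fit together in a pipeline like this one. The secret name and topic ARN are hypothetical placeholders, not details from this role.

    import json

    import boto3

    # Hypothetical names; in this stack they would normally be provisioned via Terraform.
    SECRET_ID = "snowflake/etl-service-account"  # Secrets Manager entry (placeholder)
    TOPIC_ARN = "arn:aws:sns:eu-central-1:123456789012:etl-alerts"  # SNS topic (placeholder)


    def load_snowflake_credentials() -> dict:
        # Fetch warehouse credentials from AWS Secrets Manager instead of hard-coding them.
        response = boto3.client("secretsmanager").get_secret_value(SecretId=SECRET_ID)
        return json.loads(response["SecretString"])


    def send_alert(subject: str, message: str) -> None:
        # Publish a pipeline alert (e.g. a failed load) to SNS, which can fan out to email.
        boto3.client("sns").publish(TopicArn=TOPIC_ARN, Subject=subject, Message=message)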

Hard Skills / Nice to Have (Optional):

  • Experience working with Snowflake and an understanding of Snowflake architecture, including concepts like internal and external tables, stages, and masking policies.

Responsibilities and Tasks:

Building and maintaining end-to-end ETL pipelines, primarily using AWS Glue and PySpark, with Snowflake as the target data warehouse (a minimal sketch follows this list):

  • New development, enhancements, defect resolution, and production support of ETL processes built on AWS-native services.
  • Integration of data sets using AWS services such as Glue and Lambda functions.
  • Use of AWS SNS to send emails and alerts.
  • Authoring ETL processes in Python and PySpark.
  • Monitoring ETL processes with CloudWatch events.
  • Connecting to data sources such as S3 and validating data using Athena.
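As a rough sketch only, not the project's actual code: a Glue job of the shape described above might look like this in PySpark. The catalog database, table name, and Snowflake options are hypothetical placeholders; credentials would be resolved from Secrets Manager rather than hard-coded.

    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from pyspark.sql import functions as F

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read source data registered in the Glue Data Catalog (e.g. by a Crawler).
    # "raw_db" and "github_events" are placeholder names.
    source = glue_context.create_dynamic_frame.from_catalog(
        database="raw_db", table_name="github_events"
    )

    # Light PySpark transformation before loading.
    df = (
        source.toDF()
        .dropDuplicates(["event_id"])
        .withColumn("loaded_at", F.current_timestamp())
    )

    # Load into Snowflake via the Snowflake Spark connector; all options are placeholders.
    sf_options = {
        "sfURL": "myaccount.snowflakecomputing.com",
        "sfDatabase": "ANALYTICS",
        "sfSchema": "RAW",
        "sfWarehouse": "ETL_WH",
        "sfUser": "ETL_USER",
        "sfPassword": "***",  # in practice, fetched from Secrets Manager
    }
    (
        df.write.format("net.snowflake.spark.snowflake")
        .options(**sf_options)
        .option("dbtable", "GITHUB_EVENTS")
        .mode("append")
        .save()
    )

    job.commit()

On failure, a CloudWatch event rule would typically trigger the SNS alert shown earlier, and Athena can be used to spot-check the staged S3 data against what landed in Snowflake.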

📩 Ready to Join?
We look forward to receiving your application and welcoming you to our team!

Required profile

Spoken language(s):
English

Other Skills

  • Communication
  • Problem Solving
