
Senior Data Engineer (Spark, AWS)

Remote: Full Remote
Experience: Senior (5-10 years)
Work from: fully flexible

Offer summary

Qualifications:

  • Strong experience with Apache Spark
  • Proficiency in AWS and Databricks
  • Knowledge of Hadoop and Snowflake
  • Experience with CI/CD and MLOps processes
  • Familiarity with DBT and Apache Airflow

Key responsibilities:

  • Lead design of data processing architectures
  • Design streaming pipelines using Spark
  • Implement data management and governance processes
  • Collaborate with Data Science on projects
  • Translate business needs into technical solutions
Addepto Information Technology & Services Startup https://addepto.com/
11 - 50 Employees

Job description

Addepto is a leading consulting and technology company specializing in AI and Big Data, helping clients deliver innovative data projects. We partner with top-tier global enterprises and pioneering startups, including Rolls Royce, Continental, Porsche, ABB, and WGU. Our exclusive focus on AI and Big Data has earned us recognition by Forbes as one of the top 10 AI companies.


As a Senior Data Engineer, you will have the exciting opportunity to work with a team of technology experts on challenging projects across various industries, leveraging cutting-edge technologies. Here are some of the projects we are seeking talented individuals to join:

  • Design and development of a universal data platform for global aerospace companies. This Azure and Databricks-powered initiative combines diverse enterprise and public data sources. The platform is in the early stages of development, covering architecture and process design and allowing freedom in technology selection.

  • Design and development of the data platform for managing electric and hybrid vehicle data. This project involves building a data pipeline for electric vehicle data, processing thousands of signals efficiently through streaming and batch services. The data powers IoT applications for business intelligence, customer support, maintenance, and AI insights, offering a chance to work with cutting-edge technology in electric mobility.

  • Design of the data transformation and downstream DataOps pipelines for a global car manufacturer. This project aims to build a data processing system for both real-time streaming and batch data. We'll handle data for business uses like process monitoring, analysis, and reporting, while also exploring LLMs for chatbots and data analysis. Key tasks include data cleaning, normalization, and optimizing the data model for performance and accuracy.


πŸš€ Your main responsibilities: 

  • Lead the design and development of scalable and efficient data processing architectures, infrastructure, and platform solutions for streaming and batch processing, using Big Data technologies like Apache Spark, Hadoop, Databricks, and Snowflake.

  • Design streaming pipelines using Apache Spark and Kafka.

  • Design and implement data management and data governance processes and best practices.

  • Contribute to the development of CI/CD and MLOps processes.

  • Develop applications to aggregate, process, and analyze data from diverse sources.

  • Collaborate with the Data Science team on Machine Learning projects, including text/image analysis and predictive model building.

  • Develop and organize data transformations using DBT and Apache Airflow.

  • Translate business requirements into technical solutions and ensure optimal performance and quality.
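For a concrete flavor of the streaming work above: a core pattern in these pipelines is windowed aggregation over a stream of signals. The sketch below illustrates the idea in plain Python with hypothetical signal names and window sizes; an actual pipeline on this role's stack would express the same logic with Spark Structured Streaming over a Kafka source.

```python
from collections import defaultdict

def tumbling_window_avg(events, window_seconds=60):
    """Group (timestamp, signal, value) events into fixed-size
    tumbling windows and compute the average value per signal
    per window. Events may arrive in any order."""
    # (window_start, signal) -> [running sum, count]
    sums = defaultdict(lambda: [0.0, 0])
    for ts, signal, value in events:
        window_start = ts - (ts % window_seconds)
        bucket = sums[(window_start, signal)]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Hypothetical vehicle telemetry: two readings in the first minute,
# one in the second.
events = [
    (0, "battery_temp", 30.0),
    (30, "battery_temp", 34.0),
    (65, "battery_temp", 40.0),
]
print(tumbling_window_avg(events))
# {(0, 'battery_temp'): 32.0, (60, 'battery_temp'): 40.0}
```

In Spark, the equivalent would be a `groupBy(window(...), "signal").avg("value")` over a streaming DataFrame, with watermarking to bound late data.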


Required profile

Experience

Level of experience: Senior (5-10 years)
Industry :
Information Technology & Services
Spoken language(s):
English

Other Skills

  • Verbal Communication Skills
  • Problem Solving
  • Analytical Thinking
  • Collaboration
