Data Engineer

fully flexible

Remote:

Full Remote

Work from:

Poland

Offer summary

Qualifications:

Proficiency in Big Data technologies
Experience with Databricks and Apache Airflow
Background in data processing and analysis
Knowledge of CI/CD and MLOps processes

Key responsibilities:

Design scalable data processing pipelines
Develop applications for data aggregation and analysis

Addepto Startup http://www.addepto.com

51 - 200 Employees

Job description

Addepto is a leading consulting and technology company specializing in AI and Big Data, helping clients deliver innovative data projects. We partner with top-tier global enterprises and pioneering startups, including Rolls Royce, Continental, Porsche, ABB, and WGU. Our exclusive focus on AI and Big Data has earned us recognition by Forbes as one of the top 10 AI companies.

As a Data Engineer, you will work with a team of technology experts on challenging projects across various industries, using cutting-edge technologies. Here are some of the projects for which we are seeking talented individuals:

Design and development of a universal data platform for global aerospace companies. This Azure- and Databricks-powered initiative combines diverse enterprise and public data sources. The data platform is in the early stages of development, so the work covers the design of architecture and processes and leaves freedom in technology selection.
Data platform transformation for an energy management association body. This project addresses critical data management challenges, boosting user adoption, performance, and data integrity. The team is implementing a comprehensive data catalog, built on Databricks and Apache Spark/PySpark, for simplified data access and governance. Secure integration solutions and enhanced data quality monitoring, using Delta Live Tables tests, have established trust in the platform. The intermediate result is a user-friendly, secure, and data-driven platform that serves as a basis for further development of ML components.
Design of the data transformation and subsequent DataOps pipelines for a global car manufacturer. This project aims to build a data processing system for both real-time streaming and batch data. We’ll handle data for business uses like process monitoring, analysis, and reporting, while also exploring LLMs for chatbots and data analysis. Key tasks include data cleaning, normalization, and optimizing the data model for performance and accuracy.

🚀 Your main responsibilities:

Design scalable data processing pipelines for streaming and batch processing using Big Data technologies like Databricks, Airflow and/or Dagster.
Contribute to the development of CI/CD and MLOps processes.
Develop applications to aggregate, process, and analyze data from diverse sources.
Collaborate with the Data Science team on Machine Learning projects, including text/image analysis and predictive model building.
Develop and organize data transformations using Databricks/dbt and Apache Airflow.
Translate business requirements into technical solutions and ensure optimal performance and quality.