Data Engineer

Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

  • Hands-on experience with ML technologies
  • Building complex data pipelines (ETL)
  • Work experience with cloud platforms (GCP)
  • Understanding of code management tools (Git/SVN)
  • Experience with data engineering tools and practices

Key responsibilities:

  • Implement and design scalable data pipelines
  • Develop and maintain data models using guidelines
  • Document business glossary in data catalog
  • Evaluate and support business data models
  • Guide project teams in data model mapping
Sales Consulting
51 - 200 Employees

Job description

Responsibilities:

  • Implement and design scalable, optimized data pipelines for (pre-)processing and ETL for machine learning models;
  • Develop and maintain conceptual and logical data models using data modeling guidelines from the clients;
  • Document and maintain business glossary in the enterprise data catalog solution;
  • Evaluate business data models and physical data models for variances and discrepancies;
  • Support project team in adopting business data models;
  • Guide project team to map physical data models to business glossary.
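The pipeline work described above can be sketched as a minimal extract-transform-load flow. This is an illustration only; the CSV layout, table name, and field names are hypothetical stand-ins for a real upstream source:

```python
import csv
import io
import sqlite3

# Hypothetical raw feature data, standing in for an upstream source.
RAW = """user_id,age,score
1,34,0.82
2,,0.91
3,29,
"""

def extract(text):
    """Read raw CSV rows into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Drop incomplete rows and cast types, a typical ML pre-processing step."""
    clean = []
    for row in rows:
        if row["age"] and row["score"]:
            clean.append((int(row["user_id"]), int(row["age"]), float(row["score"])))
    return clean

def load(rows, conn):
    """Write the cleaned rows into a feature table and report the row count."""
    conn.execute("CREATE TABLE features (user_id INTEGER, age INTEGER, score REAL)")
    conn.executemany("INSERT INTO features VALUES (?, ?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM features").fetchone()[0]

conn = sqlite3.connect(":memory:")
loaded = load(transform(extract(RAW)), conn)
```

Keeping extract, transform, and load as separate functions lets each stage be tested in isolation, which matters once pipelines like these are orchestrated at scale.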


Knowledge/Experience:

     For senior experience:

  • Hands-on experience with technologies and frameworks used in ML, like scikit-learn, MLflow, TensorFlow;
  • Building complex data pipelines e.g. ETL;
  • Experience working in cloud environment, data cloud platforms (e.g. GCP);
  • Understanding of code management repositories like Git/SVN;
  • Familiar with software engineering practices like versioning, testing, documentation, code review;
  • Experience with Apache Airflow;
  • Experience setting up and troubleshooting both SQL and NoSQL databases;
  • Experience with monitoring and observability (ELK stack);
  • Deployment and provisioning with automation tools, e.g. Docker, Kubernetes, OpenShift, CI/CD;
  • Knowledge of MLOps architecture and practices;
  • Relevant work experience in ML projects;
  • Knowledge of data manipulation and transformation, e.g. SQL.

     For mid-level experience:

  • Design and Develop Data Pipelines: Create efficient and scalable data pipelines using GCP services such as Dataflow (Apache Beam), Dataproc (Apache Spark), and Pub/Sub;
  • Data Storage Solutions: Implement and manage data storage solutions using GCP services such as BigQuery, Cloud Storage, and Cloud SQL;
  • Data Analysis and Reporting: Optimize SQL queries for data analysis and reporting in BigQuery.
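One concrete instance of the query-optimization point: name the columns you need and filter on the partition column, rather than scanning the whole table with SELECT *. BigQuery itself is not available in a plain Python environment, so this sketch uses the built-in sqlite3 module just to show the query shape; the table and column names are hypothetical, with event_date standing in for a BigQuery partition column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_date TEXT, user_id INTEGER, amount REAL, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01", 1, 10.0, "x"),
        ("2024-01-01", 2, 5.0, "y"),
        ("2024-01-02", 1, 7.5, "z"),
    ],
)

# Name only the needed columns and restrict to one partition's worth of data;
# in BigQuery this avoids billing for a full-table scan of wide columns
# like payload.
query = """
    SELECT user_id, SUM(amount) AS total
    FROM events
    WHERE event_date = '2024-01-01'
    GROUP BY user_id
    ORDER BY user_id
"""
rows = conn.execute(query).fetchall()
```

The same SELECT-list and partition-filter discipline carries over directly to BigQuery, where it is the main lever for both cost and query latency.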


Required profile

Experience

Level of experience: Mid-level (2-5 years)
Spoken language(s):
English
