Senior Data Engineer

Benefits: extra holidays, extra parental leave
Work set-up: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Over 2 years of experience as a Data Engineer.
  • Proficiency in Databricks, Spark, or Scala for data processing.
  • Experience with building scalable data pipelines in cloud environments.
  • Knowledge of CI/CD practices and system monitoring.

Key responsibilities:

  • Design, build, and maintain ETL/ELT data pipelines in Databricks.
  • Collaborate with AI/ML teams to support data modeling and retrieval.
  • Develop serverless APIs to expose data to frontend applications.
  • Monitor data quality, lineage, and reliability using best practices.

CME SME (https://www.gotocme.com/), 201-500 employees

Job description

This is a remote position.

We are seeking a self-motivated, intellectually curious Data Engineer to join our Data Science and Solutions team. This engineer will be responsible for building robust, scalable data pipelines using Databricks on AWS, integrating a wide range of data sources and structures into our AI and analytics platform. We have built our ‘minimum viable product’ and are now scaling up to support multi-tenancy in a highly secure environment.

The ideal candidate has more than 2 years’ experience in Databricks, preferably building scalable, high-quality data pipelines in a distributed, serverless cloud environment. You will be well-versed in CI/CD best practices, system monitoring, and the Databricks control surface, as you will be building infrastructure-as-code to deploy secure, isolated, and monitored environments and data pipelines for our end users and AI agents. Most of all, you will be an expert collaborator in a distributed, remote environment, a team player, and always learning.


Data Pipeline Development

  • Design, build, and maintain ETL/ELT pipelines in Databricks to ingest, clean, and transform data from diverse product sources.
  • Construct gold-layer tables in the Lakehouse architecture that serve both machine learning model training and real-time APIs.
  • Monitor data quality, lineage, and reliability using Databricks best practices.
AI-Driven Data Access Enablement

  • Collaborate with AI/ML teams to ensure data is modeled and structured to support natural language prompts and semantic retrieval using first- and third-party data sources, vector search, and Unity Catalog metadata.
  • Help build data interfaces and agent tools that interact with structured data, and AI agents that retrieve and analyze customer data with role-based permissions.

API & Serverless Backend Integration

  • Work with backend engineers to design and implement serverless APIs (e.g., via AWS Lambda with TypeScript) that expose gold tables to frontend applications.
  • Ensure APIs are performant, scalable, and designed with data security and compliance in mind.
  • Utilize Databricks and other APIs to implement provisioning, deployment, security, and monitoring frameworks for scaling up data pipelines, AI endpoints, and security models for multi-tenancy.


Requirements

  • 3+ years of experience as a Data Engineer or in a related role on an agile, distributed team, with a quantifiable impact on business or technology outcomes.
  • Proven expertise with Databricks, including job and workflow orchestration, change data capture, and medallion architecture.
  • Proficiency in Spark or Scala for data wrangling and transformation across a wide variety of data sources and structures.
  • Practitioner of CI/CD best practices and test-driven development, with familiarity with the MLOps/AIOps lifecycles.
  • Proven ability to work in an agile environment with product managers, frontend engineers, and data scientists.

Preferred Skills

  • Familiarity with AWS Lambda (Node.js/TypeScript preferred) and API Gateway or equivalent serverless platforms; knowledge of API design principles and experience working with RESTful or GraphQL endpoints.
  • Exposure to React-based frontend architecture and the implications of backend data delivery on UI/UX performance, including end-to-end telemetry to measure performance and accuracy for the end-user experience.
  • Experience with A/B testing, experiment and inference logging, and analytics.


Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s):
English

Other Skills

  • Teamwork
  • Collaboration
  • Problem Solving
