
Senior Data Engineer (Python/PySpark/Kafka) - Full Remote Portugal

Roles & Responsibilities

  • Lead the migration of workloads to Apache Iceberg and help establish a robust lakehouse architecture powering data-driven decisions
  • Architect and implement scalable batch and streaming data pipelines using Spark and Flink, including API integrations and formal data contracts with Backend Engineering teams
  • Contribute to unified data lineage and governance using DataHub, ensuring clean, governed, and accessible data pipelines for ML models and AI-driven products
  • Provide cross-functional support to the Core Team, promote platform-oriented solutions, and leverage AI coding assistants/LLMs to accelerate development

Requirements:

  • Proficiency in Python and PySpark for large-scale data processing
  • Hands-on experience with data lake formats such as Apache Iceberg, Delta Lake, or Hudi, and with Kafka/event-driven architectures
  • Proven ability to design, build, and orchestrate scalable data pipelines using tools like Airflow or Dagster, with strong SQL and data modeling skills
  • Platform-oriented mindset with ownership, strong cross-functional collaboration, and clear communication with technical and non-technical stakeholders; experience building AI-ready data infrastructure and using AI coding assistants/LLMs

Job description


ABOUT THE OPPORTUNITY

Join a digital healthcare company revolutionizing physical therapy through AI and wearable technology. As a Senior Data Engineer, you'll architect the lakehouse infrastructure behind virtual physical therapy platforms that help patients recover from musculoskeletal conditions through personalized, remotely guided exercise programs. This fully remote position, based in Portugal, offers the chance to build mission-critical data systems that directly improve patient outcomes, reduce pain, and lower healthcare costs. You'll spearhead the migration to Apache Iceberg, establish robust data pipelines, and create AI-ready data infrastructure that powers machine learning models across the platform, all while working with cutting-edge technologies in a healthcare environment where data quality and governance are paramount.

PROJECT & CONTEXT

You'll lead the migration of existing workloads to the Iceberg format, establishing and maturing the foundational lakehouse architecture that will serve as the backbone for data-driven decision making. Your responsibilities include architecting and building robust batch and streaming data pipelines using Spark and Flink, collaborating closely with Backend Engineering teams on API integrations and establishing formal data contracts, and contributing to a unified lineage and governance framework using DataHub. You'll provide comprehensive support to the Core Team in adopting new data platform capabilities, ensuring solutions are platform-oriented and designed for broad organizational use. Building AI-ready data infrastructure is central: you'll ensure clean, governed, and accessible data pipelines that power machine learning models and AI-driven products across the platform, while leveraging AI coding assistants and LLMs to accelerate development and improve code quality.

WHAT WE'RE LOOKING FOR (Required)

  • Demonstrated proficiency with Python and PySpark for data processing at scale
  • Hands-on experience with data lake formats: Iceberg, Delta Lake, or Hudi
  • Solid understanding of Kafka and event-driven architectures
  • Proven experience building and orchestrating data pipelines at scale
  • Strong SQL proficiency with comprehensive data modeling knowledge
  • Familiarity with workflow orchestration tools: Airflow, Dagster, or similar
  • Platform-oriented mindset: developing solutions for broad organizational use rather than individual, one-off needs
  • Ownership mentality: committed to seeing problems through to resolution
  • Clear communication skills: ability to articulate complex technical concepts to non-technical stakeholders
  • Highly collaborative: excels working alongside backend engineers, data engineers, and analysts
  • Pragmatic approach: balancing ideal solutions with practical delivery timelines
  • Experience building and maintaining AI-ready data infrastructure
  • Ability to leverage AI coding assistants and LLMs to accelerate development
  • English proficiency at B2 Upper Intermediate level minimum

NICE TO HAVE (Preferred)

  • Demonstrated expertise with Flink or comparable streaming frameworks
  • Proficiency in DBT and familiarity with the modern data stack
  • Experience with modern data platforms: BigQuery, Trino, Snowflake, or Databricks
  • Proven background developing self-service data platforms
  • Experience working in regulated healthcare or compliance-sensitive environments
  • Knowledge of data governance frameworks and metadata management
  • Understanding of healthcare data standards (HL7, FHIR)
  • Familiarity with DataHub or similar data catalog/lineage tools
  • Experience with infrastructure-as-code and CI/CD for data pipelines

Languages Required: English (B2 Upper Intermediate minimum)

Work Model: Full Remote — must be based in Portugal

Experience Level: Senior
