Role
At Sword, data powers our mission to build a pain-free world. The Data Platform Team is building the foundation that makes data usable across clinical, operations, AI, and product teams: a modern streaming lakehouse, distributed processing for low-latency data movement, durable workflows coordinating complex operations, and an API-first surface that lets producers and consumers serve themselves — without tickets, without infra context, and increasingly through agentic interfaces.
This role is for an engineer who thinks about data platforms as products. You’ve built or operated data systems at scale, you reason fluently from storage format up to query engine, and you want to shape the self-service and agentic experiences layered on top of that foundation.
To learn more about our Tech Stack, check here.
AI fluency is a core expectation at Sword Health. Every candidate is assessed against our three-level framework — be ready to share real examples of how AI is already part of how you work.
Explorer (Level 1) — Uses AI daily to boost personal productivity
Builder (Level 2) — Creates workflows and tools that elevate the whole team
Integrator (Level 3) — Embeds AI into products and processes at scale
Every hire must demonstrate at least Level 1. The expected level will vary depending on the seniority of the role.
Responsibilities
Design and evolve Sword’s streaming lakehouse — the foundation that every data consumer in the company depends on.
Build and operate distributed streaming pipelines that move data at low latency and high reliability.
Own the durable workflows that coordinate complex data movement across systems.
Shape the platform’s API surface — the interface producers and consumers use so they never need to touch infrastructure.
Drive evaluations and integrations with vendor data platforms, sitting inside the architectural trade-offs rather than just consuming the output.
Contribute to the self-service and agentic layer: interfaces designed to be consumed by humans, systems, and AI agents alike.
Partner with data engineers and analysts on contracts, governance, and lineage.
Build and maintain AI-ready data infrastructure that powers ML and AI-driven products across Sword.
Leverage AI coding assistants and LLMs to accelerate development, automate documentation, and raise code quality.
Work in a regulated environment where audit, compliance, and governance are part of every design.
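The "durable workflows" responsibility above can be illustrated with a minimal sketch of the core idea behind durable-execution engines such as Temporal: journal each completed step's result so a restarted run replays finished work instead of re-executing side effects. This is illustrative Python only, under assumed names, and not any engine's real API.

```python
class DurableWorkflow:
    """Toy illustration of durable execution (not a real engine's API):
    each completed step's result is journaled, so a re-run after a crash
    replays logged results instead of re-executing side effects."""

    def __init__(self, journal=None):
        self.journal = journal if journal is not None else {}
        self.executions = 0  # counts real (non-replayed) step runs

    def step(self, name, fn):
        if name in self.journal:      # already completed: replay the result
            return self.journal[name]
        result = fn()                 # first run: execute for real
        self.executions += 1
        self.journal[name] = result   # persist before moving on
        return result

# First run: both steps execute.
wf = DurableWorkflow()
wf.step("extract", lambda: [1, 2, 3])
wf.step("load", lambda: "loaded 3 rows")

# Simulated crash/restart with the same journal: steps replay, no re-runs.
wf2 = DurableWorkflow(journal=wf.journal)
assert wf2.step("extract", lambda: [1, 2, 3]) == [1, 2, 3]
assert wf2.executions == 0
```

Real engines achieve this with event histories and deterministic replay rather than a shared dictionary, but the contract is the same: a workflow that dies mid-run resumes where it left off.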
Requirements
Proven experience designing and operating data platforms at scale — warehouse, data lake, or lakehouse architectures in production.
Hands-on experience with a modern lakehouse table format — Iceberg strongly preferred; Delta Lake or Hudi also welcome. You understand how the format works under the hood: metadata layout, snapshots, manifests, compaction, copy-on-write vs. merge-on-read.
Clear mental model of catalogs (REST, Polaris, Glue, Unity, Hive) — their trade-offs, and how compute stays decoupled from storage.
Exposure to at least one vendor lakehouse or query platform — Snowflake, Starburst, or Databricks — at the level where you can reason about its architecture, not just use its UI.
Strong experience with a distributed processing engine — Flink strongly preferred; Spark also fine. You can reason about its internals, fine-tune a running job, and debug a pipeline that’s silently degrading.
Familiarity with durable execution — Temporal, Restate, or similar — or at minimum a solid mental model of what durable execution means and why it matters for data workflows.
Production experience building and operating APIs (REST or gRPC) at scale — good instincts about contracts, versioning, retries, rate limiting, and observability.
Solid understanding of Kafka and event-driven architectures (producers/consumers, partitioning, delivery semantics).
Comfortable in regulated environments (healthcare, fintech, gov) where audit, compliance, and data governance are part of every design.
Platform mindset: you design for self-service and API-first access, treating systems and agents — not only humans — as legitimate consumers.
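The retry and rate-limiting instincts called for in the API requirement above can be sketched as exponential backoff with full jitter, a common client-side policy for rate-limited APIs. The function name and parameter defaults below are illustrative assumptions, not any particular service's policy.

```python
import random

def backoff_schedule(base=0.1, cap=10.0, attempts=5, rng=random.random):
    """Exponential backoff with full jitter (illustrative sketch).
    The delay ceiling doubles each attempt, capped at `cap`; jitter
    spreads retries out to avoid thundering-herd spikes."""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng() * ceiling)  # full jitter: uniform in [0, ceiling)
    return delays

# With jitter disabled (rng always returns 1.0) the raw ceilings are visible:
assert backoff_schedule(rng=lambda: 1.0) == [0.1, 0.2, 0.4, 0.8, 1.6]
```

In a real client this schedule would drive `sleep` calls between retries of idempotent requests, typically only for retryable failures such as timeouts or HTTP 429/503 responses.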
Bonus
Deeper familiarity with open/REST catalogs (Polaris, Nessie, Unity) beyond basic use.
Observability stack fluency (Prometheus, Grafana, OpenTelemetry).
Prior work on agentic or AI-facing API surfaces, or MCP-style interfaces.
Experience in HIPAA, FedRAMP, or SOC 2 environments.
Exposure to dbt, DataHub, or data contract tooling.
Mindset and Collaboration
Service orientation: you build APIs (and increasingly agent-facing tools) that others love to use.
Reliability-first: failure modes, retries, and observability are part of day-one design.
Cross-functional: you enjoy working with data engineers, analysts, and ML engineers and understanding their problems.
Documentation mindset: good APIs come with great docs — and good docs now means machine-readable too.
Iterative: you ship incrementally and improve based on feedback.
