Sword Health is on a mission to free 2 billion people from pain by pioneering the field of AI Care.
Sword is #1 in clinical and cost-saving outcomes because we have the most advanced AI, developed and overseen by the most clinically rigorous team.
Sword has developed the first platform to predict, prevent, and treat pain by starting with the expertise of world-class doctors of physical therapy and then building an interactive AI experience to deliver care that members can use anywhere, anytime, under a clinician’s supervision.
Are you ready to transform healthcare with cutting-edge AI? At Sword Health, we’re harnessing the power of Large Language Models (LLMs) and advanced data engineering to revolutionize how we deliver personalized care. We're looking for a Senior Data Engineer to join our dynamic Predict team, where you'll play a pivotal role in building the data infrastructure that drives our AI and machine learning projects from the ground up.
As a Senior Data Engineer, you'll design, develop, and scale systems that analyze complex health data, enabling AI models to predict, understand, and personalize care for millions of users. Your work will empower our AI to not only understand member behavior across a variety of health and pain conditions but also to continuously improve the care experience through data-driven insights.
What you’ll do:
Design, build, and maintain ETL/ELT data pipelines to efficiently ingest and process data from multiple sources (e.g., Cloud Storage, BigQuery) to support AI and machine learning models.
Collaborate with data science and AI teams to develop and implement AI models, leveraging machine learning (ML) and deep learning techniques to analyze musculoskeletal health and pain data, and to improve digital health interventions.
Architect and optimize data solutions to ensure a seamless, person-level identity across diverse datasets and sources (structured and unstructured data), supporting AI-driven insights.
Develop and implement data engineering strategies for processing large-scale datasets, including clinical notes, sensor data, and other medical records, using technologies such as PySpark and cloud-based data platforms (e.g., GCP, AWS).
Drive the data engineering efforts for AI and ML projects, supporting data pipeline design, data preparation, feature engineering, and ensuring high-quality data for model training and productionization.
Work closely with cross-functional teams (analytics, marketing, clinical operations) to translate business requirements into data-driven solutions, helping develop AI and machine learning products that address real-world healthcare problems.
Stay at the forefront of emerging technologies in AI, ML, and data engineering, experimenting with new tools, techniques, and approaches to continuously improve data processing and model deployment workflows.
About you:
Experience in data engineering, with a deep understanding of data pipelines, cloud infrastructure, and big data technologies.
Expertise in supporting AI/ML models—familiarity with LLMs and their integration into healthcare applications is a huge plus!
A passion for creating data-driven solutions that enhance the healthcare experience and deliver real-world impact.
A collaborative mindset, able to work across interdisciplinary teams including AI researchers, data scientists, and healthcare professionals.
Experience with cloud platforms (AWS, GCP, Azure) and scalable data frameworks (Spark, Kafka, etc.).
Experience in designing and implementing scalable, efficient data pipelines, with a focus on SQL and modern data modeling techniques.
Experience in developing data engineering solutions using Python and PySpark, with particular experience in applying these tools to AI and ML tasks.
Bonus points if you have:
Experience working with healthcare data, including medical claims, EHR, FHIR, or other clinical data formats.
Proven track record of creating production-ready ML models, including model deployment, monitoring, and retraining.
Experience with advanced ML tools and frameworks, such as TensorFlow, PyTorch, or scikit-learn, for model development and evaluation.
Strong ability to propose and implement changes to improve data architecture, scalability, and performance, specifically in the context of AI-driven applications.
Knowledge of data privacy and governance standards relevant to healthcare, including data security, anonymization, and compliance with healthcare regulations.
*Please note that this position does not offer relocation assistance. Candidates must possess a valid EU visa and be based in Portugal (for Portugal location).
Required profile
Experience
Level of experience: Senior (5-10 years)
Industry:
Health, Sport, Wellness & Fitness
Spoken language(s):
English