Requirements:
Hands-on experience with Kubernetes and AWS (or another major cloud provider) in production
Experience working with Docker containers in production environments
Programming experience in at least one of the following: Scala, Java, or Python
Job description
This is a remote position.
About the Company
A fast-growing digital health technology organization is focused on improving cardiovascular health and addressing hypertension through advanced, science-driven solutions.
The product helps users better understand and manage their heart health; cardiovascular disease is one of the leading causes of mortality worldwide. The mission is to empower individuals to take an active role in their health journey through digital therapeutics, enabling longer and healthier lives.
The solution is offered through employers and healthcare plans in a B2B2C model with large enterprise clients. The company is backed by top-tier investors and has recently experienced rapid growth in both revenue and customer base.
Role Overview
Due to continued expansion, the team is looking for a DevOps / Cloud Infrastructure Engineer to take ownership of infrastructure-related initiatives.
In this role, you will play a key part in building, scaling, and securing production systems, as well as improving reliability and operational efficiency.
Key Responsibilities
Design, build, and maintain scalable production infrastructure
Develop and support a robust ML pipeline and data lake architecture
Improve and evolve build and deployment processes, transitioning toward full CI/CD
Manage, secure, and scale Kafka clusters running on Kubernetes
Work closely with development and security teams to enhance infrastructure and compliance
Create monitoring systems, dashboards, alerts, and internal tools to detect and resolve issues proactively