
Site Reliability Engineer

Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience); 3+ years of experience in a Site Reliability Engineering or similar role.

Key responsibilities:

  • Design, implement, and maintain scalable AWS cloud infrastructure
  • Manage Kubernetes clusters, CI/CD pipelines, and system security
  • Troubleshoot production issues and optimize system performance
  • Collaborate with dev teams and participate in the on-call rotation
Cerbo EHR SME https://cer.bo/
11 - 50 Employees

Job description

The Company

Cerbo is a high-growth healthcare SaaS company, doing our part in the medical market to support holistic lifestyles and personalized medicine. Our software, Cerbo EHR, is a cloud-based electronic health records (EHR) and patient portal system. Healthcare offices across the country, and some around the world, use Cerbo for almost everything they do in their day-to-day operations.

Cerbo started as a developer’s nights-and-weekends project and has grown into one of the leading EHR systems for functional or “root cause” medicine and membership- or cash-based clinics. Because of our unique origins, we often approach things a bit differently: success for us is not just about the bottom line. It’s more about providing a great product, operating with integrity, and supporting our clients and our team.

During the past four years our team has grown, and thousands of practitioners and patients use our product. To support that growth, we’re looking for a Site Reliability Engineer to join our team.

What You’ll Do

As the Site Reliability Engineer (SRE), you will play a pivotal role in managing the future of our technology. You will work with our current SRE and engineering team to tune, optimize, and enhance our Amazon Web Services (AWS) infrastructure. If you're passionate about building and maintaining highly available, scalable systems and thrive in a fast-paced environment, we'd love to hear from you!


Primary Responsibilities

  • Design, implement, and maintain scalable and reliable cloud infrastructure on AWS
  • Manage and optimize Kubernetes clusters using Amazon EKS
  • Develop and maintain Infrastructure as Code using Terraform
  • Implement and improve CI/CD pipelines using GitHub Actions and ArgoCD
  • Ensure system security and implement best practices
  • Monitor and optimize system performance using Grafana and Prometheus
  • Track our AWS spending and suggest ways to cut operating costs
  • Troubleshoot and resolve complex issues in production environments
  • Collaborate with development teams to improve application reliability and performance
  • Participate in the on-call rotation with other SREs and engineering team members

Required Skills

  • Extensive experience with AWS services and best practices
  • Proficiency in managing Kubernetes clusters, particularly Amazon EKS
  • Strong knowledge of Helm for Kubernetes package management
  • Extensive experience with Infrastructure as Code, specifically Terraform
  • Familiarity with CI/CD pipelines, particularly GitHub Actions
  • Advanced Linux administration skills
  • Solid understanding of networking concepts and protocols
  • Experience in implementing and maintaining security best practices
  • Proficiency in using monitoring and observability tools, especially Grafana and Prometheus

Qualifications

  • Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience)
  • 3+ years of experience in a Site Reliability Engineering or similar role
  • Strong problem-solving skills and attention to detail
  • Excellent communication skills and ability to work in a team environment
  • Certifications in AWS, Kubernetes, or other relevant technologies are a plus

Compensation & Benefits

  • Competitive compensation based on experience
  • Comprehensive health, dental and vision benefits
  • 401(k) plan with matching company contribution
  • Short-term disability & long-term disability insurance
  • Paid Time Off and company holidays 
  • Full suite of remote working tools and processes

Location: 100% Remote

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. 

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Spoken language(s): English

Other Skills

  • Problem Solving
  • Communication
  • Teamwork
