
Senior Site Reliability Engineer

Remote: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Experience in Python and Terraform/OpenTofu
  • Strong knowledge of Linux and Docker
  • Familiarity with GCP and CI platforms
  • Experience with monitoring tools such as Grafana

Key responsibilities:

  • Own and maintain production infrastructure
  • Implement and maintain Infrastructure as Code
  • Assist customers with installations
  • Take ownership of system monitoring and CI pipelines
Scalr http://www.scalr.com
51 - 200 Employees

Job description

Company Overview
Scalr is a SaaS product company that offers everything necessary to scale Terraform. We place a strong emphasis on Terraform/OpenTofu, DevOps, GitOps, and the "everything as code" philosophy, prioritizing consistency and simplicity. Scalr builds a management layer atop Terraform/OpenTofu that helps DevOps teams scale across their entire organization. As an engineering organization, we also embrace a DevOps approach: we research cloud services, adopt best practices, and use Terraform/OpenTofu throughout. This helps us better understand our customers' challenges and use cases.

As we expand our offerings, we are seeking a skilled Senior Site Reliability Engineer with a passion for pushing the boundaries of technology to solve complex problems.

Position Overview
As a Senior SRE, you will contribute in multiple ways: by designing new architecture components, promoting and enforcing effective SRE and DevOps practices, and driving strategic technical improvements. You will play an integral role in our platform, contributing significantly to its reliability, scalability, and efficiency. The main infrastructure technology stack includes GCP, GitHub (including GitHub Actions for CI/CD), Terraform, Datadog, Sentry, and Grafana.

At Scalr, we believe that the best software is produced when engineers take pride and ownership in the work they accomplish. Consequently, engineers are expected to provide customer support. We value troubleshooting skills and customer empathy because, ultimately, writing good code and helping customers succeed lay the foundation for building great companies.

Qualifications:
🔸 Python (experience in Python scripting is enough)
🔸 Terraform/OpenTofu (for GCP)
🔸 Strong knowledge of Linux (RHEL/Debian, bash scripting)
🔸 Docker
🔸 Kubernetes
🔸 Google Cloud Platform
🔸 Experience with monitoring and logging tools such as Grafana, Prometheus, Datadog, or New Relic
🔸 Experience with CI platforms such as GitHub Actions, Drone, or CircleCI
🔸 Strong written and verbal communication skills

Would Be a Plus:
🔸 Leading SRE teams or initiatives
🔸 Experience with GitOps, Argo CD, Flux CD or similar
🔸 Chef, Omnibus, Ruby
🔸 JavaScript for GitHub Actions

As part of our team, you will:
🔸 Own and maintain production infrastructure in GCP and Kubernetes
🔸 Implement and maintain Infrastructure as Code in Terraform
🔸 Take part in rolling out new releases and improving their efficiency and reliability
🔸 Assist customers with on-prem installations of our product
🔸 Work with developers to ensure customer data security and isolation in Docker
🔸 Take ownership of system monitoring, logging and alerting
🔸 Own and maintain complex CI pipelines
🔸 Maintain a self-service test environment platform

Advantages/opportunities:
🔸 Our product itself is DevOps-oriented
🔸 Working with complex CI pipelines that involve cross-project end-to-end tests and continuous delivery
🔸 Participation in the migration from a monolithic architecture

Scalr Offers:
🌟 Work with an exciting engineering product in an enjoyable environment
👀 The opportunity to see how your ideas and visions are realized
💰 Attractive compensation and benefits package
📅 Long-term contract and tax compensation
🌐 Flexible schedule and the option to work fully remotely
🩺 Medical insurance
🏖️ 20 working days of paid vacation and 2 weeks of paid sick leave

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s): English

Other Skills

  • Empathy
  • Communication
  • Troubleshooting (Problem Solving)
