Site Reliability Engineer (SRE) - US

Work set-up: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • 4+ years of hands-on SRE/DevOps experience.
  • Proficiency in cloud environments such as AWS, GCP, or Azure.
  • Strong scripting skills in Shell, Python, or Go.
  • Experience with Kubernetes, Docker, and Linux in production.

Key responsibilities:

  • Manage and troubleshoot a complex multi-cloud SaaS platform.
  • Build monitoring tools and automate infrastructure processes.
  • Ensure site security and performance through testing and collaboration.
  • Lead incident response and promote best practices in site reliability.

Akeyless Security https://www.akeyless.io/
51 - 200 Employees

Job description


Location: Remote, United States (East Coast Time Zone Preferred)

Akeyless Security delivers a cloud-native SaaS platform that integrates Vaultless Secrets Management with Certificate Lifecycle Management, Next Gen Privileged Access Management (Secure Remote Access), and Encryption Key Management to manage the lifecycle of all machine identities and secrets across all environments. 

Trusted by Fortune 100 companies and industry leaders, Akeyless is redefining identity security for the modern enterprise, delivering the world’s first unified Secrets & Machine Identity platform designed to prevent the #1 cause of breaches: compromised identities and secrets. Akeyless is backed by the world’s leading cybersecurity investors and global financial institutions, including JVP, Team8, NGP Capital, and Deutsche Bank.

We are seeking an experienced and talented Site Reliability Engineer (SRE) to play a crucial role in the development of our highly robust, multi-cloud, and multi-region SaaS platform.

As an SRE at Akeyless, you will join a high-performing team responsible for leading and maintaining a complex multi-cloud platform and promptly addressing any issues that arise. This role operates within a dynamic, agile environment and uses cutting-edge technologies.

Responsibilities:

  • Running a complex hybrid cloud solution and troubleshooting problems as they arise, using automation whenever possible.
  • Building monitoring tools and alerting capabilities.
  • Building a multi-cloud infrastructure and platform components.
  • Maintaining site security in collaboration with our Security Engineering team.
  • Ensuring site performance and capabilities by participating in performance, load, and stress testing.
  • Promoting SRE principles and operational readiness within Akeyless engineering, emphasizing cloud engineering best practices.
  • Performing root cause analysis of problems and turning the findings into opportunities to improve performance, reliability, functionality, and security.
  • Advocating for the end customer and delivering a customer experience that exceeds expectations.



Requirements

  • 4+ years of hands-on SRE/DevOps experience.
  • Monitoring scalable production systems for rapidly growing global infrastructure.
  • Architect and implement automation for cloud infrastructure.
  • Integrating new tools into our systems, such as monitoring, configuration, and alerting tools.
  • Experience in Cloud environments (AWS, GCP, Azure).
  • Resolve NOC escalations and help prevent recurrence of incidents.
  • Leading the NOC processes, procedures and automations.
  • Diagnose and troubleshoot complicated technical cases.
  • Develop, augment and maintain Ops documentation.
  • Excellent scripting skills and experience (Shell, Python, Go).
  • Ability to plan and execute independently while also being a strong team player.
  • Experience with Kubernetes and Docker in production - MUST.
  • Experience with Linux - MUST.


Advantages:

  • Responsibility for high-performance SaaS platform operation - huge advantage.
  • Root cause analysis skills and big-picture thinking.
  • Ability to document technical information.
  • Networking knowledge: protocols (e.g., TCP/IP, HTTP) and network operations.





Base salary: $130K-$160K

In addition: Company Stock Options + Benefits

The compensation package depends on experience


Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s): English

Other Skills

  • Troubleshooting (Problem Solving)
  • Problem Solving
