Site Reliability Engineer (SRE) US

Work set-up: 
Full Remote
Contract: 
Salary: 
140 - 170K yearly
Experience: 
Senior (5-10 years)
Work from: 

Offer summary

Qualifications:

  • 4+ years of hands-on SRE/DevOps experience.
  • Experience with cloud environments such as AWS, GCP, and Azure.
  • Proficiency in scripting languages like Shell, Python, or Go.
  • Experience with Kubernetes, Docker, and Linux in production environments.

Key responsibilities:

  • Managing and troubleshooting a complex multi-cloud SaaS platform.
  • Building monitoring tools and alerting systems.
  • Ensuring site security and performance through testing and optimization.
  • Providing on-call support to maintain high availability.

Akeyless Security https://www.akeyless.io/
51 - 200 Employees

Job description

Description

Location: Remote, United States (East Coast Time Zone Preferred)

Akeyless Security delivers a cloud-native SaaS platform that integrates Vaultless Secrets Management with Certificate Lifecycle Management, Next Gen Privileged Access Management (Secure Remote Access), and Encryption Key Management to manage the lifecycle of all machine identities and secrets across all environments.

Trusted by Fortune 100 companies and industry leaders, Akeyless is redefining identity security for the modern enterprise, delivering the world’s first unified Secrets & Machine Identity platform designed to prevent the #1 cause of breaches: compromised identities and secrets. Akeyless is backed by the world’s leading cybersecurity investors and global financial institutions, including JVP, Team8, NGP Capital, and Deutsche Bank.

We are seeking an experienced and talented Site Reliability Engineer (SRE) to play a crucial role in the development of our highly robust, multi-cloud, and multi-region SaaS platform.

As an SRE at Akeyless, you will join a high-performing team responsible for leading and maintaining a complex multi-cloud platform and promptly addressing any issues that arise. This role operates within a dynamic and agile environment, utilizing cutting-edge technologies.

Responsibilities:

  • Running a complex hybrid cloud solution and troubleshooting problems as they arise, using automation whenever possible.
  • Building monitoring tools and alerting capabilities.
  • Building a multi-cloud infrastructure and platform components.
  • Maintaining site security in collaboration with our Security Engineering team.
  • Ensuring site performance and capabilities by participating in performance, load, and stress testing.
  • Promoting SRE principles and operational readiness within Akeyless engineering, emphasizing cloud engineering best practices.
  • Performing root cause analysis of problems and turning them into opportunities to positively impact performance, reliability, functionality, and security.
  • Advocating for the end customer and delivering a customer experience that exceeds expectations.
  • Providing on-call support, as needed, to ensure high availability and reliability of our production environment.


Requirements:

  • 4+ years of hands-on SRE/DevOps experience.
  • Monitoring scalable production systems for rapidly growing global infrastructure.
  • Architecting and implementing automation for cloud infrastructure.
  • Integrating new tools into our systems, such as monitoring, configuration, and alerting.
  • Experience in cloud environments (AWS, GCP, Azure).
  • Resolving NOC escalations and helping prevent recurrence of incidents.
  • Leading the NOC processes, procedures, and automations.
  • Diagnosing and troubleshooting complicated technical cases.
  • Developing, augmenting, and maintaining Ops documentation.
  • Excellent scripting skills and experience (Shell, Python, Go).
  • Ability to plan and execute independently while also being a strong team player.
  • Experience with Kubernetes and Docker in production (a must).
  • Experience with Linux (a must).

Advantages:

  • Responsibility for operating a high-performance SaaS platform (a huge advantage).
  • Root cause analysis skills and big-picture thinking.
  • Ability to document technical information.
  • Networking knowledge: protocols (e.g., TCP/IP, HTTP) and network operations.
  • Experience developing, augmenting, and maintaining Ops documentation.




Base salary: $140K-$170K

In addition: Company Stock Options + Benefits

The compensation package depends on experience.


Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s):
English

Other Skills

  • Problem Solving
