Senior Site Reliability Engineer (Senior SRE) (REMOTE)

EXTRA HOLIDAYS - EXTRA PARENTAL LEAVE
Remote: 
Full Remote
Experience: 
Mid-level (2-5 years)

Offer summary

Qualifications:

  • 7-10 years of experience in agile software development, infrastructure operations, or application management
  • 4+ years of experience in Site Reliability Engineering or DevOps production operations supporting highly available environments
  • Experience with Java, Spring Boot, AWS/cloud infrastructure, containerization, metrics/tracing/logging tools, and incident management

Key responsibilities:

  • Design and improve reliability for SailPoint SaaS services independently or collaboratively
  • Coach engineering teams on observability best practices, lead post-incident reviews, collaborate with developers
  • Develop automation tools, influence architectural design, drive operational excellence for global scale
  • Mentor teammates for quality, enhance system performance, deliver optimal customer experience
  • Manage cross-functional requirements, provide guidance for enterprise operations as part of the SRE Center of Excellence
SailPoint Computer Software / SaaS Large https://www.sailpoint.com/
1001 - 5000 Employees
HQ: Austin

Job description

Your missions

SailPoint is the leader in identity security for the cloud enterprise. Our identity security solutions secure and enable thousands of companies worldwide, giving our customers unmatched visibility into the entirety of their digital workforce, ensuring workers have the right access to do their job – no more, no less.  

Identity Security Cloud is SailPoint’s Identity as a Service (IDaaS) product, and the Senior Site Reliability Engineer (Senior SRE) will be a key player on our Reliability Engineering team servicing the Identity Security Cloud product suite. We are looking for engineers with broad experience in building and running distributed systems at global scale. If you enjoy analyzing complicated problems, devising creative solutions, and collaborating across teams to build reliable, scalable, and impactful solutions, come join our Reliability Engineering team. We are a team of people who write software to solve scalability, observability, security, reliability, and operability problems.

What You’ll Make Happen:

  • Work independently or collaboratively on SailPoint SaaS services to design, develop, and improve end-to-end reliability and maintainability for all services

  • Coach engineering teams on observability best practices such as setting up well-defined Service Level Objectives (SLOs)

  • Lead engineering teams through post-incident reviews to define effective preventive actions

  • Collaborate effectively with developers to increase system reliability through short-term embedding programs

  • Enable our engineering teams to scale our enterprise operations by providing guidance, best practices and support as part of an SRE Center of Excellence

  • Manage cross-functional requirements working with Engineering, Product, Services, and other departments

  • Develop and implement automation tools and processes to streamline operations and enhance system performance

  • Be a mentor of quality for design reviews, code, test cases, automation, observability, root cause analysis, and self-healing

  • Influence architectural design, implementation, consolidation, and simplification for global scale

  • Focus on expanding your own skills and improving your teammates' skills

  • Drive operational excellence to deliver frictionless operation, a happy on-call experience, and optimal customer experience

Requirements

  • 7-10 years of experience in agile software development, infrastructure operations, or application management

  • 4+ years of experience in SRE (Site Reliability Engineering) or DevOps production operations supporting a highly available environment for a SaaS software or cloud service provider.

  • 3+ years of experience in software development with Java, Spring Boot, and associated frameworks for infrastructure operations and application management.

  • Experience with cloud infrastructure environments, preferably AWS, and infrastructure as code.

  • Experience with containerization technology and/or Kubernetes

  • Experience with metrics, tracing, and logging observability tools such as Prometheus, Grafana, Honeycomb, Jaeger, and Kibana

  • Experience with incident management, including conducting incident reviews

  • Experience with programming languages (Java, Python, Go, etc.). Strong understanding of Linux, software development, systems, networking, and cloud concepts

  • Strong interpersonal and teamwork skills, including the ability to set and enforce processes and to influence engineers who are not direct reports.

  • Excellent communication skills; English fluency of C1 or higher preferred

  • Bachelor's degree in Computer Science or other technical discipline, or equivalent experience

Schedule

We have 2 schedules for this role which rotate:

  • Monday through Friday

  • Wednesday through Saturday

Benefits and Perks for Mexico-based employees:

SailPoint is committed to providing our Crew Members with a benefits program that is both comprehensive and competitive. Our benefits program offers medical/pharmacy and dental coverage, as well as financial security for our Crew Members and their families. All premiums are paid by SailPoint. 

  • Full time remote employment  

  • Competitive salaries 

  • Company sponsored healthcare coverage for you and your family 

  • Annual performance bonus 

  • Phone and internet reimbursement 

  • Private equity at certain levels  

  • 33% vacation premium (prima vacacional)

  • Christmas Bonus equivalent to 30 days of salary  

SailPoint is an equal opportunity employer and we welcome everyone to our team.  All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Industry:
Computer Software / SaaS
Spoken language(s):
English

Soft Skills

  • Interpersonal Skills
  • Teamwork
