Staff SRE

Remote: Full Remote

Offer summary

Qualifications:

  • Demonstrable experience in building and operating large-scale distributed systems.
  • Solid experience with server-side web development and Infrastructure-as-Code, particularly in Golang and Terraform.
  • Strong understanding of security practices related to SRE and experience with major cloud providers.
  • Exceptional communication skills and a strong customer focus.

Key responsibilities:

  • Lead the development and refinement of SRE tools and processes to enhance software delivery and operational efficiency.
  • Shape and communicate the technical roadmap, driving OKRs to improve system reliability and scalability.
  • Mentor and coach SRE team members, promoting a culture of learning and growth within the organization.
  • Proactively identify and resolve performance and scalability bottlenecks in systems and infrastructure.

G2i Inc. Human Resources, Staffing & Recruiting TPE https://g2i.co/
11 - 50 Employees

Job description

About the Job: 

Software powers the world, and LaunchDarkly empowers all teams to deliver and control the best software. We serve trillions of feature flags daily to help teams ship better software faster and eliminate risk for companies big and small.

We're based in downtown Oakland and growing quickly. You'll help us tackle some of the most challenging engineering problems around, like delivering feature flags to hundreds of millions of users worldwide in milliseconds.

In this role, you'll oversee the health of our core systems and reliability tooling, respond to and mitigate incidents quickly, and identify and drive opportunities that make our core services more resilient. You will also identify and develop force-multiplying capabilities for our internal engineering teams, helping our engineers become more effective at shipping robust code and thinking about reliable design earlier in the lifecycle.

Our core daily technologies include AWS, Golang, CockroachDB, ElasticSearch, Redis, Flink, Kinesis, and Terraform.

Responsibilities:

  • Lead the development and continuous refinement of SRE tools and processes to improve software delivery, observability, reliability and operational efficiency. Your impact extends beyond your team’s boundary to proactively improve our overall service health.

  • Shape and communicate the technical roadmap, driving Objectives and Key Results (OKRs) to deliver strategic insights that elevate system reliability, scalability, and operational effectiveness.

  • Define and lead the technical architecture for LaunchDarkly’s reliability strategies, establishing a cohesive framework that enhances resilience, flexibility, and alignment with business goals.

  • Uplevel our engineering team to deliver their services with higher autonomy, reliability, and performance through offerings written in Go and Terraform, or delivered through existing tools.

  • Develop, communicate, and implement strategic goals and objectives for the SRE organization, ensuring alignment with company-wide initiatives and promoting long-term operational success.

  • Partner with various team members to define and mature our SRE culture through principles, technical frameworks, tooling, and processes. You will mentor and coach SRE team members and engineers in adjacent teams to promote a culture of SRE learning and growth.

  • Drive the adoption of new technologies, system designs and best practices in code health, testing, observability, and service maintainability across teams.

  • Proactively identify and resolve potential performance and scalability bottlenecks in our front-end and back-end systems and underlying infrastructure.

  • Analyze the performance of SQL queries, suggest improvements and build guardrails for teams.

Qualifications:

  • Demonstrable experience building and operating large-scale, highly available distributed systems, along with advanced analytical skills to anticipate and mitigate complex system behaviors and incidents before they impact our customers.

  • Solid prior experience with server-side web development (e.g., in Java / Scala, Ruby, Python, Golang, Node.js) and Infrastructure-as-Code (e.g., Terraform).

  • Experience guiding the architectural direction and scalability considerations for new projects.

  • Strong understanding and proactive management of security practices related to SRE, coordinating with our Security team to fortify infrastructure.

  • Extensive experience working with major cloud providers, observability tooling, and RDBMS technologies.

  • Experience driving alignment on decisions with cross-team impact, identifying misalignment across the team, and bringing stakeholders together to realign.

  • Strong customer focus and ability to make technical decisions that tie back to business goals.

  • Exceptional communication skills, a positive attitude, and a high degree of empathy.

Pay:

Target pay ranges based on Geographic Zones* for Level P5:

  • Zone 1: San Francisco/Bay Area or New York City Metropolitan Area (if not a Bay Area-specific role): $200,000 - $260,000**

  • Zone 2: Boston, DC, Irvine, LA, Monterey, Santa Barbara, Santa Rosa, Seattle: $180,000 - $235,000**

  • Zone 3: All other US locations: $170,000 - $220,000**

*Restricted Stock Units (RSUs), health, vision, and dental insurance, and mental health benefits in addition to salary.

LaunchDarkly operates from a place of high trust and transparency; we are happy to state the pay range for our open roles to best align with your needs. Exact compensation may vary based on skills, experience, degree level, and location.

Required profile

Experience

Industry:
Human Resources, Staffing & Recruiting
Spoken language(s):
English

Other Skills

  • Analytical Skills
  • Empathy
  • Communication
