Site Reliability Engineering (SRE) Manager

79% Flex
EXTRA HOLIDAYS - FULLY FLEXIBLE
Remote: Full Remote
Salary: 164 - 259K yearly
Experience: Senior (5-10 years)
Work from: 

Offer summary

Qualifications:

  • 8+ years of industry experience in SRE
  • Proficiency in managing distributed web infrastructures
  • Bachelor’s degree in Computer Science or a related field

Key responsibilities:

  • Manage and mentor a team of Automation SREs
  • Own technical decisions for the team
  • Coordinate follow-the-sun support across global time zones
  • Lead initiatives for critical infrastructure components
  • Oversee release processes and root cause analysis
NVIDIA XLarge https://www.nvidia.com/
10001 Employees

Job description

Your missions

We are seeking a seasoned Site Reliability Engineering (SRE) Manager to lead a team of SRE staff supporting a Network Automation team in a follow-the-sun support model. The team manages critical applications and infrastructure, both in the cloud and on-premises, for datacenter deployment and automated operations. This role is pivotal not only in maintaining resilient, scalable systems but also in owning the general architecture of distributed systems, including a new DNS architecture and a distributed source-of-truth sync.

This role goes beyond an understanding of standard best practices and operations. We’re combining technical knowledge with development chops to come up with solutions at cloud scale. This is a new team operating in an exciting, groundbreaking environment, and we want you to help us shape it.

What you will be doing:
  • Team Leadership: Manage and mentor a team of Automation SREs, fostering a culture of collaboration, innovation, and excellence in execution.
  • Technical Guidance: Own technical decisions for the team, ensuring alignment with developers and employing industry-standard methodologies.
  • Operational Excellence: Implement and maintain robust operational practices, including incident management, monitoring, alerting, and capacity planning.
  • Shift Scheduling: Coordinate follow-the-sun support across global time zones, ensuring 24/7 coverage and efficient handovers.
  • Project Management: Lead initiatives related to the design, deployment, and maintenance of critical infrastructure components.
  • Release Management: Oversee release processes and ensure smooth deployments, minimizing downtime and impact on users.
  • Root Cause Analysis: Conduct thorough post-incident reviews, identifying root causes and implementing preventive measures.

What we need to see:
  • Experience: 8+ years of industry experience focused on Site Reliability Engineering, with a strong background in cloud service providers, ISPs, or similar service-oriented networking companies.
  • Technical Skills: Proficiency in managing distributed web infrastructures, designing scalable and resilient systems, and implementing network automation.
  • Leadership: Proven track record of managing technical teams, including performance management, career development, and hiring (2+ years of management experience).
  • Problem Solving: Demonstrated ability to conduct detailed root cause analysis and drive improvements based on findings.
  • Communication: Excellent verbal and written communication skills, with experience presenting technical information to diverse audiences.
  • Education: Bachelor’s degree in Computer Science, Engineering, or a related technical field, or relevant industry experience.

If you are a strategic problem solver with a passion for leading high-performance teams in a dynamic and technically challenging environment, we encourage you to apply. Join us in shaping the future of our distributed systems and network automation infrastructure.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you are creative and autonomous, we want to hear from you!

The base salary range is 164,000 USD - 258,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s): English
