Cloud Games Site Reliability Engineer L5 - Open Connect

Remote: Full Remote

Offer summary

Qualifications:

  • 5+ years of Service Reliability/Operational experience with large-scale systems.
  • Proficient in Docker and managing bare metal servers.
  • Strong programming skills in Python, Go, Bash, or Shell.
  • Knowledge of networking concepts and application protocols like TCP/UDP/IP, BGP, and HTTP/S.

Key responsibilities:

  • Drive improvements in resilience and quality of experience for the Cloud Gaming platform.
  • Deploy new software and hardware stacks into production.
  • Develop software solutions to address observability and reliability gaps in the game lifecycle.
  • Participate in on-call rotation and manage escalations for production issues.

Netflix Leisure & Entertainment XLarge https://jobs.netflix.com
10001 Employees

Job description

Netflix is one of the world's leading entertainment services, with over 300 million paid memberships in over 190 countries enjoying TV series, films and games across a wide variety of genres and languages. Members can play, pause and resume watching as much as they want, anytime, anywhere, and can change their plans at any time.

The Netflix Open Connect Content Delivery Network is our in-house, custom-built network and server infrastructure responsible for streaming all of your favorite movies and series. We strive to deliver a great Netflix viewing experience in over 190 countries so our customers can watch whatever they want, whenever they want, interruption-free.

We are seeking seasoned Reliability Engineers with extensive experience in Linux, networking, data analysis, bare metal servers, Docker, and large-scale service operations to join us on our next big objective: reinventing video games globally.

Bring your passion for games and systems reliability by joining Open Connect’s Cloud Gaming Site Reliability Engineering team to design, scale, operate, automate, and analyze our globally distributed Cloud Gaming infrastructure. Come join us and play a meaningful role in our journey to entertain the world!

Spotlight on Reliability Engineering:
The Open Connect Reliability Engineering team is responsible for the end-to-end operations, availability, reliability, scalability, and quality of experience delivered from Netflix’s Open Connect Gaming platform. We work across the Netflix Engineering organization and with external partners to improve the design and operation of our services, making them more scalable, reliable, efficient, and secure.

Responsibilities:

  • Drive continual improvement in resilience, quality of experience, monitoring, instrumentation, and automation, with the primary goal of maintaining a highly scalable and reliable Cloud Gaming platform worldwide

  • Deploy new software and hardware stacks into production and strengthen our canary and operational posture for new deployments

  • Develop software solutions that close observability and reliability gaps across the game lifecycle, from development and deployment through sustaining and sunsetting of Cloud Games

  • Participate in on-call rotation and handle escalations for production issues

Qualifications:

  • 5+ years of Service Reliability/Operational experience running large-scale, high-performance systems and internet services with a focus on performance and reliability

  • Experience with Docker and managing bare metal servers at scale

  • Experience in managing and debugging Unix/Linux systems (engineering fundamentals, networking, storage, operating systems) at scale

  • Strong programming/scripting capabilities in Python, Go, Bash, or Shell

  • Knowledge of networking concepts and application protocols, especially TCP/UDP/IP, BGP, HTTP/S, TURN, and DNS

  • Experience with debugging hardware/software issues for custom hardware 

  • Ability to work in a highly collaborative environment and to communicate effectively with internal and external partners

Does this sound interesting? Or does this sound interesting but intimidating? Please don’t self-select out, let’s figure it out together. Come join us and play a meaningful role in our journey to entertain the world! We’d love to talk to you!

Netflix is a global company with a diverse member base, which is why the content we produce reflects that: global perspectives and global stories. As we grow globally, we must have the most talented employees with diverse backgrounds, cultures, perspectives, and experiences to support our innovation and creativity. We are an equal-opportunity employer and strive to build balanced teams from all walks of life.

Our culture is unique, and we tend to live by our values, so it’s worth learning more about Netflix.

At Netflix, we carefully consider a wide range of compensation factors to determine your personal top of market. We rely on market indicators to determine compensation and consider your specific job, skills, and experience to get it right. These considerations can cause your compensation to vary and will also be dependent on your location. The overall market range for roles in this area of Netflix is typically $100,000 - $720,000. This market range is based on total compensation (vs. only base salary), which is in line with our compensation philosophy.

Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates. If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner.

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

Job is open for no less than 7 days and will be removed when the position is filled.

Required profile

Experience

Industry: Leisure & Entertainment
Spoken language(s): English

Other Skills

  • Collaboration
  • Communication
