
Site Reliability Engineer

Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

  • 2+ years experience in Site Reliability Engineering
  • Good understanding of data stores
  • Experience with Kubernetes and Docker
  • Familiarity with Ops tools like Debian
  • Fluent in English

Key responsibilities:

  • Join Zattoo's Ops/SRE team
  • Analyze and scale core services
  • Improve platforms, infrastructure, and tools
  • Support middleware team with operations
  • Increase automation in monitoring and profiling
Club GLOBALS
11 - 50 Employees

Job description

The Role

At Zattoo we are building the TV platform of the future. To make that possible, we are seeking a Site Reliability Engineer (f/m/x) to join our Operations team. As the demand for unicast TV delivery is constantly growing, we are scaling out our custom-built delivery infrastructure to serve linear and non-linear video data on a multi-Tbps scale. Because we control the whole chain, from ingest through encoding/transcoding to packaging and delivery, there are many exciting areas to work on and to push TV to a new level.

You will play a key role in optimizing our systems architecture, monitoring and alerting. You will work closely with the core video, core middleware and SRE/Ops teams to ensure maximum quality of service for our customers. If you have a strong interest in monitoring, scaling out and optimizing complex distributed systems, you can have a huge impact on site performance and network optimization at Zattoo.

What you’ll do

  • Become a valued member of Zattoo’s Ops/SRE team and improve our core services
  • Analyse and understand Zattoo’s core services and explore ways to scale them efficiently
  • Propose improvements to platforms, infrastructure, tools and processes
  • Work closely with the middleware team and support them with the operational setup for their projects
  • Collaborate with, support and advise engineers so that they write code that performs well and scales
  • Advocate security and stability, raise awareness of weak spots and develop plans to mitigate them
  • Increase optimization and automation of our setup in various areas such as monitoring, alerting, performance and profiling

What you’ll bring

  • 2+ years proven experience in a Site Reliability Engineering or Ops position, ideally operating a complex web based service
  • Very good understanding of data stores (Elasticsearch, Cassandra, Redis, Memcached, MySQL, database replication/failover)
  • Experience in container management (Kubernetes, Docker or LXC), with or without cloud providers such as AWS or Google Cloud, both in the cloud and on-premises
  • Experience in working with standard Ops tools (we use Debian, Nginx, Puppet and Jenkins)
  • Comfortable working with remote colleagues, multidisciplinary teams and external partners
  • Fluent verbal and written English language skills

Bonus:

  • A BS/MS degree in computer science or similar discipline
  • Basic understanding of programming languages such as Bash, Python, Ruby and/or Go-based web frameworks
  • Experience in monitoring, preferably hands-on experience with Prometheus or similar technologies

Here’s what we offer you

  • Highly motivated, growing, diverse team made up of 33 different nationalities
  • Flexible work schedules and options to work from home
  • Flat hierarchies and open door policy
  • Competitive compensation and vacation entitlement
  • Social events ranging from company lunches and after-work drinks to annual off-site company events
  • Free yoga classes, German language classes and health checks

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Spoken language(s):
English

Other Skills

  • Problem Solving
  • Collaboration
  • Communication
  • Analytical Thinking
