Site Reliability Engineer

Remote:

Full Remote

Contract:

Full time

Experience:

Mid-level (2-5 years)

Work from:

Germany

Offer summary

Qualifications:

2+ years of experience in Site Reliability Engineering, good understanding of data stores, experience with Kubernetes and Docker, familiarity with standard Ops tools such as Debian, fluent in English.

Key responsibilities:

Join Zattoo's Ops/SRE team
Analyze and scale core services
Improve platforms, infrastructure, and tools
Support middleware team with operations
Increase automation in monitoring and profiling

Club GLOBALS

11 - 50 Employees

Job description

The Role

At Zattoo we are building the TV platform of the future. To make that possible, we are seeking a Site Reliability Engineer (f/m/x) to join our Operations team. As the demand for unicast TV delivery is constantly growing, we are scaling out our custom-built delivery infrastructure to serve linear and non-linear video data on a multi-Tbps scale. Because we control the whole chain, from ingest through encoding/transcoding to packaging and delivery, there are many exciting areas to work on and to push TV to a new level.

You will play a key role in optimizing our systems architecture, monitoring and alerting. You will work closely with the core video, core middleware and SRE/Ops teams to ensure maximum quality of service for our customers. If you have a strong interest in monitoring, scaling out and optimizing complex distributed systems, you can have a huge impact on site performance and network optimization at Zattoo.

What you’ll do

Become a valued member of Zattoo’s Ops/SRE team and improve our core services
Analyse and understand Zattoo’s core services and explore ways to scale them efficiently
Propose improvements to platforms, infrastructure, tools and processes
Work closely with the middleware team and support them with the operational setup for their projects
Collaborate with, support and advise engineers so that they write code that performs well and scales
Advocate security and stability, raise awareness for weak spots and develop plans to mitigate them
Increase optimization and automation of our setup in various areas such as monitoring, alerting, performance and profiling

What you’ll bring

2+ years of proven experience in a Site Reliability Engineering or Ops position, ideally operating a complex web-based service
Very good understanding of data stores (Elasticsearch, Cassandra, Redis, Memcached, MySQL, database replication/failover)
Experience in container management (Kubernetes, Docker or LXC), with or without cloud providers such as AWS or Google Cloud, both in the cloud and on-premise
Experience working with standard Ops tools; we use Debian, Nginx, Puppet and Jenkins
Comfortable working with remote colleagues, multidisciplinary teams and external partners
Fluent verbal and written English language skills

Bonus:

A BS/MS degree in computer science or similar discipline
Basic understanding of programming languages such as Bash, Python, Ruby and/or Go-based web frameworks
Experience in monitoring, preferably hands-on experience with Prometheus or similar technologies

Here’s what we offer you

Highly motivated, growing, diverse team made up of 33 different nationalities
Flexible work schedules and options to work from home
Flat hierarchies and open door policy
Competitive compensation and vacation entitlement
Social events ranging from company lunches and after-work drinks to annual off-site company events
Free yoga classes, German language classes and health checks