Senior Cloud Site Reliability Engineer

unlimited holidays - extra holidays - extra parental leave - long remote period allowed
Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

  • Prior experience with Kubernetes, AWS, and working as a Site Reliability Engineer
  • Strong programming skills in Go, Python, Java, Ruby, or Bash
  • Familiarity with monitoring metrics, alert tooling, and DevOps tools like Jenkins, Ansible, and Chef
  • Deep understanding of Agile, Continuous Delivery, SDLC, and the Linux command line

Key responsibilities:

  • Enhance operational reliability, efficiency, and automation through process improvements
  • Collaborate with engineering teams to address outages and enhance site optimization
  • Contribute to the development of Groupon's dev tools for high-scale microservices in the cloud
  • Write quality code, perform system-level analysis, and participate in on-call rotation
Groupon Information Technology & Services Large https://www.grouponcareers.com/
1001 - 5000 Employees

Job description

Become a better engineer by building things that help local businesses around the world thrive

You became an engineer because you believed in technology’s ability to make a difference in the world. So why would you spend your days building things that don’t matter? At Groupon, we spend our days developing tools, platforms and experiences that help small businesses thrive in their local communities. We may look like an ordinary e-commerce app, but under the surface we’re using cutting-edge technology to build products that regularly make a positive impact on the lives of 48M people and 100,000 merchants. Of course, local merchants aren’t the only ones who will benefit from your work; you will too, as will the engineering teams that become our customers.

We are looking for great software engineers who are excited to help us build out Groupon’s Site Reliability Engineering teams. We use technologies like Kubernetes and AWS EKS. We build our automation and tooling using common languages like Go, Python, Ruby, and shell. We need our engineers to have a passion for growth and learning, to be excited to use these technologies and tools, and to be ready to develop and evolve the techniques and procedures that will ensure site reliability.

We want you to be part of the team that delivers the next-generation site reliability platform, automation, and toolset for Groupon Engineering. We think you’ll agree that it’s an exciting challenge and a really great team to be part of.

We are providing Groupon’s microservice engineering teams with a solid underpinning of tools and practices in the areas of reliability, monitoring, alerting, and automation. One measure of success is that engineering teams can focus more on delivering new features than on thinking about how to get those features into production. Another measure of success in this role is a reduction in the time teams spend bringing systems back online after issues and getting to root causes, which frees them to focus on higher-level tasks around site optimization and new features.

You will:

  • Identify process gaps and implement process improvements to increase operational reliability and cloud cost efficiency

  • Drive standardization efforts across the services, infrastructure, systems and practices

  • Improve operational efficiency through automation

  • Develop effective alerts and tooling to quickly identify and address reliability risks

  • Design and develop Groupon’s dev tools to support continuous delivery in a high-scale microservices environment in the cloud

  • Engage with engineering teams to triage outages and carry forward action items to improve reliability

  • Participate in the on-call rotation to support other teams’ first responders

  • Write high-quality code using SOLID principles, including unit and integration tests. The languages we like to use are Go, Python, Java, Ruby, and Bash.

  • Promote and foster an open source/inner source culture at Groupon

We're excited about you if you have:

  • Good knowledge of, or strong interest in, at least two of the following: Kubernetes, AWS, PaaS, IaaS

  • Working knowledge of monitoring metrics and alert tooling

  • A keen interest in tool and platform development for dev teams

  • Good knowledge of toolchains such as Jenkins, Ansible, Maven, Chef, Salt, Puppet, ELK, etc.

  • Excellent programming skills using one of the following: Go, Python, Java, Ruby 

  • Deep understanding of Agile and Continuous Delivery concepts and tools

  • Knowledge of the Linux command line and system-level analysis

  • Strong knowledge of all aspects of the Software Development Life Cycle (SDLC)

Preferred:

  • AWS Certified Solutions Architect Associate or AWS Certified DevOps Engineer Associate

  • Experience managing production Kubernetes clusters

  • Learned Kubernetes through diagnosing failures and high-scale challenges

  • 3+ years of relevant experience, including 2+ years as a Site Reliability Engineer

We value engineers who are:

  • Customer-focused: We believe that doing what’s right for the customer is ultimately what will drive our business forward.

  • Concerned with quality: You have standards, do things the right way, and avoid repetition.

  • Team players: You believe that more can be achieved together. You listen to feedback and also provide supportive feedback to help others grow/improve.

  • Mindful: You maintain a healthy work-life balance and encourage others to do the same.

  • Pragmatic: We do things quickly to learn what our customers desire. You know when it’s appropriate to take shortcuts that don’t sacrifice quality or maintainability.

  • Learners: You have a growth mindset, relish challenges, and grow and learn from them.

  • Owners: Engineers at Groupon know how to positively impact the business.

Groupon’s purpose is to build strong communities through thriving small businesses. To learn more about the world’s largest local ecommerce marketplace, click here for the latest Groupon news. Plus, be sure to check out the values that shape our culture, guide our strategy and make our company a great place to work. And don’t just take our word for it. Hear from real Groupon team members and learn more about our inclusive employee groups. If all of this sounds like a great fit for you, then click apply and let’s see where this takes us.

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Industry: Information Technology & Services
Spoken language(s): English
