
Sr Site Reliability Engineer

extra holidays - extra parental leave
Remote: Full Remote
Salary: $124K - $155K yearly
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Minimum of 5 years of SRE experience
  • Hands-on experience with AWS technologies
  • Knowledge of programming with Python and Golang
  • Experience with Ansible and automation tools
  • Familiarity with security and DevSecOps platforms

Key responsibilities:

  • Design, deploy, and manage automation tools
  • Collaborate with product teams for system optimization
  • Ensure application performance and reliability
  • Act as primary contact during major incidents
  • Mentor engineers and support their growth
McGraw Hill Edtech: Education + Technology Large https://www.mheducation.com/
1001 - 5000 Employees

Job description

Overview

Impact the Moment

At McGraw Hill we create best-in-class, next-generation learning platforms that are used by millions of students and educators worldwide every day. We design intuitive and effective tools and experiences that maximize teachers’ time and students’ learning. And we do all of this in a supportive and collaborative environment where we work alongside brilliant colleagues, touch lives around the world, see the difference our hard work makes, and continue our paths of lifelong learning.

Your impact on the team

As a Sr Site Reliability Engineer at McGraw Hill, you will play a crucial role in designing and maintaining high-capacity systems that ensure the reliability, performance, and security of our customer platforms. You will collaborate with product teams within a DevOps framework to implement automation tools and processes that enhance predictability, accelerate time-to-market, and optimize costs. Your efforts will directly contribute to operational excellence and help advance our mission to deliver exceptional, reliable services.

This is a remote position open to applicants authorized to work for any employer within the United States.

What You’ll Do

Cloud Engineering

  • Design, deploy, and manage automation tools in a DevOps model to enhance predictability, accelerate time-to-market, and ensure repeatability, traceability, and transparency of infrastructure automation (infrastructure-as-code, monitoring-as-code).
  • Collaborate with product development teams to optimize systems for reliability and performance, while managing AWS costs and using optimization tools to maximize ROI and meet Service Level Objectives.
  • Continuously learn and stay updated on the AWS ecosystem through participation in game day scenarios, professional conferences, and other development opportunities.

Observability Engineering

  • Own the reliability, uptime, system security, cost, capacity, resiliency, and performance of applications and platforms, while leading data-driven initiatives to enhance stability and improve service levels.
  • Ensure that the architecture and deployment models are adequately designed to meet SLA commitments.
  • Act as the primary contact during major incidents, resolving issues and managing on-call alarms.
  • Maintain and enhance telemetry systems to improve visibility into application performance and business metrics, ensuring operational workloads are effectively managed.

DevSecOps

  • Support healthy software development practices, including complying with agile software development methodology and building standards for code reviews, work packaging, and continuous delivery.
  • Partner with CyberSecurity to develop plans and automation to respond to new risks and vulnerabilities.

Resiliency Engineering

  • Collaborate with development teams to identify system failure points and blast radius, validate monitoring and observability configurations, coordinate failure injection testing, and document steady-state production levels and growth patterns.
  • Plan and forecast for seasonal growth, communicate trends with leadership, and enhance infrastructure scaling plans to handle 2x the anticipated load, while coordinating improvements to software and infrastructure to meet resiliency goals.
  • Mentor and nurture engineers across varying levels of experience; foster growth by setting high-reaching goals and providing support to achieve them.

About You

  • Minimum of 5 years of applicable Site Reliability Engineering (SRE) experience.
  • Hands-on experience with the following technologies is required:
    • Cloud and Infrastructure as Code: AWS (CloudFront, S3, EC2, ECS, SES, SQS, SNS, Load Balancing, VPC, Config, Systems Manager, Lambda, API Gateway, DB services) and Terraform
    • Programming and Containerization: Python, Golang, Bash, Ansible, and AWS ECS
    • Security and web platforms: Rapid7, WAF, Apache httpd, Apache Tomcat, Angular
    • Config Management and provisioning: Ansible, Packer
    • Telemetry: New Relic, CloudWatch, Datadog
    • DevSecOps: Artifactory, Jenkins, CircleCI, SonarQube, JFrog Xray, Control Tower, GitHub
  • Experience with automation tools and software development is a bonus.

Why McGraw Hill?

There has never been a better time to join McGraw Hill. In our culture of curiosity and innovation, you will be able to own your growth and develop as we do.

The pay range for this position is $124,350 to $155,000 annually; however, the base pay offered may vary depending on job-related knowledge, skills, experience, and location. An annual bonus plan may be provided as part of the compensation package, in addition to a full range of medical and/or other benefits, depending on the position offered.

McGraw Hill recruiters always contact candidates from a “@mheducation.com” email address and/or through our Applicant Tracking System, iCIMS. Any variation of this email domain should be considered suspicious. Additionally, McGraw Hill recruiters and authorized representatives will never request sensitive information by email.

47819

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Edtech: Education + Technology
Spoken language(s): English

Other Skills

  • Problem Solving
  • Collaboration
  • Mentorship
