Site Reliability Engineer

extra holidays
Work set-up: Full Remote
Contract: 
Experience: Mid-level (2-5 years)
Work from: 

Offer summary

Qualifications:

  • Strong understanding of Linux internals and OS fundamentals.
  • Proficiency in container orchestration tools like Kubernetes and Docker.
  • Experience with configuration management tools such as Ansible or Puppet.
  • Knowledge of scripting languages like Python or Golang.

Key responsibilities:

  • Manage distributed infrastructure across multiple data centers.
  • Ensure service level agreements (SLAs) are met and perform capacity planning.
  • Implement automation and scripting to optimize operations.
  • Monitor and troubleshoot system performance and security issues.

Newfold Digital Large https://newfold.com
1001 - 5000 Employees

Job description

Who we are.

Newfold Digital is a leading web technology company serving millions of customers globally. Our customers know us through our robust portfolio of brands. We have some of the industry's most prominent and storied go-to-market brands, including Bluehost, HostGator, Domain.com, Network Solutions, Register.com and Web.com. We help customers of all sizes build a digital presence that delivers results. With our extensive product offerings and personalized support, we take pride in collaborating with our customers to serve their online presence needs. The strength of our company lives in the intersection of our people, our customer, and our brands.

We are looking for a Site Reliability Engineer – Linux who approaches their work with passion, a hunger for learning and growth, and a steadfast commitment to delivering outstanding results. If you're a team player with a positive mindset, keen to make a meaningful impact, we encourage you to reach out to us!

What you'll do and how you'll make your mark:

  • Manage distributed infrastructure with open-source technologies across multiple data centers.

  • Ensure product SLAs are met, perform capacity planning, and address critical issues as part of a 24/7 on-call rotation.

  • Explore and implement innovative platform-as-a-service solutions to support the technical SRE teams and improve their efficiency.

  • Utilize data and metrics for decision-making, focusing on security and best practices.

  • Prioritize robust automation and scripting to reduce dependence on manual procedures.

Who you are & what you'll need to succeed.

  • Strong understanding of Linux internals, OS fundamentals, and core network principles.

  • Basic familiarity with relational databases (PostgreSQL, MySQL) and NoSQL databases (Redis, MongoDB).

  • Proficient in container orchestration tools like OpenShift, Kubernetes, Docker Swarm, or Apache Mesos.

  • Experienced in administering and troubleshooting configuration management tools such as Puppet, Ansible Tower (AWX), or Chef.

  • Hands-on experience in load balancer administration (HAProxy, Nginx, and F5).

  • Hands-on experience with caching technologies such as Redis, Nginx+, Varnish, or Memcached.

  • Skilled in monitoring and logging stacks such as Grafana, InfluxDB, Graphite, Prometheus, ELK, and Graylog.

  • Hands-on experience with web servers like Nginx, Apache, or Tomcat.

  • Skilled in at least one scripting language such as Python or Golang.

This Job Description includes the essential job functions required to perform the job described above, as well as additional duties and responsibilities. This Job Description is not an exhaustive list of all functions that the employee performing this job may be required to perform. The Company reserves the right to revise the Job Description at any time, and to require the employee to perform functions in addition to those listed above.

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Spoken language(s): English

Other Skills

  • Decision Making
  • Teamwork
