
Site Reliability Engineer

extra holidays - extra parental leave
Remote: Full Remote
Contract:
Experience: Mid-level (2-5 years)
Work from:

Offer summary

Qualifications:

  • Bachelor's degree required
  • At least 3 years in a technical role
  • Experience with Ansible, Puppet, or Chef preferred
  • Azure experience required
  • Strong understanding of system administration principles

Key responsibilities:

  • Ensure reliability and performance of .NET applications
  • Implement infrastructure automation tools using IaC
  • Monitor application health and resolve issues
  • Establish best practices for product architecture
  • Collaborate on disaster recovery plans
SGS Wholesale Large https://www.sgs.com/
10001 Employees

Job description

Company Description

We are SGS – the world's leading testing, inspection and certification company. We are recognized as the global benchmark for sustainability, quality and integrity. Our 99,600 employees operate a network of 2,600 offices and laboratories, working together to enable a better, safer and more interconnected world.

Job Description

The Site Reliability Engineer will play a critical part in ensuring the reliability, supportability, scalability, and performance of our .NET stack applications built with ASP.NET MVC, Angular, and Web API.

  • Partner with developers and product operations teams to understand application requirements and translate them into operational practices.
  • Design, implement, and maintain infrastructure automation tools using Infrastructure as Code (IaC) methodologies.
  • Monitor application health and performance metrics, proactively identifying and resolving potential issues.
  • Implement incident response procedures to ensure timely resolution of outages and service disruptions.
  • Establish and improve best practices for product solution design, architecture, and development.
  • Participate in peer and team code reviews, and develop comprehensive coding standards and guidelines to ensure consistency, maintainability, and quality in software development. Clear protocols for code formatting, naming conventions, error handling, testing, and documentation enhance code readability, reduce defects, and facilitate knowledge sharing among team members.
  • Collaborate with engineers to develop and implement disaster recovery plans.
  • Continuously improve monitoring and alerting processes to ensure efficient problem identification and resolution.
  • Stay up to date on the latest advancements in .NET infrastructure and SRE best practices.

Qualifications
  • Bachelor's degree required
  • At least 3 years of experience in a related technical role (e.g., Systems Administrator, Network Engineer) required
  • Experience with configuration management tools like Ansible, Puppet, or Chef preferred
  • Azure experience required
  • Familiarity with monitoring and alerting tools (.NET performance counters, Azure Application Insights, Prometheus, Grafana) is a plus
  • Ability to manage and coordinate multiple projects in a fast-paced, highly professional environment.
  • While coding proficiency is not required, a strong understanding of the .NET ecosystem and a desire to delve into infrastructure and automation will be essential for success.
  • Strong understanding of system administration principles, including operating systems (Windows Server preferred) and networking concepts.
  • Ability to work independently and as part of a team

Additional Information

SGS Canada is an equal opportunity employer, and we are committed to achieving greater accessibility by providing accommodation for people with disabilities during our hiring process. Accommodations are available on request for qualified candidates during each stage of the recruitment process.

Please note that candidates applying for Canadian job openings should be authorized to work in Canada.

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Industry: Wholesale
Spoken language(s): English

Other Skills

  • Collaboration
  • Problem Solving
