Site Reliability Engineer Leader at Infotree Global Solutions

Remote: Full Remote

Offer summary

Qualifications:

Information Technology degree or similar, hands-on experience with AWS cloud, knowledge of CI/CD automation tools and of scripting cloud workloads, familiarity with monitoring tools.

Key responsibilities:

  • Supervise SRE team and report performance metrics
  • Proactively respond to failures and build failover workflows
  • Implement observability and monitoring stack for infrastructure layers
  • Improve high availability and scalability of solutions
  • Manage application downtime and design backup strategies
Infotree Global Solutions Large https://www.infotreeglobal.com/
1001 - 5000 Employees

Job description

Product: Global Platform Engineering.

Your role:

• Supervise a team of Site Reliability Engineers
• Report metrics on application performance and incidents
• Act proactively to prevent infrastructure and application failures, and respond quickly when they occur
• Build and automate failover and recovery workflows
• Implement observability and monitoring stack for infrastructure and application layers
• Improve high availability and scalability for existing solutions
• Manage application downtime by defining and measuring SLAs and Error Budgets
• Design backup and recovery strategies

Your background:

• You have an Information Technology degree or similar
• You have hands-on experience with AWS cloud
• You know CI/CD automation tools (Jenkins, GitHub, or similar)
• You know how to automate and script cloud workloads with Infrastructure as Code (IaC) and Configuration as Code (CaC) techniques (Terraform, CloudFormation, Ansible, Helm)
• You know monitoring tools (Datadog, Prometheus, Grafana, Splunk, or similar)

Required profile

Spoken language(s):
Check out the description to know which languages are mandatory.

Other Skills

  • Proactivity
  • Leadership
