
Blockchain Head of Site Reliability Engineering / Head of Infrastructure

Remote: Full Remote
Experience: Expert & Leadership (>10 years)

Offer summary

Qualifications:

  • 10+ years in SRE or infrastructure engineering, 5+ in leadership
  • Experience managing large-scale cloud systems (AWS, GCP, Azure)
  • Strong skills in automation (Terraform, Ansible) and scripting (Python, Bash)
  • Expertise in Docker, Kubernetes, and network infrastructure
  • Strong knowledge of CI/CD pipelines, incident management, and security practices

Key responsibilities:

  • Lead infrastructure design, ensuring high availability and scalability
  • Build and mentor a global SRE team with 24/7 support
  • Develop SLAs for uptime and performance, focusing on automation
  • Implement strategies for monitoring, incident response, and rapid recovery
  • Collaborate with engineering teams on scalable architecture and processes
CryptoRecruit Human Resources, Staffing & Recruiting SME https://www.cryptorecruit.com/
11 - 50 Employees

Job description

We are looking for a Head of SRE to lead the design and management of a distributed infrastructure project. This role involves building a system from scratch, overseeing deployment, maintenance, and uptime, while growing a global SRE team. The candidate will focus on scaling infrastructure, automation, and performance optimization across regions like APAC and LATAM, fostering a culture of improvement and excellence.

Responsibilities

  • Lead infrastructure design, ensuring high availability and scalability
  • Build and mentor a global SRE team with 24/7 support
  • Develop SLAs for uptime and performance, focusing on automation
  • Implement strategies for monitoring, incident response, and rapid recovery
  • Collaborate with engineering teams on scalable architecture and processes
  • Oversee security best practices and compliance
  • Manage tools for infrastructure automation and incident management
  • Ensure cost-effective vendor management and comprehensive documentation

Qualifications:

  • 10+ years in SRE or infrastructure engineering, 5+ in leadership
  • Experience managing large-scale cloud systems (AWS, GCP, Azure)
  • Strong skills in automation (Terraform, Ansible) and scripting (Python, Bash)
  • Expertise in Docker, Kubernetes, and network infrastructure
  • Proven ability to meet SLAs and manage global teams
  • Strong knowledge of CI/CD pipelines, incident management, and security practices
  • Leadership, communication, and project management skills

Bonus Skills:

  • Experience with decentralized or distributed systems
  • Familiarity with observability tools (OpenTelemetry, Jaeger)
  • Multi-cloud and hybrid cloud knowledge
  • AWS, GCP, or Azure certifications
  • Understanding of security frameworks (SOC 2, ISO 27001) and agile environments

Required profile

Experience

Level of experience: Expert & Leadership (>10 years)
Industry: Human Resources, Staffing & Recruiting
Spoken language(s): English

Other Skills

  • Communication
  • Leadership
  • Collaboration
  • Mentorship
