Cloud Site Reliability Engineer (Cloud SRE)

Work set-up: 
Full Remote
Contract: 
Experience: 
Entry-level / graduate

Offer summary

Qualifications:

  • Bachelor’s degree in Computer Science, Information Systems, or related field.
  • Proficiency in English communication skills.
  • Experience with cloud observability tools and incident management.
  • Strong scripting skills in Python, Bash, or similar languages.

Key responsibilities:

  • Ensure the availability, performance, and scalability of cloud services.
  • Build and maintain automation for deployment, monitoring, and self-healing infrastructure.
  • Respond to incidents and participate in on-call rotations.
  • Collaborate with developers and support field IT staff to ensure reliable service delivery.

World Vision Non-profit Organization - Charity Large https://www.wvi.org/
10001 Employees

Job description

With 75 years of experience, our focus is on helping the most vulnerable children overcome poverty and experience fullness of life. We help children of all backgrounds, even in the most dangerous places, inspired by our Christian faith.

Come join our 33,000+ staff working in nearly 100 countries and share the joy of transforming vulnerable children’s life stories!

IMPORTANT INFORMATION:

  • All CVs should be submitted in English.

  • This position is open to candidates based in countries where World Vision International is legally registered to operate.

JOB PURPOSE:

The Cloud Site Reliability Engineer (SRE) ensures the availability, latency, performance, and scalability of cloud services. This role combines software engineering and operations expertise to create reliable systems and improve incident response and observability.

KEY RESPONSIBILITIES:

  • Preference will be given to candidates with experience in Terraform and Azure DevOps. Familiarity with GitHub Actions, Ansible, or scripting tools such as PowerShell or Python is highly desirable.

  • A security-first mindset is essential in all aspects of cloud infrastructure and operations.

  • Build and maintain automation for deployment, monitoring, and self-healing infrastructure.

  • Design service-level objectives (SLOs), indicators (SLIs), and monitoring dashboards.

  • Respond to incidents and participate in on-call rotations.

  • Continuously improve system reliability through chaos engineering, testing, and automation.

  • Collaborate with developers to design for operability and scalability.

  • Provide support, guidance, and collaboration to field office IT staff across all regions to ensure consistent service delivery and alignment with global standards.

  • Work collaboratively within Agile teams, embracing iterative delivery, continuous improvement, and adaptive planning as part of the organization's new ways of working.

KNOWLEDGE/QUALIFICATIONS FOR THE ROLE:

  • Bachelor’s degree in Computer Science, Information Systems, or a related field.

  • Demonstrated proficiency in written and verbal communication in English.

  • Knowledge of cloud observability tools and incident management practices.

  • Strong scripting or programming skills (e.g., Python, Bash, Go).

  • Problem-solving with an engineering mindset and focus on operational excellence.

  • Customer-centric approach, ensuring systems deliver consistent, high-quality experiences globally.

Applicant Types Accepted:

Required profile

Experience

Level of experience: Entry-level / graduate
Industry:
Non-profit Organization - Charity
Spoken language(s):
English

Other Skills

  • Collaboration
  • Communication
  • Problem Solving
