Cloud Reliability Engineer (CRE)

Remote: Full Remote

Offer summary

Qualifications:

  • Experience in Cloud Operations & Automation (AWS, Azure, GCP)
  • Proficiency in Infrastructure as Code (IaC) tools like Terraform and Ansible
  • Strong expertise in observability tools such as Prometheus and Grafana
  • Proficiency in scripting languages like Python and Bash

Key responsibilities:

  • Develop and implement automation scripts to streamline cloud operations.
  • Design and enhance observability frameworks for proactive issue detection.
  • Improve cloud infrastructure reliability through performance tuning and automated remediation.
  • Collaborate with development and operations teams to enforce best practices for cloud reliability.

Awign (Information Technology & Services, Scaleup): https://www.awign.com/
201 - 500 Employees

Job description

This is a remote position.

About Awign Expert:

Awign Expert is an enterprise-focused platform that helps businesses hire, assess, and manage highly skilled resources for gig-based projects. We give our Experts a gateway to work with large-scale enterprises and build a freelance or consulting career. We are a newly launched business division of Awign, one of the pioneers and currently the largest player in the gig economy in India. At Awign, we are changing how the world works, with a vision to uplift millions of careers.


About the client:

This company is a leading enterprise mobile app development firm, specializing in delivering highly efficient, secure, and scalable applications to a global audience. They offer end-to-end design and development services, collaborating closely with clients to build scalable, user-centric, and innovative solutions. Their skilled designers and developers create engaging user experiences while leveraging cutting-edge technologies to ensure seamless functionality.


Job Title: Cloud Reliability Engineer (CRE)   


Location: Offshore 


Job Description: 


We are seeking Cloud Reliability Engineers (CREs) to support Carnival Cruise Line's cloud infrastructure. The ideal candidates will focus on automating cloud operations, improving system reliability, and ensuring seamless observability and monitoring across the Carnival Cruise Line environment.


The CRE team will be responsible for designing, implementing, and maintaining automation frameworks, monitoring systems, and log-mining solutions to enhance cloud operations. The role will also involve provisioning, fault management (FM), and optimizing cloud infrastructure for high availability and performance. 


Key Responsibilities: 

  • Automation & Cloud Operations: Develop and implement automation scripts and tools to streamline cloud operations and provisioning. 

  • Observability & Monitoring: Design and enhance observability frameworks, including real-time monitoring, log mining, and alerting systems for proactive issue detection. 

  • Infrastructure Reliability: Improve cloud infrastructure reliability through performance tuning, capacity planning, and automated remediation strategies. 

  • Fault Management (FM): Implement fault management processes to detect, diagnose, and resolve cloud infrastructure issues efficiently. 

  • Data Farms & Log Analysis: Leverage data analytics and log mining techniques to gain insights into system performance and troubleshoot anomalies. 

  • Provisioning & Deployment: Automate cloud provisioning using infrastructure-as-code (IaC) practices for efficient deployment across Carnival Cruise Line's brands.

  • Collaboration & Best Practices: Work closely with development, security, and operations teams to enforce best practices for cloud reliability and scalability. 


Required Skills & Experience: 

  • Experience in Cloud Operations & Automation (AWS, Azure and GCP) 

  • Proficiency in Infrastructure as Code (IaC) (Terraform, AWS CloudFormation, Azure Resource Manager, Ansible, Chef, Puppet)

  • Strong expertise in observability tools (Prometheus, Grafana, ELK Stack, Splunk, or Datadog) 

  • Log Mining & Data Analytics (Kibana, Splunk, or BigQuery) 

  • Fault Management & Incident Response experience in cloud environments 

  • Experience with containerized environments (Docker, Kubernetes) 

  • Proficiency in scripting & automation (Python, Bash, PowerShell) 

  • Understanding of cloud security, networking, and cost optimization 



Preferred Qualifications: 

  • Certifications in Cloud Technologies (AWS Certified DevOps Engineer - Professional, Microsoft Certified: DevOps Engineer Expert, Google Cloud Professional Cloud DevOps Engineer)

  • Experience in hybrid cloud environments (on-prem & cloud integration) 

  • Hands-on experience with Site Reliability Engineering (SRE) practices 

  • Experience in managing large-scale cloud infrastructure for enterprises 



Please provide the following details with each candidate profile:
  • Candidate Name
  • Total Experience
  • Exp in CRE
  • Exp in AWS
  • Exp in Azure
  • Exp in GCP
  • Exp in Terraform
  • Exp in Splunk
  • Exp in Docker
  • Exp in Kubernetes
  • Exp in Python

Required profile

Industry: Information Technology & Services
Spoken language(s): English