Senior Platform Reliability Engineer

Remote: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Bachelor's degree in Computer Science or related field
  • Proven experience as a Site Reliability Engineer
  • Extensive experience with AWS services
  • Strong proficiency in Terraform
  • Proficient in Python or Bash scripting

Key responsibilities:

  • Collaborate to design, build, and maintain cloud infrastructure
  • Develop monitoring solutions and manage incidents
  • Optimize system performance, reliability, and costs
  • Implement automation tools and CI/CD pipelines
  • Ensure security compliance and documentation
Luupli
2 - 10 Employees

Job description

Your missions

Job Title: Site Reliability Platform Engineer

About Luupli

Luupli is a social media app that has equity, diversity, and equality at its heart. We believe that social media can be a force for good, and we are committed to creating a platform that maximizes the value that creators and businesses can gain from it, while making a positive impact on society and the planet. Our team is made up of passionate and dedicated individuals who are committed to making Luupli a success.

Role Description

We are seeking a talented and experienced Site Reliability Engineer (SRE) to join our team. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure and services, primarily hosted on AWS. If you have a passion for problem-solving, a deep understanding of AWS services, hands-on experience with Terraform, and proficiency in scripting with Python or Bash, we invite you to apply for this exciting opportunity.

Role And Responsibilities

  • Infrastructure Design and Automation:
      • Collaborate with software engineering and operations teams to design, build, and maintain cloud-based infrastructure using AWS and Terraform.
      • Implement and enhance infrastructure-as-code (IaC) practices using Terraform to ensure reproducibility and scalability of infrastructure components.
  • Monitoring and Incident Management:
      • Develop and maintain monitoring solutions to proactively identify performance bottlenecks, system outages, and other potential issues.
      • Participate in incident response and root cause analysis efforts to drive continuous improvement and prevent future incidents.
  • Reliability and Performance Optimization:
      • Optimise system performance, reliability, and cost efficiency through continuous monitoring, performance tuning, and capacity planning.
      • Identify opportunities to automate manual processes and improve system resilience.
  • Scripting and Automation:
      • Utilise Python or Bash scripting to create and maintain automation tools for various operational tasks and deployments.
      • Implement and improve continuous integration and continuous deployment (CI/CD) pipelines.
  • Security and Compliance:
      • Collaborate with security teams to implement best practices for securing cloud infrastructure and services.
      • Ensure compliance with relevant industry standards and regulations.
  • Deployment and Release Management:
      • Support CI/CD pipelines for application deployments and updates.
      • Contribute to the design and implementation of deployment strategies that promote zero-downtime releases.
  • Documentation and Knowledge Sharing:
      • Maintain clear and up-to-date documentation for infrastructure configurations, processes, and incident resolution procedures.
      • Participate in knowledge sharing with team members to enhance overall expertise and skill sets.

Requirements

  • Education and Experience:
      • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
      • Proven experience as a Site Reliability Engineer or similar role.
  • Technical Skills:
      • Extensive experience with Amazon Web Services (AWS) and its core services (EC2, S3, RDS, IAM, etc.).
      • Strong proficiency in infrastructure-as-code (IaC) tools, with a focus on Terraform.
      • Proficient in scripting with Python or Bash for automation and operational tasks.
      • Solid understanding of networking principles and protocols.
      • Knowledge of CI/CD pipelines and related tools.
  • Problem-Solving and Analytical Abilities:
      • Ability to diagnose and resolve complex technical issues in a fast-paced environment.
      • Analytical mindset to proactively identify potential system weaknesses and performance bottlenecks.
  • Collaboration and Communication:
      • Strong teamwork and collaboration skills to work effectively with cross-functional teams.
      • Excellent verbal and written communication skills.

Compensation

This is an equity-only position, offering a unique opportunity to gain a stake in a rapidly growing company and contribute directly to its success.

Required profile

Experience

Level of experience: Senior (5-10 years)

Soft Skills

  • Communication
  • Analytical Thinking
  • Collaboration
  • Problem Solving
