Site Reliability Engineer (SRE)

Requirements

  • 4-6 years of experience in Site Reliability Engineering, DevOps, or related roles
  • Experience with monitoring and observability tools (Prometheus, Grafana, ELK stack, or cloud-native tools)
  • Familiarity with cloud platforms (Azure, AWS, or GCP) and containerization/orchestration (Docker, Kubernetes)
  • Strong scripting/programming skills (Python, Go, Bash) and understanding of CI/CD pipelines

Roles & Responsibilities

  • Ensure high availability, reliability, and performance of applications and infrastructure
  • Define and monitor SLIs, SLOs, and SLAs to maintain service reliability
  • Implement automation to reduce manual operations and improve system efficiency
  • Lead incident management, root cause analysis (RCA), and post-mortem processes

Job description

At Talentus Global, we are looking for you!
 
We are a U.S. company with a strong presence in LATAM and across 20+ countries around the world. Our key near-shore BPO services include smart-sourcing, dedicated or cluster teams, managed IT services, software outsourcing, and top ERP & CRM solutions, driven by our practices across many industries, including Higher Education.
 
We are currently looking for a Site Reliability Engineer (SRE) to become a valuable addition to our dynamic team!
 

Responsibilities:

  • Ensure high availability, reliability, and performance of applications and infrastructure.
  • Define and monitor SLIs, SLOs, and SLAs to maintain service reliability.
  • Implement automation to reduce manual operations and improve system efficiency.
  • Monitor systems, detect anomalies, and respond to incidents in a timely manner.
  • Lead incident management, root cause analysis (RCA), and post-mortem processes.
  • Collaborate with development and DevOps teams to improve system resilience and scalability.
  • Manage observability tools (monitoring, logging, tracing) to gain system insights.
  • Optimize system performance, capacity planning, and cost efficiency.
  • Implement reliability best practices, including redundancy, failover, and disaster recovery.
  • Continuously improve system reliability through proactive engineering initiatives.

Qualifications:

  • 4 to 6 years of experience in Site Reliability Engineering, DevOps, or related roles.
  • Strong understanding of system reliability, scalability, and performance engineering.
  • Experience with monitoring and observability tools (Prometheus, Grafana, ELK stack, or cloud-native tools).
  • Familiarity with cloud platforms such as Azure, AWS, or GCP.
  • Experience with scripting or programming languages (Python, Go, Bash).
  • Knowledge of CI/CD pipelines and DevOps practices.
  • Experience with containerization and orchestration tools (Docker, Kubernetes).
  • Strong troubleshooting and incident management skills.
  • Understanding of networking, distributed systems, and system architecture.
  • Experience working in Agile/Scrum environments.
  • Advanced English proficiency (C1) required.
  • Must have experience working for U.S. clients.
 
 
What do we offer?
· Contractor model
· Remote model
· Salary in USD
· Paid vacation
· Day off for birthdays
· Benefits for courses and/or certifications
· Opportunity to work with top-tier U.S. clients
· Entrepreneurial, multicultural team culture
 
 
 
Join us if you have what it takes to be part of the Talentus Global Team!
 
