Job description

Job Category: Site Reliability Engineer

Job Location: LATAM

We are looking for a mission-driven Site Reliability Engineer to support and scale the infrastructure powering our secure, mission-critical SaaS platform. Our architecture spans traditional Windows-based .NET/IIS apps and modern cloud-native services using AWS, Docker, Kubernetes and Terraform. You’ll play a key role in ensuring uptime, reliability, and operational excellence across a hybrid stack.
You must be confident operating and debugging both modern infrastructure (cloud-native, containerized services) and classic Windows production environments (IIS, SQL Server AlwaysOn, Service Broker), and able to respond to incidents quickly, support ongoing automation, and scale systems reliably.

Key Responsibilities:
● Be part of the team that owns the uptime and performance of our core backend infrastructure (Windows + Linux).
● Maintain and enhance observability across systems using Kibana, CloudWatch, and custom telemetry.
● Manage CI/CD pipelines, infrastructure as code (Terraform, Ansible), and deployment automation.
● Support and maintain production Windows environments:
– .NET Framework/Core apps running in IIS.
– SQL Server with AlwaysOn replication and Service Broker-based messaging.
● Support and operate cloud-native services:
– AWS Lambda functions, DynamoDB, Postgres/Aurora, Redshift, Redis, and containerized workloads in Docker.
● Participate in on-call rotation and incident response.
● Collaborate closely with engineering teams to improve system reliability and deployment workflows.

What We’re Looking For:
● 5+ years of SRE, DevOps, or WebOps experience supporting production SaaS systems.
● Strong experience with Windows Server, IIS, and .NET applications in production.
● Hands-on experience with SQL Server administration, including AlwaysOn and Service Broker.
● Proficiency in AWS operations, including Lambda, DynamoDB, CloudWatch, and IAM.
● Familiarity with Postgres, Redis, Kibana/Elasticsearch, and centralized logging.
● Experience with Docker, Terraform, and Ansible for infrastructure management.
● Strong scripting skills (PowerShell, Python).
● Experience running and debugging containerized and distributed systems in production.
● Excellent incident response and debugging skills.

ID 4106 – Site Reliability Engineer
