SRE | PL (Remote)

Requirements

  • Strong knowledge of Datadog for monitoring, metrics, alerts, and performance analysis
  • Availability to work on emergency incidents outside business hours according to the defined on-call schedule
  • Experience with CI/CD pipelines
  • Experience defining and tracking reliability indicators such as SLOs, SLIs, and SLAs

Roles & Responsibilities

  • Ensure reliability, availability, and performance of production systems and applications; implement and maintain monitoring, observability, and alerting with metrics, logs, and traces (Datadog); define and report SLOs, SLIs, and SLAs
  • Lead incident response for critical incidents, including on-call, ensuring rapid service recovery and root cause analysis
  • Develop and improve automation for deployment, monitoring, scalability, and fault recovery; collaborate with development teams to build resilient and scalable systems
  • Support CI/CD pipelines and infrastructure as code; document procedures, runbooks, and operational standards; promote continuous improvement and reduce operational risks

Job description

RESPONSIBILITIES AND ASSIGNMENTS


  • Ensure the reliability, availability, and performance of production systems and applications;
  • Implement and maintain monitoring, observability, and alerting solutions, with a focus on Datadog, tracking metrics, logs, and traces;
  • Define, track, and report reliability indicators such as SLOs, SLIs, and SLAs;
  • Respond to critical incidents, including on an on-call basis, ensuring rapid service recovery and conducting root cause analyses;
  • Develop and improve automation for deployment, monitoring, scalability, and failure recovery;
  • Collaborate with development teams to build resilient and scalable systems;
  • Support CI/CD pipelines, infrastructure as code, and operational best practices;
  • Document procedures, runbooks, and operating standards, promoting continuous improvement and reducing operational risk;
  • Support the maintenance and evolution of the platform, working directly on environment monitoring, critical incident response, and the implementation of continuous improvements to reduce recurring failures;
  • Solid knowledge of Datadog for monitoring, metrics, alerts, and performance analysis is essential, as is availability to handle emergency incidents outside business hours, according to the defined schedule.

REQUIREMENTS AND QUALIFICATIONS


  • Solid knowledge of Datadog for monitoring, metrics, alerts, and performance analysis is essential;
  • Availability to handle emergency incidents outside business hours, according to the defined schedule;
  • Experience with CI/CD pipelines;
  • Experience defining and tracking reliability indicators such as SLOs, SLIs, and SLAs.

Become a Compasser, be part of AI/R.


Compass UOL is a global firm and part of the AI Revolution Company. Together, we transform organizations using Artificial Intelligence, Generative AI, and other advanced technologies.


We equip our team with proprietary and external AI-driven tools to design and build digital-native platforms, integrating cutting-edge technologies and enabling companies to innovate, transform their businesses, and drive success in their markets.

To achieve this, we attract and develop the best talent, creating opportunities that enhance people’s lives and highlight the positive impact of disruptive technologies.

We empower borderless talent and promote knowledge and opportunities in the latest market trends, driving significant personal and professional growth.

Join us and be part of the AI-driven revolution.

