Monitoring and Observability Architect

Key Facts

Remote From: 
Full time
English

Other Skills

  • Communication
  • Collaboration
  • Proactivity
  • Mentorship

Requirements

  • Hands-on experience with Splunk and at least two monitoring tools (Splunk, Prometheus, Grafana, ELK/EFK, Datadog, New Relic)
  • Strong scripting/programming skills (Python, Go, or Bash)
  • Deep knowledge of Kubernetes, Docker, microservices, and distributed systems
  • Experience defining SLOs/SLIs/SLAs

Roles & Responsibilities

  • Build and maintain observability pipelines and dashboards (metrics, logs, traces)
  • Define and track SLOs, SLIs, and SLAs; create actionable alerts
  • Instrument apps/infrastructure with OpenTelemetry and vendor SDKs
  • Integrate and operate tools (Splunk, Prometheus/Grafana, ELK/EFK, Datadog, New Relic)

Job description

This is a remote position.

Monitoring & Observability Architect

Role overview: Design, implement, and operate scalable observability platforms to provide metrics, logs, traces, and alerts for cloud-native, distributed systems.

 

Responsibilities

  • Build and maintain observability pipelines and dashboards (metrics, logs, traces).
  • Define and track SLOs, SLIs, and SLAs; create actionable alerts.
  • Instrument apps/infrastructure with OpenTelemetry and vendor SDKs.
  • Integrate and operate tools (Splunk, Prometheus/Grafana, ELK/EFK, Datadog, New Relic).
  • Automate deployments with CI/CD and IaC (Terraform/CloudFormation).
  • Collaborate with dev, SRE, and platform teams; produce runbooks and post-incident reports.

 



Requirements

Required

  • Hands-on experience with Splunk and at least two monitoring tools (Splunk, Prometheus & Grafana, ELK/EFK, Datadog, New Relic).
  • Strong scripting/programming skills (Python, Go, or Bash).
  • Deep knowledge of Kubernetes, Docker, microservices, and distributed systems.
  • Experience defining SLOs/SLIs/SLAs.
  • Familiarity with CI/CD and IaC; experience with AWS.

 

Preferred

  • OpenTelemetry and distributed tracing experience.
  • Service mesh telemetry (Istio/Linkerd).
  • Experience in regulated environments (e.g., FDA) and telemetry cost/retention optimization.

 

Soft skills

  • Clear communicator, collaborative, proactive, and good at mentoring.


Benefits

Diversity & Inclusion: At Exavalu, we are committed to building a diverse and inclusive workforce. We welcome applications for employment from all qualified candidates, regardless of race, color, gender, national or ethnic origin, age, disability, religion, sexual orientation, gender identity, or any other status protected by applicable law. We nurture a culture that embraces all individuals and promotes diverse perspectives, where you can make an impact and grow your career. Exavalu also promotes flexibility depending on the needs of employees, customers, and the business. This might mean part-time work, working outside normal 9-5 business hours, or working remotely. We also have a welcome back program to help people return to work after a long break due to health or family reasons.
