Offer summary

Qualifications:

5–8 years of experience in SRE, Observability, or Reliability roles.

Proficiency with observability tools like Grafana, Prometheus, OpenTelemetry, ELK.

Hands-on experience with tracing, profiling tools, and distributed systems.

Strong programming skills, preferably in C# or Go.

Key responsibilities:

Own the end-to-end observability and reliability strategy across all product lines.

Define and maintain an observability framework and ensure coverage for all services.

Build and manage alerting rules, incident management, and RCA processes.

Collaborate with teams to automate anomaly detection and promote observability best practices.

Job description

About Flinks 🚀

Flinks is where financial data moves—with purpose, trust, and impact.
We’re on a mission to simplify access to financial data and help businesses build better, faster, and more secure financial products and experiences. Since 2016, we’ve been bridging the gap between fintechs, financial institutions, and consumers by enabling seamless, secure data connectivity.
From instant account funding to smarter lending, our solutions help power some of the most innovative financial products in North America. We partner with lenders, banks, and fintechs to streamline onboarding, prevent fraud, and fuel real-time decision-making with enriched, reliable data.
As pioneers in Canada’s open banking movement, we’re not waiting for the future; we’re building it. If you’re bold, curious, and ready to help shape the future of finance, we’d love to meet you.

What You’ll Be Doing 🔥

As the Observability SRE, you will own the end-to-end observability, monitoring, and reliability strategy across all Flinks product lines. Your mission is to ensure every product (Data Connectivity, Payments, Enrichment, and Document Services) has the right telemetry, actionable alerts, and reliability insights.

Company-wide Observability & Monitoring: Define and maintain an observability framework across products; ensure coverage for APIs, scraping systems, payments, enrichment, and document services; establish SLIs/SLOs aligned to client expectations.
Alerting & Incident Management: Build consistent, low-noise alerting rules; integrate observability into Incident.io workflows; lead cross-product RCA; maintain a “single source of truth” for reliability metrics.
Reliability Analysis & Insights: Deliver monthly/quarterly scorecards linking reliability to client outcomes (e.g., churn risk, adoption blockers); analyze trends and recurring failures; translate data into executive insights.
Automation & AI-Enabled Observability: Automate anomaly detection, escalation, and self-healing; partner with the AI team; optimize logging and monitoring spend.
Collaboration & Enablement: Champion observability practices across teams; train PMs, QA, and Engineers; ensure insights influence roadmaps; collaborate with Tech Leadership to build observability in from the start.

Who You Are 💪

Experience: 5–8 years in SRE, Observability, or Reliability roles, ideally across multiple product environments (fintech, SaaS, or data platforms).
Technical Skills: Strong in observability tooling (Grafana, Prometheus, OpenTelemetry, ELK); hands-on experience with tracing and profiling tools (APM, OTEL, Pyroscope); experience with distributed systems, APIs, and data pipelines; strong automation skills (Kubernetes).
Strong programming skills with working knowledge of at least one programming language; C# and Go are preferred, but experience in other languages will also be considered valuable.
Mindset:

Systems thinker who sees the big picture.
Business-aware, connecting reliability to retention and profitability.
Proactive, anticipating failures before they occur.
Collaborative, working across product, QA, engineering, and reliability.

Great-to-Haves

Experience in fintech or high-availability SaaS environments.
Familiarity with payments infrastructure and fraud detection systems.
Contributions to open-source observability tools or frameworks.

Why This Role Matters at Flinks 💡

Ensures all products have consistent reliability and observability standards.
Provides a single source of truth for performance and reliability across the org.
Directly improves client trust, profitability, and operational efficiency.
Enables proactive stability management across Flinks’ core product lines.
Supports our shift to a cohesive, reliable, platform-first mindset at scale.

The Interview Process 🏗

Head of People
Director of IT Ops
Technical Challenge
Panel Interview

Site Reliability Engineer Observability
