About Flinks
Flinks is where financial data moves—with purpose, trust, and impact.
We’re on a mission to simplify access to financial data and help businesses build better, faster, and more secure financial products and experiences. Since 2016, we’ve been bridging the gap between fintechs, financial institutions, and consumers by enabling seamless, secure data connectivity.
From instant account funding to smarter lending, our solutions help power some of the most innovative financial products in North America. We partner with lenders, banks, and fintechs to streamline onboarding, prevent fraud, and fuel real-time decision-making with enriched, reliable data.
As pioneers in Canada’s open banking movement, we’re not waiting for the future; we’re building it. If you’re bold, curious, and ready to help shape the future of finance, we’d love to meet you.
About the Reliability Team 🚒
As a Reliability Engineer, you will play a pivotal role in ensuring the stability, performance, and reliability of Flinks’ fintech product platforms and their monitoring and alerting systems. You will serve as an expert in both software development and system support, working closely with engineering, operations, and product teams to troubleshoot complex issues, resolve incidents, and continuously improve the technical foundation of our products. This role demands a combination of advanced coding skills, incident management experience, and an understanding of the fintech industry.
What You’ll Do
- Develop and maintain code to quickly resolve product issues, ensuring fast recovery and long-term system stability.
- Provide live operational support across multiple client applications, monitoring services and alerts to detect and resolve critical failures with minimal downtime.
- Own and troubleshoot complex incidents, conduct root cause analyses, and implement long-term solutions while adhering to SLAs and internal SLOs.
- Build monitoring dashboards and alerting systems to proactively detect and address issues, supporting system scalability and stability.
- Analyze operational metrics and KPIs to identify trends, surface client pain points, and drive improvements.
- Automate tooling and processes to improve efficiency and reduce manual work across LiveOps.
- Collaborate with cross-functional teams to deliver lasting fixes for production issues and contribute to technical analyses of product gaps.
- Lead and mentor reliability engineers, providing guidance and ensuring consistent delivery of high-quality work.
- Participate in post-incident reviews, documenting outcomes and driving preventative action items.
- Support after-hours on-call coverage as part of the LiveOps rotation.
Who You Are 💪
- 5+ years of experience with .NET Framework (C#), ensuring production system stability
- Strong coding, debugging, and troubleshooting skills, particularly in performance optimization of large-scale applications
- Operationally focused with expertise in incident management and resolving live production issues
- Proven experience in building and maintaining reliable monitoring and alerting systems in high-demand environments, with a focus on production support
- Strong knowledge of Kubernetes, Docker, and cloud platforms (GCP preferred)
- Proficiency with monitoring tools like Prometheus, Grafana, and Kibana
- Experience with incident ticketing and documentation tools like FreshDesk and Confluence
- Critical thinker who can identify system weaknesses and find innovative solutions
- Strong project management skills with a focus on scalability and system stability
Nice to haves
- ITIL Service Management certification (such as ITIL v3 or ITIL v4) or an equivalent certification is highly desired
- Experience with PowerBI, web scraping, or Golang
The Interview Process 🏗
- Head of People Ops
- Case Assignment & Presentation
- Team Lead Interview
- Director Interview
