At Fellow, our mission is to transform how teams collaborate and make meetings productive for everyone. As we continue to grow, we're seeking a Site Reliability Engineer who will play a pivotal role in scaling and optimizing our infrastructure, ensuring that our AI Meeting Assistant and broader platform remain reliable, secure, and high-performing. In this position, you'll have the opportunity to make a significant impact by designing and maintaining robust systems while experimenting with innovative technologies. If you're passionate about making a meaningful difference in a fast-paced environment and excited to work with a team that's redefining meeting experiences worldwide, we'd love to hear from you.

Engineering at Fellow

At Fellow, our engineering team operates with a strong focus on efficiency, collaboration, and high-quality output. We work in 8-week build cycles that allow teams to concentrate deeply on their projects without the interruptions of weekly sprint planning meetings. We put a great deal of care into breaking projects down to fit the cycle while leaving room for a proper QA process, thorough code review throughout, and written tests.

We believe that fostering a strong team culture is essential for innovation and success. We host regular book clubs centered around technical books, organize lunch-and-learns where team members share knowledge and insights, and hold bi-annual hackathons to build out our boldest ideas. We also place great importance on contributing to the engineering community: many team members are involved in open-source projects, publish technical articles, and speak at conferences and local meetups.

Key Responsibilities

System Reliability: Design, implement, and manage reliable, scalable systems to support Fellow’s AI Meeting Assistant and other platform features.
Cloud Infrastructure: Optimize and maintain our AWS infrastructure, including EC2, RDS, and other cloud services.
Kubernetes: Oversee and optimize Kubernetes clusters to ensure high availability and performance.
CI/CD Pipelines: Enhance and maintain CI/CD pipelines to support efficient, high-quality deployments.
Monitoring and Observability: Set up and improve monitoring, logging, and alerting systems to detect and resolve issues proactively.
Collaboration: Work closely with the engineering, product, and QA teams to support feature development and deployment.
Automation: Use tools like Pulumi to automate infrastructure provisioning and management.
Incident Management: Lead root cause analysis and implement changes to prevent future incidents.
Innovation: Experiment with and adopt new technologies to enhance system performance and scalability.

Ideal Candidates

Have 2+ years of experience in site reliability engineering or a related field, with a strong understanding of cloud-based infrastructure.
Are proficient with Kubernetes, AWS, and databases.
Have experience with monitoring and observability tools such as Prometheus, Grafana, or Datadog.
Are familiar with CI/CD tools like GitHub Actions, Jenkins, or GitLab CI.
Have strong problem-solving skills and a proactive approach to reliability challenges.
Have excellent communication skills and the ability to collaborate effectively in a team environment.
Bonus: Experience with Pulumi, Elasticsearch, or MLOps tools is highly valued.

Why Work at Fellow?

Team Culture: Join a collaborative, innovative team that values continuous learning and growth.
Impact: Work on meaningful projects that shape the future of work and make meetings more productive.
Flexibility: We’re a remote-first organization with offices and co-working spaces available in Ottawa (our HQ), Montreal, and Toronto for those who prefer in-person collaboration.
Growth: Be part of a growing, Series A-funded startup backed by leading venture capital firms such as Craft, iNovia, and Felicis.

----

The Fine Print

We’re 100% remote, but candidates must reside in Canada and be legally entitled to work for any employer.
Fellow has broad ambitions. We’re very agile, and change here happens fast and often. You have to enjoy the challenge of constantly learning and growing in your role, and rolling up your sleeves to make things happen.
Fellow is a startup. Our environment is suited for people who thrive on experimentation and on making educated guesses, at times with limited information. If you prefer a more structured work environment with well-defined boundaries, this is not the place for you.
You’ll have the autonomy to arrange your work around your own schedule, but generally, everyone at Fellow has meeting availability between 10:00 am and 5:00 pm Eastern Time.

Equal Opportunity Employer

At Fellow, we understand the value of having a diverse team. That’s why we believe in providing equal opportunity employment regardless of race, national or ethnic origin, color, religion, age, sex, sexual orientation, gender identity or expression, marital status, family status, genetic characteristics, disability, or conviction. Please let us know if you require accommodation during the recruitment process.
