Senior Reliability Engineer (Observability) - remote across ANZ

EXTRA PARENTAL LEAVE - FULLY FLEXIBLE
Remote: Full Remote
Contract:
Experience: Mid-level (2-5 years)
Work from:

Offer summary

Qualifications:

  • Proficient in Python, Java, or Golang
  • Deep understanding of Computer Engineering fundamentals
  • Solid experience with AWS or equivalent services
  • Knowledge of observability tooling like Elasticsearch and Grafana
  • Experience with Infrastructure as Code (Terraform)

Key responsibilities:

  • Build/improve observability platform/tooling
  • Provide technical leadership and expertise
  • Optimize tracing platform and increase reliability
  • Advocate best practices and streamline user experiences
  • Guide team and improve engineer insights
Canva Large https://www.canva.com/
1001 - 5000 Employees

Job description

Your missions

Join the team redefining how the world experiences design.

Hey, g'day, mabuhay, kia ora, 你好, hallo, vítejte!

Thanks for stopping by. We know job hunting can be a little time consuming and you're probably keen to find out what's on offer, so we'll get straight to the point.

Where and how you can work

Our flagship campus is in Sydney. We also have a campus in Melbourne and co-working spaces in Brisbane, Perth and Adelaide. But you have a choice in where and how you work: we trust our Canvanauts to choose the balance that empowers them and their team to achieve their goals.

What you’d be doing in this role

As Canva scales, change continues to be part of our DNA. But we like to think that's all part of the fun. So this will give you a flavour of the type of things you'll be working on when you start, but it will likely evolve.

At the moment, this role is focused on:

  • Being responsible for building and improving our observability platform and tooling, which is used by all Canva engineers.
  • Providing technical leadership and expertise to drive pragmatic solutions and dive into impactful design decisions.
  • Brainstorming, researching and prototyping to optimize our tracing platform, improve our operational effectiveness and increase reliability.
  • Being proactive in improving the tracing user experience and advocating for best practices.
  • Participating in team ceremonies, knowledge sharing and brainstorming sessions.
  • Becoming an observability champion, evangelising best practices and guiding other Canvanauts in the observability space.
  • Finding ways to improve the use of traces and provide better insights to our engineers.

You're probably a match if

  • You are proficient and happy to code in Python, Java or Golang.
  • You have deep knowledge and understanding of Computer Engineering fundamentals and first principles.
  • You are proficient with infrastructure-as-code - we’re a Terraform shop, but strong experience with other IaC tools will do the trick.
  • You have a solid knowledge of AWS (EC2, EKS, Lambda, SQS, Kinesis, S3) or equivalent.
  • You have experience with observability tooling, with competency in tools like Elasticsearch, Grafana, Jaeger Tracing or similar.
  • You have experience running highly available and reliable distributed systems, with highly scalable data stores.

Not essential, but helpful experience!

  • You have experience with OpenTelemetry because it underpins a lot of the infrastructure and tooling that the team owns.
  • You have experience writing application code in Java or frontend code in TypeScript, since we also maintain the tracing libraries.
  • You have experience building and running monitoring infrastructure at scale, for example petabyte-scale Elasticsearch clusters or similar databases.
  • You have experience with data handling at scale.
  • You have experience deploying and running workloads on Kubernetes.

About the team

The Observability Traces & Exceptions Team is responsible for operational insights inside Canva. Our goal is to provide our development team with world-class tools to view how their services are performing in production. We achieve this by combining industry-leading third-party solutions with our own solutions developed in-house.

We work across the entire stack, maintaining our TypeScript and Java tracing libraries, our tracing infrastructure, error reporting libraries and error handling guidelines, to name just a few. As we scale all of these areas, we require more sophisticated solutions to ensure that Canva developers can continue to grow without compromising on reliability or availability.

What's in it for you?

Achieving our crazy big goals motivates us to work hard - and we do - but you'll experience lots of moments of magic, connectivity and fun woven throughout life at Canva, too. We also offer a range of benefits to set you up for every success in and outside of work.

Here's a taste of what's on offer:

  • Equity packages - we want our success to be yours too
  • Inclusive parental leave policy that supports all parents & carers
  • An annual Vibe & Thrive allowance to support your wellbeing, social connection, office setup & more
  • Flexible leave options that empower you to be a force for good, take time to recharge and support you personally

Check out lifeatcanva.com for more info.

Other stuff to know

We make hiring decisions based on your experience, skills and passion, as well as how you can enhance Canva and our culture. When you apply, please tell us the pronouns you use and any reasonable adjustments you may need during the interview process.

We celebrate all types of skills and backgrounds at Canva, so even if you don’t feel like your skills quite match what’s listed above, we still want to hear from you!

Please note that interviews are conducted virtually.

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Spoken language(s):
Check out the description to know which languages are mandatory.

Soft Skills

  • Leadership Development
  • Open Mindset
  • Verbal Communication Skills
