Principal Engineer, Production Operations

Work set-up: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Strong technical expertise in Site Reliability Engineering (SRE) and cloud infrastructure.
  • Experience designing, building, and maintaining scalable, secure systems in AWS.
  • Proven ability to define and evolve SRE strategies, SLIs/SLOs, and error budgets.
  • Excellent incident response, troubleshooting, and automation skills.

Key responsibilities:

  • Own and evolve the Production Operations strategy and roadmap.
  • Lead incident response efforts and conduct postmortems.
  • Design and scale cloud infrastructure focusing on high availability and security.
  • Mentor engineers and promote best practices in reliability and scalability.

Greenlight Financial Services Scaleup https://www.greenlight.com/
201 - 500 Employees

Job description

Greenlight is the leading family fintech company on a mission to help parents raise financially smart kids. We proudly serve more than 6 million parents and kids with our award-winning banking app for families. With Greenlight, parents can automate allowance, manage chores, set flexible spend controls, and invest for their family’s future. Kids and teens learn to earn, save, spend wisely, and invest.

At Greenlight, we believe every child should have the opportunity to become financially healthy and happy. It’s no small task, and that’s why we leap out of bed every morning to come to work. Because creating a better, brighter future for the next generation depends on it.

Greenlight is looking for a Principal Engineer, Production Operations to join our growing team!

As a Principal Engineer, you will be a technical leader and individual contributor within our production operations function. You will be responsible for designing, building, and maintaining highly reliable, scalable, and performant cloud infrastructure and systems. You will play a critical role in driving technical excellence, mentoring junior engineers, and solving our most complex scalability and reliability challenges.

What you’ll bring to the team:
  • Deep technical expertise in Site Reliability Engineering (SRE) and cloud infrastructure, with a strong track record of driving operational excellence at scale.
  • Proven ability to define and evolve SRE strategy, SLIs/SLOs, and error budgets in alignment with business and product goals.
  • Extensive experience architecting, building, and maintaining highly available, secure, and scalable systems in AWS.
  • Strong incident response and triage skills, with experience leading critical outages and conducting blameless postmortems to drive systemic improvements.
  • A systems-thinking mindset focused on long-term reliability, root cause analysis, and continuous improvement.
  • Passion for automation and infrastructure-as-code, with a history of reducing manual toil and improving system resilience through tooling and process innovation.
  • Curiosity and technical depth to assess, prototype, and scale emerging SRE and cloud technologies.
  • Ability to influence and collaborate across engineering, product, and security teams to embed reliability and scalability into product architecture and development workflows.
  • Thought leadership in the SRE & cloud space, including the ability to mentor engineers informally and drive best practices across teams.

Your day-to-day:
  • Own and evolve the Production Operations strategy and roadmap in partnership with engineering and product leadership.
  • Define and implement reliability standards across services, including SLIs/SLOs and error budgets.
  • Design, implement, and scale cloud infrastructure with a focus on high availability, security, and performance (primarily on AWS).
  • Lead high-impact incident response efforts, and drive follow-ups to ensure long-term resolutions and knowledge sharing.
  • Identify and eliminate sources of operational toil through automation and tooling.
  • Continuously improve monitoring, alerting, and observability across systems and services.
  • Evaluate and introduce new tools, platforms, and practices to enhance system reliability and engineering velocity.
  • Collaborate with cross-functional teams to embed SRE principles throughout the development lifecycle.
  • Act as a technical expert and advocate for reliability engineering across the organization.

Technologies we use:
  • AWS
  • MySQL, DynamoDB, Redis
  • GitHub Actions for CI pipelines
  • Kubernetes (specifically EKS)
  • Ambassador, Helm, Argo CD, Linkerd
  • REST, gRPC, GraphQL
  • React, Redux, Swift, Node.js, Kotlin, Java, Go, Python
  • Datadog, Prometheus

Work perks at Greenlight:
  • Medical, dental, vision, and HSA match
  • Paid life insurance, AD&D, and disability benefits
  • Traditional 401k with company match
  • Unlimited PTO
  • Paid company holidays and pop-up bonus holidays
  • Professional development stipends
  • Mental health resources
  • 1:1 financial planners
  • Fertility healthcare
  • 100% paid parental and caregiving leave, plus cleaning service and meals during your leave
  • Flexible WFH, both remote and in-office opportunities
  • Fully stocked kitchen, catered lunches, and occasional in-office happy hours
  • Employee resource groups

Our stance on salaries:
Greenlight provides a competitive compensation package with a market-based approach to pay, which will vary depending on your location, experience, and skill set. The total compensation package for this position will also include a discretionary performance bonus, equity rewards, medical benefits, 401k match, and more. Greenlight conducts continuous compensation evaluations across departments and geographies to ensure we are keeping our pay current and competitive.

The estimated base pay range for this position in (NY, CA, WA): $190,000-$250,000
The estimated base pay range for this position in (CO): $190,000-$240,000

Who we are:
It takes a special team to aim for a never-been-done-before mission like ours. We’re looking for people who love working together because they know it makes us stronger, people who look to others and ask, “How can I help?” and then “How can we make this even better?” If you’re ready to roll up your sleeves and help parents raise a financially smart generation, apply to join our team.

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Financial Services
Spoken language(s): English

Other Skills

  • Systems Thinking
  • Mentorship
  • Collaboration
  • Communication