Senior Site Reliability Engineer

Requirements:

  • Bachelor’s degree in computer science, engineering, or related field and minimum 5 years’ experience in Site Reliability Engineering, DevOps, or Systems Engineering roles
  • Experience with AWS (multi-account), Terraform, Ansible, CI/CD systems (GitHub Actions, Bitbucket, Jenkins, AWS CodeBuild, AWS CodePipeline) and observability platforms (New Relic, CloudWatch), plus background with containers (ECS/Fargate/EKS) and resilient architectures
  • AWS Certified DevOps Engineer or Solutions Architect (preferred but not required)
  • Kubernetes or container certification; SRE/DevOps practitioner certifications (preferred but not required)

Roles & Responsibilities:

  • Define and maintain SLOs, SLIs, and error budgets for critical services; lead capacity planning and availability reviews; run Game Day and disaster recovery exercises
  • Eliminate manual 'click-ops' by automating infrastructure provisioning, patching, and runtime hygiene using Terraform, Ansible, and CI/CD pipelines; develop tooling to enforce compliance
  • Serve as Tier-2 escalation during incidents; lead root cause analysis and postmortems; improve incident response playbooks and on-call rotations; reduce MTTD/MTTR through automation
  • Embed security best practices in system design; work with Security on least-privileged IAM, patch compliance, and audit readiness; measure reliability and publish metrics

Job description

Overview:

Senior Site Reliability Engineer & Duties:

The Senior Site Reliability Engineer (Sr. SRE) ensures the reliability, performance, and scalability of all platform services. This role combines software engineering and operations expertise to build resilient systems, automate manual work, improve observability, and reduce operational risk. The Sr. SRE partners closely with DevOps, Release Engineering, and Security to embed reliability practices into every stage of the software lifecycle.

Pay Range: USD $120,000.00 - USD $125,000.00 /Yr.

Responsibilities:

 

Essential Duties and Responsibilities:

  • Define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets for critical services.
  • Lead capacity planning, performance tuning, design tuning, and availability reviews with product and engineering teams to evaluate and improve system health. Run regular Game Day and disaster recovery exercises to validate platform resilience.
  • Eliminate manual “click-ops” by automating infrastructure provisioning, patching, and runtime hygiene using Terraform, Ansible, and CI/CD pipelines.
  • Develop tooling to enforce compliance (SOC2, CIS benchmarks) across environments.
  • Serve as Tier-2 escalation during incidents, resolving problems quickly and effectively and leading technical deep dives and root cause analysis.
  • Partner with the Incident Response team to improve playbooks, on-call rotations, and postmortems. Reduce mean time to detect (MTTD) and mean time to recovery (MTTR) through automation and proactive engineering.
  • Embed security best practices in system design (IMDSv2, hardened golden images, secret rotation). Work with Security to maintain least-privileged IAM policies, patch compliance, and audit readiness.
  • Identify sources of toil and drive automation to eliminate repetitive manual tasks. Contribute to platform “blueprints” and self-service modules so development teams can operate within reliable guardrails.
  • Measure system performance, and track and publish reliability metrics to leadership; use data to drive iterative improvements, minimize risk, and push system capabilities forward.
  • Other duties as assigned by management.

 

Must be able to come to work promptly and regularly. Must be able to take direction and work well with others. Must be able to work under the stress of deadlines. Must be able to concentrate and perform accurately. Must be able to react to change productively.

 

Qualifications:

Minimum Qualifications:

  • Bachelor’s degree in computer science, engineering, or related field and minimum 5 years’ experience in Site Reliability Engineering, DevOps, or Systems Engineering roles.

  • Experience with AWS (multi-account), Terraform, Ansible, CI/CD systems (GitHub Actions, Bitbucket, Jenkins, AWS CodeBuild, AWS CodePipeline), and observability platforms (New Relic, CloudWatch), as well as a background with containers (ECS/Fargate/EKS) and resilient architectures, is required.

Certificates, Licenses, Registrations (preferred but not required):

  • AWS Certified DevOps Engineer or Solutions Architect.
  • Kubernetes or container certification.
  • SRE/DevOps practitioner certifications.

Description:

About Foundation Finance:

Foundation Finance Company (FFC), a Great Place to Work® certified company since 2017, is a fast-growing consumer finance company working with home improvement contractors across the U.S. to drive sales through flexible, customer-focused financing options.

Available Benefits:


  • Day-one health benefits (medical, dental, vision, and flexible spending options such as an HSA or FSA).
  • 401(k) with company match; enrollment on day one.
  • Paid, sick, and volunteer time off.
  • Paid parental leave options.
  • Employer-paid life and disability insurance.
  • Wellbeing on Demand program.
  • Flexible work environment with a casual dress code.


*Employment status (full-time or part-time) may affect eligibility for certain benefits. Some benefits become available only after a specified period of employment. Please refer to our Benefits page for details.

Working Conditions:

Office environment with significant time spent sitting, typing and talking on the telephone.

 

Foundation Finance Company provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.

 

If you reside in the state of Colorado, please click on the following link to review our benefits: Foundation Finance Benefits

 

These benefits are designed to support our employees in their professional growth, health, and overall well-being. Eligibility, coverage details, and enrollment processes will be provided during the onboarding process. At Foundation Finance Company, we are committed to fostering a positive work environment where employees can thrive both personally and professionally.

Remote Work:

Foundation Finance Company LLC requires remote employees to reside in one of the following states to be considered for a remote position: AL, AR, AZ, CO, FL, GA, IL, IN, KY, LA, MD, MI, MN, MO, MS, NC, NJ, NV, NY, OH, OK, OR, SC, TN, TX, UT, VA, WA, and WI.
