Match score not available

Senior Site Reliability Engineer

extra parental leave - fully flexible
Remote: 
Full Remote
Contract: 
Work from: 

Offer summary

Qualifications:

5+ years in various SRE roles, 5+ years in DevOps/System Engineering roles, Experience building and managing production Kubernetes infrastructure, Proficiency in scripting languages such as Python, Ruby, and Bash..

Key responsabilities:

  • Build systems to optimize uptime and reliability of the platform
  • Maintain and engineer production and non-production Kubernetes clusters
  • Collaborate with cross-functional teams to optimize resource utilization
  • Develop processes and automation to reduce toil across engineering teams.

Roadie logo
Roadie Information Technology & Services Scaleup https://www.roadie.com/
201 - 500 Employees
See all jobs

Job description

Roadie, a UPS Company, is a logistics management and crowdsourced delivery platform. Founded in 2014, Roadie offers businesses fast, flexible and asset-light logistics solutions for last-mile delivery. Roadie enables local delivery to more than 95% of U.S. households by providing access to more than 200,000 independent drivers nationwide – allowing businesses to offer their customers delivery optionality for almost any industry, from airlines to artisans.

Roadie is seeking a Senior Site Reliability Engineer to join our growing Technical Operations Team. We are looking for a candidate who has experience implementing site reliability principals, as well as production level Kubernetes experience.  The ideal candidate is a skilled problem solver with intimate knowledge of site reliability practices, standard dev ops principles, AWS, scripting languages and Kubernetes. 

What You'll Do

  • Build systems that optimize the uptime and reliability of our platform, and support the management and optimization of our software delivery pipeline, observability and infrastructure operations
  • Maintain, support, and engineer production and non-production Kubernetes Clusters (EKS) as well as ES, MSK, RDS, and EC (Redis) clusters
  • Deploy and maintain monitoring and logging solutions based on Prometheus, Loki, Thanos, Grafana, OpenTelemetry and New Relic
  • Collaborate with cross-functional teams to identify and address potential bottlenecks, optimize resource utilization, and proactively prevent system failures
  • Define and manage SLO, SLI and error budgets
  • Develop processes, tools and automation to reduce toil across engineering teams
  • Plan and forecast service capacity and demand, assess cost optimization, and tune systems and software
  • Debug production / non-production issues
  • Take part in 24/7 on-call rotation

Technology We're Using Now

  • Python, Ruby on Rails, Golang
  • React/Redux, Objective-C and Swift, Android
  • Postgres, Redshift, Redis, Kafka
  • AWS/GCP
  • Docker/Kubernetes
  • OpenTelemetry/Prometheus/Thanos/Loki/Grafana/New Relic/Sentry
  • Git/CircleCI
  • ArgoCD

What You Bring

  • 5+ Years in various SRE roles
  • 5+ Years in various DevOPS/System Engineering roles
  • 5+ Years of experience building and managing production Kubernetes infrastructure
  • 6+ Years experience with popular scripting languages (Python, Ruby, Bash, etc.)
  • Experience with Infrastructure as code such as Terraform or Crossplane
  • Experience with CI/CD Development tools (CircleCI, etc.)
  • Experience with GitOPS Tools (ArgoCD)
  • Experience using a broad range of AWS technologies (RDS, ElasticSearch, VPC, EKS, S3, CloudFront, MSK, Elasticache, CloudWatch, etc.)
  • Experience developing and maintaining YAML templating systems (Helm charts, Kustomize, etc)
  • Must be able to work independently, be self-motivated and handle multiple priorities
  • Comfortable working in a fast-paced agile environment

Finally, a willingness to admit what you don’t know, and learn what you need to learn quickly

Why Roadie? 

  • Competitive compensation packages 
  • 100% covered health insurance premiums for yourself
  • 401k with company match
  • Tuition and student loan repayment assistance (that’s right - Roadie will contribute directly to your existing student loans!) 
  • Flexible work schedule with unlimited PTO 
  • Monthly 3-day weekends
  • Monthly WFH stipend 
  • Paid sabbatical leave- tenured team members are given time to rest, relax, and explore
  • The technology you need to get the job done

This role is not eligible for Visa sponsorship. Applicants must be authorized to work for any employer in the U.S.

Required profile

Experience

Industry :
Information Technology & Services
Spoken language(s):
English
Check out the description to know which languages are mandatory.

Other Skills

  • Time Management
  • Self-Motivation

Site Reliability Engineer (SRE) Related jobs