Site Reliability Engineer SME

Remote: Full Remote

Offer summary

Qualifications:

BS in Engineering, Computer Science, or a related field; 10+ years of experience on IT projects; 4+ years of experience with DevSecOps; 3+ years of experience with containerized applications.

Key responsibilities:

  • Maintain reliability of platform services
  • Collaborate with business leaders on production systems development
  • Implement monitoring and automation tools
  • Design & maintain core infrastructure for scaling
  • Debug production issues across services
Rackner Scaleup https://rackner.com/
51 - 200 Employees

Job description

Title: Site Reliability Engineer SME
Location: Remote (personnel may work remotely but may be required to be onsite at the SCIF in San Antonio for the first 2 weeks)
Clearance: Top Secret (SCI eligible)

About this role:

  • Rackner is seeking a Site Reliability Engineer SME to work with the DSO Platform Product Line Manager, the Infrastructure as Code SME, and other team members to ensure that the overall platform design and implementation meets or exceeds the service level objectives and agreements for uptime, availability, and mean time to recover.

We are seeking professionals with:

  • BS: Engineering, Computer Science, or another technical degree, or equivalent industry experience
  • 10+ years: Contributing on a technical team for a software or IT project
  • 4+ years: Using DevSecOps in support of system integrity, availability, and security
  • 3+ years: Developing containerized applications and delivering them to a containerized platform
  • 3+ years: Providing technical guidance to more junior team members

What will make you successful:

  • Manage and maintain availability and reliability of critical platform services and applications, ensuring they meet requirements of internal and external users
  • Collaborate with business leaders in building and running sustainable production systems, which can evolve and adapt to changes in a global business environment
  • Evaluate performance results and recommend major changes affecting short-term project growth and success
  • Run infrastructure with Chef, Ansible, Terraform, GitLab, CI/CD, and Kubernetes
  • Build monitoring that alerts on symptoms rather than on outages
  • Document every action so your findings turn into repeatable actions and then into automation
  • Use the GitLab product to run GitLab.com as a first resort and improve the product as much as possible
  • Improve operational processes
  • Design, build, and maintain core infrastructure that enables GitLab to scale to support hundreds of thousands of concurrent users
  • Debug production issues across services and levels of the stack

Nice to have:

  • M.S.: Computer Science or Engineering field
  • 10+ years: Contributing on a technical team
  • 5+ years: Providing technical guidance to more junior team members
  • DoD 8570/8140 IASAE Level II at start

Who We Are:

  • Rackner is a software consultancy that builds cloud-native solutions for startups, enterprises, and the public sector.
  • We are an energetic, growing consultancy with a passion for solving big problems for both startups and enterprises.
  • Each of us enables digital transformation for large organizations through the newest distributed technologies. We are laser-focused on end-to-end application development, DevSecOps, AI/ML, and systems architecture, and our methodology centers on cloud-first, cost-effective innovation.
  • Our customers hail from a diverse, ever-growing list of industries.

Benefits/Additional Info:

Rackner embraces and promotes employee development and training, and covers the cost of certifications relevant to a position and the technologies/services provided. Fitness/gym membership eligibility, a weekly pay schedule, and employee swag, snacks & events are offered as well!

  • 401(k) with 100% matching up to 6%
  • Highly competitive PTO
  • Great health insurance with large network of providers
  • Medical/Dental/Vision
  • Life Insurance, and short & long term disability
  • Industry-Leading Weekly Pay Schedule
  • Home office & equipment plan

 

Required profile

Spoken language(s):
English