Senior Site Reliability Engineer

Key Facts

Remote From: 
Freelance
Senior (5-10 years)
English

Other Skills

  • Communication
  • Leadership
  • Mentorship
  • Problem Solving

Requirements

  • 7+ years of experience in SRE/DevOps with distributed systems
  • Expertise in Go, Python, or Java and advanced knowledge of Linux internals
  • Extensive experience managing production Kubernetes environments and complex cloud architectures, with a proven track record of defining and meeting SLOs
  • Experience with government RMF processes
  • Education: Bachelor's or Master's degree in Computer Science or Engineering; CKA and an industry observability certification preferred

Roles & Responsibilities

  • Define SLO strategy and error budgets and design telemetry pipelines for full-stack observability
  • Design and govern enterprise Infrastructure as Code (IaC) standards and develop tooling to automate recovery procedures and system scaling
  • Act as Incident Commander for major outages, leading incident response and directing Root Cause Analysis (RCA)
  • Lead security-as-code integration within DevSecOps pipelines, ensuring RMF and NIST 800-53 compliance, and provide mentorship across the team

Job description

Arctiq is a global, intelligence-driven technology services company delivering professional and managed services across Hybrid Cloud Infrastructure, Networking & Connected Experiences, Cybersecurity, Data & AI, Autonomous Operations & Intelligence, and Enterprise Service Management. We help organizations operate, secure, and modernize complex environments by unifying infrastructure, networking, data, security, automation, and observability under a single, integrated operating model. Our work focuses on helping customers reduce operational friction, improve resilience, and make better, faster decisions as their environments evolve. Arctiq builds on decades of industry expertise and a customer-centric ethos to deliver exceptional value to clients across diverse industries.


The Senior Site Reliability Engineer is a technical leader responsible for architecting the reliability strategy for large-scale, distributed government systems. You will lead the implementation of the SRE framework, driving the adoption of SLO-based management and advanced automation. As a subject matter expert, you will mentor mid-level engineers and interface with government stakeholders to ensure system resilience and performance meet mission requirements.


Key Responsibilities

  • Reliability Architecture: Define the strategy for Service Level Objectives (SLOs) and Error Budgets. Design complex telemetry pipelines for full-stack observability.
  • Strategic Automation: Design and govern the enterprise Infrastructure as Code (IaC) standards. Develop custom tooling to automate complex recovery procedures and system scaling.
  • Incident Command: Act as the Incident Commander for major system outages, leading the technical response and directing the Root Cause Analysis (RCA) process.
  • Security & Compliance: Lead the integration of security-as-code within DevSecOps pipelines, ensuring full compliance with RMF and NIST 800-53 standards.
  • Mentorship: Provide technical guidance and mentorship to Mid-Level SREs and developers, fostering a culture of reliability across the organization.


Required Qualifications

  • 7+ years of experience in SRE or DevOps, with significant experience in distributed systems.
  • Expertise in Go, Python, or Java and advanced knowledge of Linux internals.
  • Extensive experience managing production Kubernetes environments and complex cloud architectures.
  • Proven track record of defining and meeting SLOs for high-availability systems.
  • Experience navigating government Risk Management Framework (RMF) processes.
  • Education: Bachelor’s or Master’s degree in Computer Science or Engineering.
  • Certifications: CKA (Certified Kubernetes Administrator) and an industry observability certification preferred.
