Site Reliability Engineer

Requirements:

  • 3–5 years of experience in SRE, DevOps, or Systems Engineering roles
  • Proficiency in scripting languages (Python, Go, or Bash)
  • Hands-on experience with containerization (Docker, Kubernetes) and cloud platforms (AWS, Azure, or GCP)
  • Familiarity with NIST SP 800-53 security controls

Roles & Responsibilities

  • Implement and maintain dashboards and alerts using Prometheus, Grafana, or ELK; support the identification of Service Level Indicators (SLIs)
  • Develop and maintain Infrastructure as Code using Terraform and Ansible for repeatable deployments
  • Maintain automated CI/CD pipelines with integrated security scans and automated tests
  • Participate in on-call rotations and contribute to blameless post-mortem reports to drive continuous improvement

Job description

Arctiq is a global, intelligence-driven technology services company delivering professional and managed services across Hybrid Cloud Infrastructure, Networking & Connected Experiences, Cybersecurity, Data & AI, Autonomous Operations & Intelligence, and Enterprise Service Management. We help organizations operate, secure, and modernize complex environments by unifying infrastructure, networking, data, security, automation, and observability under a single, integrated operating model. Our work focuses on helping customers reduce operational friction, improve resilience, and make better, faster decisions as their environments evolve. Arctiq builds on decades of industry expertise and a customer-centric ethos to deliver exceptional value to clients across diverse industries.


The Site Reliability Engineer will focus on the execution and maintenance of reliability engineering practices for mission-critical government systems. Following the SRE Implementation Plan, you will bridge the gap between development and operations by applying a software engineering mindset to system administration. You will be responsible for building automation, maintaining CI/CD pipelines, and ensuring system health through robust monitoring.


Key Responsibilities

  • Monitoring & Observability: Implement and maintain dashboards and alerting rules using Prometheus, Grafana, or ELK Stack. Support the identification of Service Level Indicators (SLIs).
  • Automation: Develop and maintain Infrastructure as Code (IaC) scripts using Terraform and Ansible to ensure repeatable, error-free deployments.
  • CI/CD Management: Maintain automated deployment pipelines, ensuring security scans and automated tests are integrated into the workflow.
  • Incident Response: Participate in on-call rotations and assist in troubleshooting system outages. Contribute to blameless post-mortem reports to drive continuous improvement.
  • Toil Reduction: Identify repetitive manual tasks and develop automation to reduce "toil," allowing the team to focus on high-value engineering.


Required Qualifications

  • 3–5 years of experience in SRE, DevOps, or Systems Engineering roles.
  • Proficiency in scripting languages (Python, Go, or Bash).
  • Hands-on experience with containerization (Docker, Kubernetes) and cloud platforms (AWS, Azure, or GCP).
  • Familiarity with NIST SP 800-53 security controls.
  • Education: Bachelor’s degree in Computer Science or a related technical field.
