i4DM

Associate Site Reliability Engineer

Description

About Our Team 

Our employees thrive in a culture that is fast-paced, collaborative, and ego-free, where innovation and teamwork are encouraged at every level. We provide Federal agencies with immediate access to highly skilled professionals who understand complex mission challenges and deliver efficient, scalable solutions. By continuously investing in talent, technology, and specialized capabilities, we maintain expert teams prepared to support evolving Federal missions through tailored technical solutions and modern service delivery approaches. 

We value diverse perspectives and strive to attract talent from all backgrounds. We are seeking professionals who are passionate about technology, mission success, and solving complex operational challenges with creativity and purpose. If you enjoy expanding your technical expertise while supporting impactful Federal initiatives, you will thrive within our organization. Veterans and military spouses are strongly encouraged to apply and bring their valuable experience to our team. 


About the Role 

We are seeking a motivated and detail-oriented Associate Site Reliability Engineer to support the Technical Director’s team in advancing site reliability engineering, cloud operations, automation, and resilient service delivery for VA enterprise healthcare platforms and applications. 

In this role, you will work closely with senior engineers, the Technical Director, platform and operations teams, and VA stakeholders to support the availability, performance, and operational stability of mission-critical enterprise environments. 

The Associate Site Reliability Engineer will help apply software engineering and operational best practices to improve monitoring, automation, incident response, and service reliability while building foundational experience in cloud and platform engineering within a Federal environment.

RESPONSIBILITIES 

Site Reliability Engineering Support 

  • Support senior engineers and the Technical Director’s team in day-to-day Site Reliability Engineering activities across platform services and hosted applications. 
  • Assist with maintaining service reliability, availability, and performance by following established operational practices, runbooks, and team standards. 
  • Help gather and review operational metrics, alerts, and system health information to identify issues and support service improvements. 

Monitoring, Observability & Incident Support 

  • Assist with monitoring, logging, alerting, and dashboard maintenance to improve visibility into system health and application performance. 
  • Participate in incident response activities, service restoration efforts, and post-incident follow-up under the guidance of senior team members. 
  • Support documentation of incidents, recurring issues, and operational procedures to improve team readiness and response consistency. 

Automation, CI/CD & Platform Operations 

  • Contribute to simple automation tasks, scripts, and pipeline updates that reduce manual effort and improve operational consistency. 
  • Support CI/CD processes and environment maintenance for application and infrastructure delivery in development, test, and production environments. 
  • Assist with infrastructure and configuration changes using approved tools, templates, and team guidance. 

Cloud & Environment Support 

  • Support cloud and hosted environments in AWS and container-based platforms by performing routine operational tasks, checks, and updates. 
  • Help maintain system documentation, inventory, and configuration information for services and environments managed by the team. 
  • Assist with validation, testing, and operational readiness activities for new releases and environment changes. 

Security, Compliance & Team Collaboration 

  • Follow established security, access, and operational procedures that support Federal compliance and secure system administration. 
  • Collaborate with software, infrastructure, operations, and support teams to resolve issues and support reliable service delivery. 
  • Continuously develop technical skills in cloud engineering, observability, automation, and reliability practices through hands-on work and mentorship. 



Requirements

QUALIFICATIONS 

  • Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related technical field, or equivalent practical experience. 
  • 1–3 years of experience in Site Reliability Engineering, DevOps, systems administration, cloud operations, platform support, software engineering, or a related technical role. 
  • Foundational understanding of Linux systems, cloud infrastructure concepts, networking basics, and application support in enterprise environments. 
  • Exposure to scripting or programming using languages such as Python, Bash, PowerShell, or similar technologies. 
  • Familiarity with monitoring, logging, troubleshooting, and incident response concepts. 
  • Basic knowledge of CI/CD, version control, automation, or Infrastructure as Code concepts. 
  • Ability to follow technical procedures, learn new tools quickly, and work effectively in a collaborative team environment. 
  • Candidates must be eligible to obtain and maintain a Public Trust clearance. 

PREFERRED QUALIFICATIONS 

  • Internship, academic, lab, or hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud. 
  • Familiarity with containers and orchestration technologies such as Docker, Kubernetes, EKS, or ECS. 
  • Exposure to observability and monitoring tools such as CloudWatch, Grafana, Prometheus, ELK, or Splunk. 
  • Experience with Git-based workflows, pipeline tooling, or automation through coursework, labs, or professional experience. 
  • Understanding of Federal security, compliance, or healthcare technology environments. 
  • Relevant foundational certifications such as AWS Certified Cloud Practitioner, AWS Certified Developer Associate, CompTIA Linux+, Security+, or HashiCorp Terraform Associate.


