Job description

Description

About Our Team

Our employees thrive in a culture that is fast-paced, collaborative, and ego-free, where innovation and teamwork are encouraged at every level. We provide Federal agencies with immediate access to highly skilled professionals who understand complex mission challenges and deliver efficient, scalable solutions. By continuously investing in talent, technology, and specialized capabilities, we maintain expert teams prepared to support evolving Federal missions through tailored technical solutions and modern service delivery approaches.

We value diverse perspectives and strive to attract talent from all backgrounds. We are seeking professionals who are passionate about technology, mission success, and solving complex operational challenges with creativity and purpose. If you enjoy expanding your technical expertise while supporting impactful Federal initiatives, you will thrive within our organization. Veterans and military spouses are strongly encouraged to apply and bring their valuable experience to our team.

About the Role

We are seeking a motivated and detail-oriented Associate Site Reliability Engineer to join the Technical Director’s team and help advance site reliability engineering, cloud operations, automation, and resilient service delivery for Department of Veterans Affairs (VA) enterprise healthcare platforms and applications.

In this role, you will work closely with senior engineers, the Technical Director, platform and operations teams, and VA stakeholders to support the availability, performance, and operational stability of mission-critical enterprise environments.

The Associate Site Reliability Engineer will help apply software engineering and operational best practices to improve monitoring, automation, incident response, and service reliability while building foundational experience in cloud and platform engineering within a Federal environment.

RESPONSIBILITIES

Site Reliability Engineering Support

Support senior engineers and the Technical Director’s team in day-to-day Site Reliability Engineering activities across platform services and hosted applications.
Assist with maintaining service reliability, availability, and performance by following established operational practices, runbooks, and team standards.
Help gather and review operational metrics, alerts, and system health information to identify issues and support service improvements.

Monitoring, Observability & Incident Support

Assist with monitoring, logging, alerting, and dashboard maintenance to improve visibility into system health and application performance.
Participate in incident response activities, service restoration efforts, and post-incident follow-up under the guidance of senior team members.
Support documentation of incidents, recurring issues, and operational procedures to improve team readiness and response consistency.

Automation, CI/CD & Platform Operations

Contribute to simple automation tasks, scripts, and pipeline updates that reduce manual effort and improve operational consistency.
Support CI/CD processes and environment maintenance for application and infrastructure delivery in development, test, and production environments.
Assist with infrastructure and configuration changes using approved tools, templates, and team guidance.

Cloud & Environment Support

Support cloud and hosted environments on AWS and container-based platforms by performing routine operational tasks, checks, and updates.
Help maintain system documentation, inventory, and configuration information for services and environments managed by the team.
Assist with validation, testing, and operational readiness activities for new releases and environment changes.

Security, Compliance & Team Collaboration

Follow established security, access, and operational procedures that support Federal compliance and secure system administration.
Collaborate with software, infrastructure, operations, and support teams to resolve issues and support reliable service delivery.
Continuously develop technical skills in cloud engineering, observability, automation, and reliability practices through hands-on work and mentorship.

Requirements

QUALIFICATIONS

Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related technical field, or equivalent practical experience.
1–3 years of experience in Site Reliability Engineering, DevOps, systems administration, cloud operations, platform support, software engineering, or a related technical role.
Foundational understanding of Linux systems, cloud infrastructure concepts, networking basics, and application support in enterprise environments.
Exposure to scripting or programming using languages such as Python, Bash, PowerShell, or similar technologies.
Familiarity with monitoring, logging, troubleshooting, and incident response concepts.
Basic knowledge of CI/CD, version control, automation, or Infrastructure as Code concepts.
Ability to follow technical procedures, learn new tools quickly, and work effectively in a collaborative team environment.
Candidates must be eligible to obtain and maintain a Public Trust clearance.

PREFERRED QUALIFICATIONS

Internship, academic, lab, or hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud.
Familiarity with containers and orchestration technologies such as Docker, Kubernetes, EKS, or ECS.
Exposure to observability and monitoring tools such as CloudWatch, Grafana, Prometheus, ELK, or Splunk.
Experience with Git-based workflows, pipeline tooling, or automation through coursework, labs, or professional experience.
Understanding of Federal security, compliance, or healthcare technology environments.
Relevant foundational certifications such as AWS Certified Cloud Practitioner, AWS Certified Developer - Associate, CompTIA Linux+, CompTIA Security+, or HashiCorp Certified: Terraform Associate.

Associate Site Reliability Engineer
