Site Reliability Systems Engineer

Key Facts

Remote
Full time
English

Other Skills

  • Microsoft Windows
  • Non-Verbal Communication
  • Critical Thinking
  • Virtual Teams

Requirements

  • Deep expertise (3+ years) in two or more of the following tools: Dynatrace, Splunk, SolarWinds, ServiceNow Operator Workspace
  • Extensive experience in one or more Technology Areas: Network, Windows, Desktop, Unix/Linux, AWS or Azure Cloud, WebSphere Middleware, Java/JS Development, Microsoft or Oracle Database
  • 8+ years of experience deploying, maintaining, and troubleshooting complex applications at an enterprise scale while working with cross-functional teams
  • HS diploma or GED and 20+ years of relevant professional experience or MA or MS degree in computer science, electronics engineering, or other engineering or technical discipline with 10+ years of relevant professional experience

Roles & Responsibilities

  • Generate monitoring/observability recommendations through the analysis of monitoring related HPIs/CPIs
  • Utilize skills in enterprise-level triage and incident resolution while gaining experience in VA system infrastructure
  • Use modern system monitoring tools to improve VA enterprise reliability and improve service quality
  • Collaborate with developers and identity and access teams for deeper technical investigations

Job description

MKS2 Technologies, LLC, an award-winning, high-growth small business, creates innovative and customer-centric technology solutions in the areas of Cyber Security, Instructional Design and Training, Software Engineering, and IT Support Services to improve the security and well-being of our clients. Our commitment to excellence and our “Mission First” orientation have resulted in steady growth and an expanding client base across government agencies. We have employees nationwide and for the past three consecutive years have been named one of the fastest-growing Veteran-owned companies in the nation. Please take a moment to browse through our website and learn more about what it means to serve with MKS2.


 

Position Title: Site Reliability Systems Engineer (REMOTE)

Program: funded through 2029

Pay: $100,000 - $110,000 - full benefits included 

Onboarding timeline: ~5 weeks from acceptance to start date
 
Interview Process: 1-2 MS Teams panel interviews
 
Description: 
As a Site Reliability Systems Engineer on our team, your main role is to work with our IST/System Engineering Team (SET) to generate monitoring/observability recommendations through the analysis of monitoring-related HPIs/CPIs, from initial findings through detailed analysis and generation of actionable insights.
Your role expectations and tasks in support of the IST/SET team will include, but not be limited to, the following items:
  • Utilize your skills in enterprise-level triage and incident resolution while gaining experience in VA system infrastructure.
  • Use modern system monitoring tools to improve VA enterprise reliability and improve the quality of services provided to veterans.
  • Work with system and application owners to obtain existing design and functionality details, apply your understanding of workflow systems and application processes across multiple system environments, and work across technology and development teams to diagnose outages and recommend changes that increase reliability.
  • Use your hardware and software experience to help strengthen the systems the VA relies on. Your primary focus will be investigation, working with event management, application owners, DevOps teams, and system and network administrators to examine issues across enterprise applications and technology stacks.
  • Partner with system and application owners to understand their platform designs and how they operate across different environments. This insight will help you diagnose outages, trace workflow issues, and recommend changes that enhance stability.
  • Collaborate with developers and identity and access teams when deeper technical investigations are needed.
  • You’ll gain hands‑on experience with enterprise‑level triage and incident analysis, which will deepen your understanding of the VA’s infrastructure. Tools like SolarWinds, Dynatrace, and Splunk will be part of your daily workflow, giving you the visibility needed to identify reliability concerns and support improvements to the services delivered to veterans.

Must Have:

  • Deep expertise (3+ years) in two or more of the following tools used for troubleshooting application logging in an enterprise environment (Dynatrace, Splunk, SolarWinds, ServiceNow Operator Workspace)
  • Extensive experience in one or more Technology Areas (Network, Windows, Desktop, Unix/Linux, AWS or Azure Cloud, WebSphere Middleware, Java/JS Development, Microsoft or Oracle Database) 
  • 8+ years of experience working with key indicators for IT system operability, reliability, application performance, and code quality 
  • 8+ years of experience deploying, maintaining, and troubleshooting complex applications at an enterprise scale while working with cross-functional teams 
  • 1+ years of experience in service virtualization, AWS or Azure Cloud technologies, and SaaS and PaaS implementation
  • Experience using Microsoft Office, including Word, Excel, and PowerPoint
  • 2+ years independently leading a team to solve difficult technical challenges
  • HS diploma or GED and 20+ years of relevant professional experience or MA or MS degree in computer science, electronics engineering, or other engineering or technical discipline with 10+ years of relevant professional experience

Nice to Have: 

  • Experience with test-driven development, distributed systems, microservices and cloud-native application implementation 
  • Experience with the following tools: Oracle Enterprise Manager, Riverbed – Aternity, and ServiceNow VTBs
  • Possession of excellent written and verbal communication skills 
  • Possession of strong critical thinking and error assessment capabilities 
  • Virtual team management 
  • Public Trust Clearance 


 

Diversity creates a healthier atmosphere: MKS2 Technologies is proud to be an Equal Employment Opportunity / Affirmative Action employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, age, national origin, protected veteran status, disability status, sexual orientation, gender identity or expression, marital status, genetic information, or any other characteristic protected by law.
