Key Facts

Remote From: New York (USA)

Category: Site Reliability Engineer (SRE)

Full time

Senior (5-10 years)

English

Requirements

6–10 years of experience in Site Reliability Engineering, DevOps, or Production Support roles.
Strong hands-on expertise in Dynatrace including monitoring, alerting, dashboards, and problem analysis.
Solid understanding of observability, logging, and monitoring frameworks.
Experience with cloud platforms such as AWS, Azure, or GCP.

Roles & Responsibilities

Design, implement, and manage end-to-end monitoring solutions using Dynatrace.
Configure alerting, dashboards, problem detection, and performance optimization strategies.
Monitor application health, infrastructure performance, and user experience across distributed systems.
Troubleshoot production incidents and perform root cause analysis for system and application issues.

Georgia IT, Inc.

About Georgia IT, Inc.

Georgia IT, Inc. provides IT consulting across a wide range of IT services and custom-built turnkey enterprise solutions. GIT specializes in improving business scalability and efficiency through BSM and SBA solutions, and transforms businesses with service management and service automation solutions. We are BMC & HP partners.

GIT services include custom-built enterprise software and customer-centric web portals, network design and implementation, remote and site-to-site VPN, network and server security assessment and setup, server and desktop virtualization, and many others. GIT also provides IT consulting in areas such as Disaster Recovery Planning, Enterprise Data Backup Strategy, Long Term Strategic IT Planning, and Augmentation Professional Services Solutions.

Professional Services Specialties: Business Process Integration, Software Integration & Development, Web Portal, Networking, Remote and Site-to-Site VPN, Virtualization Solutions, Product Development.

Please contact us: Hrus@georgiait.com / uthay@georgiait.com / 470-798-5000 x 1010 / (732) 890-2535 direct

Founded: 2018

Company size: 51 - 200

Job description

Site Reliability Engineer (SRE) – Dynatrace
Location: Remote (New York, USA)
Experience Required: 6–10 Years

Job Summary

We are seeking an experienced Site Reliability Engineer (SRE) with strong expertise in Dynatrace and modern observability practices. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of enterprise applications and infrastructure across cloud and hybrid environments. This role requires hands-on experience with monitoring, automation, cloud platforms, CI/CD pipelines, and containerized environments.

Key Responsibilities

Design, implement, and manage end-to-end monitoring solutions using Dynatrace.
Configure alerting, dashboards, problem detection, and performance optimization strategies.
Monitor application health, infrastructure performance, and user experience across distributed systems.
Troubleshoot production incidents and perform root cause analysis for system and application issues.
Collaborate with DevOps, Cloud, and Engineering teams to improve system reliability and operational efficiency.
Automate operational tasks and monitoring workflows using scripting languages such as Python, Bash, or other shell languages.
Support and optimize cloud-based environments on AWS, Azure, or GCP.
Manage and troubleshoot Linux/Unix-based systems.
Work with containerization and orchestration technologies including Docker and Kubernetes.
Build and maintain CI/CD pipelines using tools such as Jenkins, GitLab CI/CD, or Azure DevOps.
Ensure observability best practices across microservices and distributed architectures.
Participate in on-call support and incident response activities as needed.

Required Skills & Qualifications

6–10 years of experience in Site Reliability Engineering, DevOps, or Production Support roles.
Strong hands-on expertise in Dynatrace including monitoring, alerting, dashboards, and problem analysis.
Solid understanding of observability, logging, and monitoring frameworks.
Experience with cloud platforms such as AWS, Azure, or GCP.
Strong knowledge of Linux/Unix systems administration and troubleshooting.
Experience with Docker and Kubernetes in enterprise environments.
Proficiency with CI/CD tools including Jenkins, GitLab, or Azure DevOps.
Strong scripting and automation skills in Python, Bash, or other shell languages.
Understanding of microservices architecture and distributed systems.
Excellent troubleshooting, analytical, and communication skills.

Preferred Qualifications

Experience implementing SRE best practices and reliability engineering principles.
Knowledge of Infrastructure as Code (Terraform, Ansible, etc.) is a plus.
Exposure to enterprise-scale monitoring and cloud-native technologies.
Relevant cloud or Dynatrace certifications are an advantage.