The ASCENDING team is committed to delivering scalable, highly available cloud solutions to clients. We use advanced technology to measure the effectiveness of your current IT infrastructure, then help our clients migrate to the cloud and modernize their IT infrastructure.
Location: Fully remote within the US
Long-term contract (2+ years)
Must be a US Citizen
Our client's Enterprise Monitoring team is looking for a senior-level Observability Engineer. The team is responsible for enterprise infrastructure, application, and network monitoring across on-prem, hybrid, and various cloud environments. The selected candidate will join a team of skilled engineers with broad backgrounds in enterprise monitoring and Observability.
Your Impact:
As an Observability Engineer, you will focus on maintaining the reliability, scalability, and availability of our log management solution and our metrics and Observability platform, both of which rely heavily on automation (Terraform, Ansible, and scripts). The role also requires maintaining our solutions' performance KPIs and defining their SLOs.
Responsibilities:
Maintain and deploy monitoring and alerting systems.
Design, configure, and maintain large-scale log aggregation solutions.
Set up and manage ingestion pipelines and data transformations.
Have an "automate every task" mindset.
Monitoring and Alerting: Build and maintain robust monitoring systems using tools like ELK, Dynatrace, Prometheus, OTEL, and Grafana to detect potential issues early and trigger alerts for a timely response.
Maintain associated documentation as it applies to our audit and certification requirements.
Participate in troubleshooting, capacity planning, and performance analysis activities.
Research new monitoring requirements and, in many cases, write code to meet them.
Strong expertise in setting up monitoring policies/rules/templates and writing scripts to accomplish monitoring requirements.
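As an illustrative sketch of the pipeline work described in these responsibilities (the port, host, and index names below are placeholders, not details from this posting), a minimal Logstash pipeline that receives events from Filebeat and indexes them into Elasticsearch might look like:

```
# Minimal Logstash pipeline sketch: receive Beats events,
# parse Apache-style access logs, and index into Elasticsearch.
input {
  beats {
    port => 5044            # Filebeat ships events to this port
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }   # parse common access-log fields
  }
  date {
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]   # use the event's own timestamp
  }
}

output {
  elasticsearch {
    hosts => ["https://elasticsearch.example.internal:9200"]  # placeholder host
    index => "weblogs-%{+YYYY.MM.dd}"                         # daily index pattern
  }
}
```

In practice, the same structure extends to Fluentd/Fluent Bit equivalents; the input, filter, and output stages correspond to the ingestion, transformation, and aggregation duties listed above.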
What you need to succeed:
BS/MS in Computer Science/engineering or equivalent, OR 5+ years of experience.
3+ years of experience working directly with monitoring tools as an admin, SME, or architect, preferably with Dynatrace and/or ELK.
Hands-on experience designing data pipelines using Filebeat, Logstash, and/or Fluent Bit/Fluentd.
Expert level with either Dynatrace (Managed, cloud, and offline, with the full scope of best practices and setup as it relates to ActiveGate, cloud, on-prem, and custom workflows) or Elastic (on-prem and cloud, with best practices around the platform).
Fluent in writing scripts in languages like Python and Bash or PowerShell to automate tasks.
Experience with Terraform and Ansible: syntax, best practices, and managing complex configurations across multiple commercial and government clouds to build and manage infrastructure and applications.
Very good working knowledge of the Linux OS.
Highly self-motivated and self-directed.
Good analytical, problem-solving, and troubleshooting abilities.
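To give a concrete flavor of the scripting expectations above, here is a small, hypothetical Python sketch that computes an error rate from structured log lines; the log format, field names, and sample data are assumptions for illustration only:

```python
import re
from collections import Counter

# Assumed log format for illustration: "<timestamp> <LEVEL> <message>"
LOG_LINE = re.compile(r'(?P<ts>\S+) (?P<level>INFO|WARN|ERROR) (?P<msg>.*)')

def error_rate(lines):
    """Return the fraction of parseable lines whose level is ERROR."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            counts[m.group('level')] += 1
    total = sum(counts.values())
    return counts['ERROR'] / total if total else 0.0

# Hypothetical sample input
sample = [
    "2024-05-01T12:00:00Z INFO service started",
    "2024-05-01T12:00:01Z ERROR connection refused",
    "2024-05-01T12:00:02Z WARN retrying",
    "2024-05-01T12:00:03Z ERROR timeout",
]
print(error_rate(sample))  # 0.5
```

A script like this could feed a custom metric into the monitoring platform or back an alert threshold; the same pattern translates directly to Bash or PowerShell.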
Helpful Skills:
Knowledge of SNMP, tcpdump, and tracing.
Knowledge of AIOps platforms.
Other scripting experience (JavaScript, Java, PowerShell, or others).
Required profile
Experience
Level of experience: Senior (5-10 years)
Spoken language(s): English