At least 6 years of experience with Application Performance Management tools like Datadog and New Relic.
Strong understanding of cloud-native solutions and public cloud services such as AWS and Azure.
Familiarity with ITIL, ITSM, SRE, or DevOps practices, along with excellent problem-solving and communication skills.
Experience in programming languages like Python, Java, and Go, and knowledge of containerization technologies like Docker and Kubernetes.
Key responsibilities:
Implement and maintain observability solutions for large-scale enterprise customers using tools like New Relic and Datadog.
Collaborate with cross-functional teams to build resilient systems and integrate observability practices into engineering workflows.
Provide technical support during post-sales processes, including installation and training, while identifying and resolving performance issues.
Document observability systems and prepare reports on system performance, while staying updated with the latest trends in observability and cloud technologies.
AHEAD builds and manages digital platforms that power the most successful organizations in the world. Our consultative approach, unmatched engineering, and innovative solutions combine to accelerate the impact of technology in every client we serve.
We are looking for talented, creative, and proactive individuals who are passionate about solving complex business problems and contributing to the next generation of modern applications. Our goal is to help our customers understand the connections between application performance, user experience, and business outcomes, thereby creating exceptional customer experiences. Join us in shaping the future of Observability Engineering within our Intelligent Operations team, using innovative data and integration tools.
Experience
At least 6 years of hands-on experience with Application Performance Management tools such as Datadog, New Relic, AppDynamics, Dynatrace, Splunk ITSI, Honeycomb, Chronosphere, Riverbed Aternity/Alluvio, ExtraHop, and LogicMonitor.
Hands-on experience with cloud-native, open-source solutions like Prometheus, Grafana, ELK stack/Elastic.io, and OpenTelemetry (OTEL).
Experience with public cloud solutions such as AWS CloudWatch and Azure App Insights.
Strong understanding of network & system management solutions, distributed systems, networking, and database technologies.
Operational background and familiarity with ITIL, ITSM, SRE, or DevOps best practices and principles.
Excellent problem-solving, organizational, project management, and communication skills.
Eagerness to collaborate and contribute to team success, along with a continuous learning mindset.
Experience with containerization and orchestration technologies like Docker and Kubernetes.
Broad background in software engineering with, at a minimum, generalist-level expertise in programming languages and platforms such as Python, Java, Go, .NET, Node.js, Ruby, and PHP.
Familiarity with microservices architecture, service mesh technologies, and end-user technologies (iOS, Android, JavaScript, HTML5).
Knowledge of configuration management tools such as Terraform and Ansible.
Roles and Responsibilities
Implement and maintain cutting-edge Observability solutions utilizing tools like New Relic, Datadog, AppDynamics, or Dynatrace for our large-scale enterprise customers.
Develop and maintain systems for effective monitoring, logging, and tracing, ensuring scalability and reliability.
Collaborate with cross-functional teams, including software engineers, product managers, and data scientists, to build resilient systems.
Integrate observability practices into different engineering workflows and lead the adoption, optimization, and integration of products within the customer’s business infrastructure.
Create custom dashboards, set up alerts, and develop AIOps rules, ensuring effective tracking against goals/KPIs.
Provide technical support in post-sales processes, including installation, deployment, training, technical check-ups, and escalation management.
Identify performance bottlenecks and anomalous system behavior and resolve root causes of service issues.
Stay updated with the latest trends in observability, logging, monitoring, and cloud technologies and introduce innovative solutions and best practices.
Participate in strategic technology planning, focusing on scalability, cost-effectiveness, and risk management in observability infrastructure.
Document observability systems and processes comprehensively and prepare reports for management on system performance and reliability.
Utilize Infrastructure as Code (IaC) principles for efficient infrastructure provisioning and management.
Required profile
Spoken language(s): English