The Lead Observability Engineer is a critical role in our organization, dedicated to ensuring the robustness, performance, and scalability of our infrastructure and applications through strong monitoring and observability practices. The person in this role will work closely with Infrastructure as Code (IaC) tooling such as Terraform and will have a strong understanding of OpenTelemetry standards. Familiarity with current monitoring and logging tools such as Dynatrace and Splunk is essential. This role is ideal for a proactive self-starter who is eager to drive new technology adoption within the organization.
Summary of Primary Responsibilities
• Lead the design and implementation of observability solutions that provide deep insights into application performance, system health, and user experience.
• Establish and advocate for observability best practices across engineering teams.
• Work closely with the infrastructure teams to automate and optimize infrastructure provisioning and scaling using IaC tools such as Terraform.
• Ensure infrastructure code is tested, reliable, and efficient.
• Champion the adoption of OpenTelemetry standards to collect, process, and export telemetry data.
• Use and integrate monitoring tools such as Dynatrace and Splunk to provide thorough insights and analytics.
• Drive the evaluation and adoption of new tools and technologies to keep the organization at the forefront of observability and monitoring practices.
• Collaborate with engineering teams to ensure a smooth transition to, and adoption of, new technologies.
• Analyze existing monitoring and observability practices, identifying areas for improvement or optimization.
• Foster a culture of continuous learning and improvement within the observability team and across the organization.
• Provide leadership, guidance, and mentoring to the observability team.
• Foster a collaborative and inclusive environment that encourages innovation and growth.