Site Reliability Engineer III

Remote: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Undergraduate degree in Computer Science or a related field
  • 7+ years of experience in technology roles
  • 4+ years in a DevOps culture or SaaS environment
  • Experience with C#, Go, Python, or Java
  • Familiarity with AWS services and IaC tools like Terraform

Key responsibilities:

  • Design, develop, implement, and optimize system performance, reliability, and scalability
  • Measure health of environments using monitoring tools like Datadog
  • Guide teams on observability standards and best practices
  • Educate engineering and operations teams on SRE principles
  • Proactively troubleshoot and resolve production issues
Vertex Inc. Computer Software / SaaS Large https://www.vertexinc.com/
1001 - 5000 Employees

Job description

      JOB SUMMARY:

      The Site Reliability Engineer helps Vertex implement highly reliable, scalable, and performant systems across the enterprise. This is realized by relentlessly measuring the environments and finding areas that need improvement. Improvements can range from educating engineering and operational resources to creating new capabilities, providing code enhancements, or implementing processes and tools. Success is measured by data and backed by continued customer satisfaction. The Site Reliability Engineer will combine infrastructure experience with engineering best practices to build solutions that improve our environment.

      ESSENTIAL JOB FUNCTIONS AND RESPONSIBILITIES:

  • Responsible for designing, developing, implementing, and optimizing the efficiency of the environment, including the performance, reliability, and scalability of our services.
  • Responsible for measuring the health and performance of the environments by implementing tooling such as Datadog to achieve the proper level of visibility into the environment.
  • Enable teams to implement observability by developing and publishing standards and best practices and providing guidance and implementation assistance to engineering teams.
  • Responsible for designing and implementing coding assignments related to applications, systems reliability, monitoring, alerting, and analytics.
  • Participate in educating Engineering and Operations teams to ensure SRE principles are implemented consistently across the enterprise.
  • Take a proactive approach to anticipating and correcting a wide range of production issues, including outages, processing slowdowns or stoppages, errors, and failures.
  • Implement engineering and operational improvements including code enhancements, process improvements, or procedural amendments.
  • Triage, isolate, and resolve environmental issues quickly and transparently.
  • Provide technical leadership for a wide range of projects.
  • Assist and mentor other engineering staff.

KNOWLEDGE, SKILLS AND ABILITIES:

  • Experience with multiple software development languages, such as C#, Go, Python, or Java.
  • Experience with platform monitoring tools like Datadog, AWS CloudWatch, or similar.
  • Experience with Software as a Service (SaaS) environments.
  • Experience designing and deploying AWS services with an Infrastructure as Code (IaC) mindset, using tools like Terraform.
  • Experience with hyperscalers, most notably AWS, Azure, or OCI.
  • Experience with Agile development methodology.
  • Good written and verbal communication skills.
  • Ability to listen to, understand, and accurately relay information.
  • Ability to network with key contacts outside own area of expertise.
  • Ability to work with minimal supervision and with latitude for independent decision making.

EDUCATION, TRAINING:

  • Undergraduate degree preferably in Computer Science or a similar technical degree.
  • 7+ years of experience in technology-related roles.
  • 4+ years of experience in a DevOps culture or production SaaS environment.

Other Qualifications – The Winning Way behaviors that all Vertex employees need in order to meet the expectations of each other, our customers, and our partners.

  • Communicate with Clarity - Be clear, concise and actionable. Be relentlessly constructive. Seek and provide meaningful feedback.
  • Act with Urgency - Adopt an agile mentality - frequent iterations, improved speed, resilience. 80/20 rule – better is the enemy of done. Don’t spend hours when minutes are enough.
  • Work with Purpose - Exhibit a “We Can” mindset. Results outweigh effort. Everyone understands how their role contributes. Set aside personal objectives for team results.
  • Drive to Decision - Cut the swirl with defined deadlines and decision points. Be clear on individual accountability and decision authority. Guided by a commitment to and accountability for customer outcomes.
  • Own the Outcome - Defined milestones, commitments and intended results. Assess your work in context; if you’re unsure, ask. Demonstrate unwavering support for decisions.

COMMENTS:

The above statements are intended to describe the general nature and level of work being performed by individuals in this position.  Other functions may be assigned, and management retains the right to add or change the duties at any time.

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Computer Software / SaaS
Spoken language(s): English

Other Skills

  • Mentorship
  • Decision Making
  • Verbal Communication Skills
  • Problem Solving
