Site Reliability Engineer

Work set-up: 
Full Remote
Contract: 
Experience: 
Senior (5-10 years)
Work from: 

Offer summary

Qualifications:

  • 5 to 8 years of experience in system reliability or application support.
  • Proficiency with Linux, SQL, and DevOps tools like Jenkins, Git, and SonarQube.
  • Knowledge of monitoring tools such as Splunk and Dynatrace.
  • Bachelor's degree in Computer Science, Information Technology, or a related field.

Key responsibilities:

  • Provide Level 2 support for production systems, including applications, databases, and infrastructure.
  • Manage incidents end-to-end within defined SLAs, focusing on resolution.
  • Review operational readiness and report gaps in monitoring, alerting, and resilience.
  • Automate routine activities and support change management processes.

Tiger Resourcing Group TPE http://www.tiger-it.com
11 - 50 Employees

Job description

This is a remote position.

System Reliability Engineer (Application Support)

Experience – 5 to 8 years

Location – Remote – Australia

Up to AUD 110,000 + Super

Who are we

Fulcrum Digital is an agile, next-generation digital accelerating company providing digital transformation and technology services from ideation to implementation. These services apply across a variety of industries, including banking and financial services, insurance, retail, higher education, food, healthcare, and manufacturing.

The Role

  • Provide L2 support for production systems such as applications, databases, middleware components, infrastructure, and network components.
  • Manage production incidents end-to-end within defined SLAs, with a focus on resolution rather than on who caused them.
  • Interact with various stakeholders such as release managers, program leads, service managers, and development and test leads.
  • Review operational readiness requirements such as monitoring and alerting, log rotation, and resilience of the components, and report the gaps.
  • Provide pre-implementation support with activities such as release notes review and implementation dry runs.
  • Protect production components by running health checks and monitoring latency and memory utilization.
  • Automate day-to-day activities and propose changes that improve reliability.
  • Participate in the Change Advisory Board (CAB) and provide feedback on change requests.
  • Support the DevOps team in testing the promote pipelines and suggest automation of configuration items.
  • Follow incident management best practices and perform root cause analysis (RCA).
  • Participate in disaster recovery tests and operational acceptance tests.
  • Analyze the technology stack that makes up the product and optimize the recovery time objective (RTO).
  • Work with team members spread across time zones.
  • Share knowledge, document improvements, and mentor junior resources.

Requirements

  • Deployments (MTF/Prod)
  • Maintenance items (including stop/start, Disaster Recovery-related activities, etc.)
  • Monitoring
  • Support TRTs
  • Incident creation
  • CR for changes in MTF/Prod

Tools

  • Log monitoring tool: Splunk
  • Application monitoring tool: Dynatrace
  • Ticketing and incident/problem management tool: Remedy
  • Linux
  • SQL
  • DevOps basics: CI/CD basics, overview of Git, Bitbucket, SonarQube, Fortify, CI (Jenkins), ARA, SaltStack, Chef, Artifactory

Salary:

110,000

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s):
English

Other Skills

  • Teamwork
  • Communication
  • Problem Solving
