Job description

This is a remote position.

System Reliability Engineer (Application Support)

Experience – 5 to 8 years

Location – Remote – Australia

Up to 110 Australian Dollars + Super

Who we are

Fulcrum Digital is an agile, next-generation digital acceleration company providing digital transformation and technology services from ideation to implementation. These services apply across a variety of industries, including banking and financial services, insurance, retail, higher education, food, healthcare, and manufacturing.

The Role

Provide L2 support for production systems, including application, database, middleware, infrastructure, and network components
Manage production incidents end-to-end within defined SLAs, focusing on resolution rather than on who caused them.
Interact with various stakeholders such as release managers, program leads, service managers, and development and test leads
Review operational readiness requirements such as monitoring and alerting, log rotation, and component resilience, and report any gaps
Provide pre-implementation support with activities such as release notes review and implementation dry runs.
Protect production components by running health checks and monitoring latency and memory utilization.
Automate day-to-day activities and propose changes that improve reliability
Participate in CAB and provide feedback on change requests
Support the DevOps team in testing the promote pipelines and suggest automation of configuration items.
Follow incident management best practices and perform root cause analysis (RCA).
Participate in disaster recovery tests and operational acceptance tests
Analyze the technology stack that makes up the product and optimize the recovery time objective (RTO).
Work with team members spread across time zones
Share knowledge, document improvements, and mentor junior team members

Requirements

Deployments MTF/Prod
Maintenance items (including stop/start, Disaster Recovery-related activities, etc.)
Monitoring
Support TRTs
Incident creation
Change requests (CRs) for changes in MTF/Prod

Tools

Log Monitoring Tool - Splunk
Application Monitoring Tool - Dynatrace
Ticketing and Incident/Problem Management Tool - Remedy
Linux
SQL
DevOps Basics - CI/CD basics, overview of Git, Bitbucket, SonarQube, Fortify, CI (Jenkins), ARA, SaltStack, Chef, Artifactory