
Career Opportunities: Senior Lead Site Reliability Engineer (337420)

unlimited holidays - extra parental leave
Remote: Full Remote

Offer summary

Qualifications:

  • Bachelor’s degree or equivalent in engineering, computer science, or related field.
  • 10+ years of related experience required.
  • 2+ years' experience with DevOps, GIT, CICD and/or related certifications.
  • Experience with Java/J2EE, Microservices, Spring, SpringBoot, REST APIs, and APM solutions like AppDynamics.

Key responsibilities:

  • Acts as a Subject Matter Expert (SME) for enterprise applications and their hosting environments.
  • Ensures enterprise applications are continuously monitored and restored quickly in case of outages.
  • Reports on system status and availability, providing tuning recommendations as needed.
  • Collaborates with various teams to manage production environments and leads postmortem incident response sessions.

CenturyLink XLarge http://www.centurylink.com/
10001 Employees

Job description

 

About Lumen

Lumen connects the world. We are igniting business growth by connecting people, data and applications – quickly, securely, and effortlessly. Together, we are building a culture and company from the people up – committed to teamwork, trust and transparency. People power progress.

 

We’re looking for top-tier talent and offer the flexibility you need to thrive and deliver lasting impact. Join us as we digitally connect the world and shape the future.

The Role

Operates the company's complex, high-traffic, business-critical internet site communications and/or network-based (cloud) product systems. Plans, designs, and implements scalable local and wide-area network solutions between multiple platforms and protocols (including IP and VoIP). Responsible for system performance; supports and troubleshoots network issues and coordinates the installation of items such as routers and switches with appropriate vendors. Develops tools to automate the deployment, administration, and monitoring of network systems.

Location

This is a work from home position within Canada. 

The Main Responsibilities
  • Acts as a Subject Matter Expert (SME) for enterprise applications as well as their various hosting environments for organizational projects that involve internal customers, including Application Support, Business and Development teams.
  • Ensures enterprise applications are always running effectively, which includes continuous monitoring and alerting with organizational toolsets and ensuring every reasonable effort is made to restore service as quickly as possible in the event of an outage.
  • Responds to non-critical production issues in a timely and proactive manner to ensure they are resolved before they affect the business more profoundly or escalate into a critical situation.
  • Alerts on and accurately reports KPIs on system status and availability at regular intervals, with tuning recommendations where appropriate, using log and APM analysis (e.g., uptime, mean time to repair, mean time between repairs, probability of failure).
  • Creates detailed design and runtime documents (runbooks) for associated solutions built around current and new technologies.
  • Maintains up-to-date configuration documentation, diagrams, SOPs, work/incident tickets, training materials and other supporting documentation.
  • Provides and implements solutions for application automation and configuration management using orchestration tool sets such as Jenkins to streamline assembly and deployment, reducing delivery time and errors (commonly known as CICD).
  • Ensures that effective release management, change management and quality assurance are maintained throughout project lifecycles for applications, using an ITIL-centric approach.
  • Engages and collaborates regularly with infrastructure engineering, software engineering, quality assurance, and other organizational teams (both IT and non-IT) to provide status updates and work on both tactical and strategic plans.
  • Demonstrates application environment tuning abilities and provides solutions to capacity requirements involving the relevant components (JVM, Web Containers, DB Connections, HTTP servers, etc.).
  • Manages production and non-production environments effectively and accounts for differences in each environment for the purposes of deployments, functional testing, and load testing.
  • Participates in and/or leads blameless postmortem incident response sessions, working with teams to determine RCA and long-term recommendations for remediation to infrastructure and application code.
  • Ensures adherence to configuration and operating best practices and standards.

What We Look For in a Candidate

Required:

  • Bachelor’s degree or equivalent in engineering, computer science, or related field.
  • 10+ years of related experience required.
  • 2+ years' previous experience with DevOps, GIT, CICD and/or related certifications.
  • 6-8 years of previous experience working with Java/J2EE, Microservices, Spring, SpringBoot, REST APIs, Postman.
  • Experience with an APM solution: AppDynamics (preferred), Dynatrace.
  • Experience working with infrastructure-based technologies (Load balancers, SSL, API Gateways, DNS, etc.).

 

Preferred:

  • Experience with AIOps tools like Splunk and BigPanda.
  • Experience with supporting containerized or cloud applications.

Requisition #: 337420

Background Screening

If you are selected for a position, there will be a background screen, which may include checks for criminal records and/or motor vehicle reports and/or drug screening, depending on the position requirements. For more information on these checks, please refer to the Post Offer section of our FAQ page. Job-related concerns identified during the background screening may disqualify you from the new position or your current role. Background results will be evaluated on a case-by-case basis.


Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

 

Equal Employment Opportunities

We are committed to providing equal employment opportunities to all persons regardless of race, color, ancestry, citizenship, national origin, religion, veteran status, disability, genetic characteristic or information, age, gender, sexual orientation, gender identity, gender expression, marital status, family status, pregnancy, or other legally protected status (collectively, “protected statuses”). We do not tolerate unlawful discrimination in any employment decisions, including recruiting, hiring, compensation, promotion, benefits, discipline, termination, job assignments or training.

 

Disclaimer

The job responsibilities described above indicate the general nature and level of work performed by employees within this classification. They are not intended to be a comprehensive inventory of all duties and responsibilities for this job. Job duties and responsibilities are subject to change based on evolving business needs and conditions.

 

In any materials you submit, you may redact or remove age-identifying information such as age, date of birth, or dates of school attendance or graduation. You will not be penalized for redacting or removing this information.

 

Please be advised that Lumen does not require any form of payment from job applicants during the recruitment process. All legitimate job openings will be posted on our official website or communicated through official company email addresses. If you encounter any job offers that request payment in exchange for employment at Lumen, they are not for employment with us, but may relate to another company with a similar name.

 

Required profile

Spoken language(s):
English

Other Skills

  • Time Management
  • Teamwork
  • Communication
  • Problem Solving
