
SRE Team Lead

Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

  • Minimum of 2 years of experience in SRE or DevOps leadership
  • Hands-on experience with cloud infrastructure and tools
  • Strong Linux/UNIX OS administration skills
  • Experience with CI/CD pipelines and scripting
  • Proficiency in monitoring and logging tools and network protocols

Key responsibilities:

  • Lead and mentor a team of Site Reliability Engineers
  • Standardize and monitor SRE practices for efficiency
  • Enhance system reliability through architecture decisions
  • Work with developers on feature prototyping and deployment
  • Manage and configure cloud-based servers and infrastructure
Devexperts SME https://devexperts.com/
501 - 1000 Employees

Job description

Company Description

Devexperts has spent nearly two decades consulting and developing software for the financial industry. We solve complex technological challenges facing the most well-respected financial institutions worldwide.
By becoming a part of Devexperts, you’ll become a part of a company that fosters self-improvement and actively seeks out-of-the-box ideas. Our teams work together to create the next generation of financial software solutions. We welcome all candidates who believe, as we do, that innovation is grounded in education.

Job Description

We are looking for a Site Reliability Engineer Team Lead (SRE TL) to join the team that develops and supports several large trading platforms.

Qualifications

We expect the Site Reliability Engineer Team Lead (SRE TL) to:

  • Lead and manage a team of 2-5 Site Reliability Engineers, providing guidance, mentorship, and support to ensure the team’s success.
  • Take ownership of, standardize, and monitor our SRE capability practices, ensuring that SRE engineers effectively implement and operate these practices.
  • Leverage your strong background in cloud distributed computing and reliability systems architecture to enhance the reliability and resilience of our systems.
  • Work closely with developers to prototype and design new features as part of the infrastructure,
  • Deploy, install, configure, and maintain sophisticated Trading/Finance and related software,
  • Configure bare metal & cloud instances by using Infrastructure as Code,
  • Make key decisions for scalability, reliability, and accessibility,
  • Install and manage in-house developed and external well-known monitoring systems,
  • Design, deploy, and configure cloud-based servers and networks: provision servers and storage, configure firewalls, VPN, monitoring, etc.,
  • Administer UNIX/Cloud infrastructure – installation, configuration, and maintenance,
  • Work with the Nexus and Git repositories.

Must-have skills:

  • Excellent communication and collaboration skills to work effectively with cross-functional teams and delivery squads.
  • Minimum of 2 years of experience leading a Site Reliability Engineering (SRE) or DevOps team,
  • Experience supporting JVM applications (garbage collection, memory leaks),
  • Strong experience with OS-level administration on Linux and/or UNIX,
  • Hands-on scripting experience with Bash, Python, and/or Groovy,
  • Experience with configuring TeamCity CI/CD pipelines,
  • Experience with Infrastructure as Code (IaC) solutions using Ansible and Terraform,
  • Experience with Docker container orchestration (K8S/OpenShift/HashiCorp),
  • Ability to read and analyze errors,
  • In-depth knowledge of TCP/IP and ISO/OSI stack,
  • Experience with monitoring and logging tools (Zabbix, Elasticsearch or OpenSearch, Grafana, Kibana, etc.),
  • Experience working with Apache, Nginx, HAProxy, Envoy, etc.,
  • Strong ability to solve problems using code and scripting,
  • English level not lower than B2.

Nice-to-have skills:

  • Experience with SQL-like query languages,
  • Experience with Ansible (AWX),
  • Knowledge of Java programming language,
  • Experience with trading/exchange/risk management software usage,
  • Experience with Atlassian software (JIRA, Confluence, FishEye, etc.).

Additional Information

Care for employees is one of Devexperts' core values. For this position, we offer a benefits package designed to keep our new teammate comfortable.

Work Regime Flexibility benefits: 

  • Possibility of hybrid/remote work mode,
  • Flexible working hours,
  • Work From Anywhere Program.

Health and recreation benefits: 

  • Fully paid additional wellness days (3 unwell days per year),
  • Medical insurance for employees and their children,
  • Reimbursement of fitness / Urban Sports Club Membership,
  • Meal allowance (Coverflex Card),
  • Flexpay system (Coverflex).

Facility benefits: 

  • Modern office with new equipment,
  • PlayStation, table football, and musical instruments in the office,
  • Parking spaces/transport reimbursement,
  • Free drinks and snacks.

Community benefits: 

  • Teambuilding activities,
  • Corporate parties,
  • Football Club,
  • Music club,
  • Speakers' club,
  • Free admission to corporate external events,
  • Possibility of joining conferences and professional fairs,
  • Personal branding development support.

Professional training benefits: 

  • English language courses,
  • Local language courses for foreign employees,
  • Unlimited access to self-learning platforms,
  • Certification opportunities,
  • Mentorship Program.

Social benefits: 

  • Parental bonus,
  • Pension plan (Coverflex),
  • Referral bonus,
  • Blood donation paid leave.

 

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Spoken language(s): English

Other Skills

  • Organizational Skills
  • Problem Solving
  • Team Leadership
  • Mentorship
  • Verbal Communication Skills
