Site Reliability Engineers - Monitoring

Remote: Full Remote

Offer summary

Qualifications:

  • Hands-on Linux skills
  • Familiarity with monitoring tools
  • Knowledge of networking concepts
  • Experience with automation and scripting
  • Commercial experience in system administration

Key responsibilities:

  • Develop and maintain monitoring systems
  • Create real-time dashboards
  • Troubleshoot and resolve incidents
  • Analyze system performance trends
  • Participate in on-call rotations

Job description

Your missions

Tech Stack: Linux
Categories: DevOps, Monitoring, SRE
HQ Locations: Sofia, Bulgaria
Site Reliability Engineer (SRE) - Monitoring

Full-time, permanent, hybrid job in Sofia, Bulgaria

Remote IT World helps tech and blockchain professionals get hired for 100% remote jobs.

We are a first-choice staffing partner of high-growth startups and scale-ups worldwide.

Ready to embrace freedom and flexibility?

Read on.

We’re building a new technical support / SRE monitoring team and are looking to hire 6 smart, reliable, and ambitious people to join as Site Reliability Engineers - Monitoring.

Join an innovative, dynamic software company based in Sofia, Bulgaria. We provide B2B services to airlines, passenger service systems, and a variety of travel companies. And we are very good at it. Our solutions are the most technologically advanced in the travel technology market. Our clients are major global companies.

The company culture promotes innovation, initiative, and streamlined communication and decision making. If you have great ideas, you will have the opportunity to research them, get approval, and implement them quickly.

Job Scope

We are seeking a few highly motivated Site Reliability Engineers (SREs) to join our team. As an SRE - Monitoring, you will work closely with all members of the Infra team to ensure that our systems are monitored and meet our Service Level Agreements (SLAs).

Main Responsibilities
  • Assist in the development and maintenance of monitoring systems to track our systems' health and performance.
  • Work with development teams to ensure that applications are designed with monitoring in mind.
  • Help build and maintain dashboards that provide real-time visibility into system performance and availability.
  • Respond to alerts and incidents, troubleshoot issues, and work with cross-functional teams to resolve them.
  • Analyze trends in system performance and proactively identify potential issues before they occur.
  • Assist in developing and maintaining Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for our systems.
  • Monitor and report on our SLA compliance, and work with teams to identify areas for improvement.
  • Develop and maintain runbooks and documentation for incident response and resolution.
  • Participate in on-call rotations to ensure 24/7 availability of our systems.
  • Work closely with Senior SREs to continuously evaluate and improve our monitoring and incident response processes.
Requirements 
  • Hands-on skills in a Linux environment.
  • Knowledge of monitoring tools, e.g. Prometheus, Grafana, etc.
  • Understanding of networking concepts, e.g. TCP/IP and/or DNS.
  • Experience with automation and scripting languages such as Python or Bash is considered a plus.
  • Previous commercial experience in system administration and/or technical support is considered a big plus.

You will be an ideal candidate if you have:

  • Familiarity with cloud technologies
  • Ansible experience
  • Understanding of Git
  • Knowledge of container orchestrators, e.g. Kubernetes
Skills and Attributes
  • Proactive attitude and responsible personality 
  • Excellent communication and collaboration skills.
  • Ability to resolve incidents while directly working with clients.
  • Willingness to work in shifts.
  • Advanced spoken and written English.
Company Offer
  • Opportunity to expand knowledge and skills in the DevOps methodology
  • Chance to be among the first team members
  • Attractive compensation package
  • Company-provided equipment
  • Private health insurance
  • Access to a Multisport card
Interview Process
  1. Preliminary interview with HR 
  2. Technical Interview with the Monitoring Team Lead
  3. Final round with C-level management
  4. Offer 

If you are passionate about monitoring and meeting SLAs, and have the skills and experience to excel in this role, we would love to hear from you. Apply today and join our team of dedicated SREs!

For the Site Reliability Engineer - Monitoring role, only shortlisted candidates will be contacted.

Your job search is strictly confidential.

Required profile

Spoken language(s):
English, French
Check out the description to know which languages are mandatory.

Soft Skills

  • Proactive Attitude
  • Excellent Communication
  • Team Collaboration
  • Responsibility
