SRE / Site Reliability Engineer (Middle / Senior)

Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

5+ years supporting infrastructure software and distributed systems. Experience with Golang, Python, Ruby, CI/CD, monitoring, and containerization.

Key responsibilities:

  • Ensure smooth operation and improve performance of products
  • Identify bottlenecks, manage incidents, and set up alerting
  • Define SLIs and SLOs, minimize system recovery time, and analyze incidents
  • Contribute to infrastructure code, network security, and maintain blockchain nodes
Bitquery Information Technology & Services Startup https://bitquery.io/
11 - 50 Employees

Job description

Your missions

Bitquery is an API-first product company dedicated to solving blockchain data problems using ground truth and on-chain data. Bitquery extracts valuable data and presents it via APIs. These APIs deliver solutions to multiple verticals, such as Decentralized Finance (DeFi), DEX Arbitrage Analytics, and Crypto Surveillance & Forensics, across all major blockchains, including Bitcoin, Ethereum, EOS, and Tezos.

We are an international company that develops software for analyzing decentralized data (40+ chains). Bitquery is a distributed team. We are currently looking for a full-time SRE to further develop, monitor, and support the infrastructure and to automate various processes. The role may also include on-call duty in shifts.


Roles & Responsibilities:

  • Ensuring the smooth operation of software, environments and company services
  • Analyzing and improving the performance and availability of products
  • Identifying bottlenecks in the architecture and infrastructure
  • Improving system alerting and incident management
  • Improving the monitoring systems based on SLIs (Prometheus, Icinga, Grafana, etc.)
  • Formalizing SLIs according to the main business requirements
  • Defining SLOs for services and for the infrastructure in general
  • Minimizing system recovery time and data loss (RTO and RPO)
  • Analyzing incidents in the production environment
  • Capacity management

Requirements

  • 5+ years of work experience implementing, troubleshooting, and supporting infrastructure software and distributed systems
  • Experience supporting software written in Golang, Python, and Ruby
  • More than 2 years of experience with virtualization and containerization technologies (containerd, Docker, k8s)
  • Experience setting up CI of varying complexity (Jenkins) with CD to different environments
  • Experience creating and maintaining fault-tolerant systems with logging, monitoring, and alerting coverage
  • Understanding of the "infrastructure as code" principle and the ability to test it (Ansible, Terraform)
  • Knowledge of network security principles (IPsec, WAF, IPS)
  • Experience maintaining blockchain nodes
  • Availability in US time zones is required

Our Tech Stack:

  • Infrastructure: Bare-metal / AWS
  • Databases: ClickHouse / MySQL
  • SCM: Git / GitHub
  • Message broker: Kafka
  • Repository: Nexus
  • CI/CD: Jenkins
  • Monitoring: Icinga 2, Grafana, Prometheus, VictoriaMetrics, ELK
  • Orchestration: k8s, Ansible, Terraform
  • Containers: LXC, Docker
  • Scripting: Python, Golang, Ruby, Groovy
  • OS: Debian/Ubuntu
  • Others: Docker Compose, IPsec

Benefits

  • Opportunity to work & collaborate with a truly global team spread across 5 countries
  • Work from anywhere in the world
  • Choose your own work hours
  • Yearly trip with the Bitquery team to a remote destination

As a startup, we make decisions and move fairly fast, while giving candidates a great interview experience. We have a flat hierarchy in which we empower individuals and give them the opportunity to deliver results in their own working style. Come and join a great culture and build Bitquery with us.

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Industry: Information Technology & Services
Spoken language(s): English

Soft Skills

  • Proactive Mindset
  • Analytical Thinking
  • Problem Solving
  • Time Management
