Senior/Lead Site Reliability Engineer (Remote)

unlimited holidays - extra holidays - extra parental leave - long remote period allowed
Remote: Full Remote
Contract:
Experience: Mid-level (2-5 years)
Work from: Asia

Offer summary

Qualifications:

  • Experience in systems and software engineering in international settings
  • Proficiency in Linux and at least one major programming language like Golang, Python, Java/Scala
  • Knowledge of cloud providers like Google Cloud or Amazon AWS
  • Experience with containerization technologies and automation of infrastructure provisioning

Key responsibilities:

  • Contribute to design and implementation efforts, focusing on complex technical challenges and scalability
  • Minimize the impact of incidents by monitoring, staying security-conscious, and participating in the on-call rotation
  • Design and implement infrastructure and services for developer teams, ensuring resilience
  • Manage logging and monitoring systems at scale with Prometheus/Thanos
NFQ SME https://nfq.com/
501 - 1000 Employees

Job description

At NFQ, we're all about developing cutting-edge apps, CRMs, ERPs, and other cross-platform products, both for ourselves and for clients that include HomeToGo, Kayak, Alaiko, Home24, and many others. We specialize in e-commerce, mobility, and transport & logistics, and we're always eager to tackle new challenges. Whatever the area, from mobile to UX, we've got a team that knows it inside out.
Join our team of 800+ professionals across Germany, Poland, Lithuania, Vietnam, Thailand, Singapore, and Egypt. Make your own way with us.

The Job:
Join the Cloud Tools team and work together with smart colleagues from Germany and Asia! We provide, protect and progress the software and services required for operating the commercetools platform in the various clouds, preferably using technologies such as Kubernetes, serverless platforms or PaaS, with an ever-watchful eye on their availability, capacity, and performance.
Our customers are internal developers and other teams using our services.
Our top five supported groups of services in a nutshell:
● Logs - Indexing, querying, and searching log events.
● Metrics - Collecting metrics, querying metrics, alert handling.
● Security - Authentication (SSO), automated SSL and key handling.
● CI/CD - Build pipeline for services and software provided by our team.
● SLOs - Infrastructure for providing SLO metrics and SLO reports.
We have regular stand-ups, pair programming and knowledge-sharing sessions. Our team culture is defined by transparency, support, openness to all ideas and voices and continuous learning.

In this role you will
  • Contribute to engineering efforts from design to implementation, solving complex technical challenges around developer and engineering productivity and velocity.
  • Design and implement resilient and scalable infrastructure and services used by our development teams at commercetools.
  • Minimize the impact of incidents by staying ahead of them with monitoring, logs, and metrics, and by keeping an eye on IT standards and security.
  • Take part in the on-call rotation for production systems, shared with the globally distributed team.

What you will bring
  • Experience as a Systems Engineer and Software Engineer in an international team.
  • Strong Linux skills and excellent skills in at least one major programming language. Golang, Python, or Java/Scala would be great: we use the first two most of the time, and the other two fit because our backend is JVM-based.
  • Experience with one of the major cloud providers (Google Cloud, Amazon AWS).
  • Knowledge of containerization and container orchestration technologies such as Google Kubernetes Engine (GKE) or Amazon Elastic Kubernetes Service (EKS).
  • Experience automating infrastructure provisioning, DevOps, and continuous integration/delivery.
  • Experience managing logging and monitoring systems (Prometheus/Thanos) at scale.
  • A willingness to share knowledge and the drive to keep improving and learning new things.
  • Fluent English to work in an international, cross-functional team.
Nice to have:
  • MongoDB and Elasticsearch operational experience.
  • Advanced JVM knowledge.
  • Experience managing cloud provider infrastructure with Terraform.

Why you'll love working here
  • At NFQ, we understand that we spend a significant portion of our lives at work. That’s why we strive to create an environment where everyone is valued and challenged to contribute their best. We ensure that every team member has the opportunity for personal growth, sharpening and expanding their skills regularly.
  • We are proud to have a diverse team, with 13 different nationalities represented, and we believe that we can bring out the best in each other when we combine everyone's strengths.
  • We are committed to creating meaningful and healthy relationships with and among our coworkers and clients, and we put all our energy into achieving excellence by creating strong relationships between brilliant minds worldwide.
Benefits include:
  • Laptop is provided.
  • Hybrid work.
  • English classes for career development.
  • A fun & dynamic environment and freedom to be creative.
  • Modern office with a flexible relaxation zone.
  • 13th-month salary (based on company policies and business situation).
  • Performance reviews twice a year.
  • Extra premium healthcare and an annual health check.
  • Loyalty program: life insurance worth 1 billion VND.
  • 15 days of annual leave; the working week is Monday to Friday.
Required profile

    Experience

    Level of experience: Mid-level (2-5 years)
    Spoken language(s): English

    Other Skills

    • Verbal Communication Skills
    • Open Mindset
