
Monitoring System Engineer

Job description

Overview:

SOFTSWISS continues to expand the team and is looking for a Monitoring System Engineer.

If you're passionate about delivering top-notch service and consider yourself a proactive, positive thinker, we'd love to hear from you! We're eager for you to contribute to our team's success. If you're looking for a challenging and rewarding career opportunity, this could be the perfect fit.

Key responsibilities:

The two main pillars of our workflow are:

Responding to Events/Monitoring Alerts (L1/L2 tasks for certain system parts):

  • Providing on-duty service coverage, including day and night shifts.

  • Handling incidents by troubleshooting and resolving issues, escalating to third-party or vendor support when necessary.

  • Directing issues or queries to the relevant department as needed.

  • Keeping detailed records and documentation of current infrastructure challenges and Root Cause Analyses (RCAs).

  • Contributing to safe and effective internal practices for AI usage in monitoring and incident response workflows.

Maintaining and Enhancing the Monitoring Systems:

  • Collaborating with other teams to understand and define their monitoring needs, then implementing the right solutions.

  • Setting up and adjusting the monitoring/observability systems for various teams.

  • Designing and tweaking alerts and dashboards to suit specific needs.

  • Refining alerts to reduce irrelevant notifications and make the remaining ones more meaningful.

  • Enhancing dashboards for better clarity, understanding, and a more comprehensive view.

  • Building and sustaining connections between the monitoring systems and other platforms like Jira, Opsgenie, etc. when required.

  • Establishing and updating a Knowledge Base, covering system configurations, alert processes, troubleshooting guidelines, and user manuals.

  • Staying up to date with the newest trends and best practices to continuously improve our organization's monitoring capabilities.

  • Identifying opportunities to automate repetitive monitoring and support tasks, including with AI-assisted approaches where suitable.

Required Experience:

  • Minimum of 3 years of experience as a Systems Engineer, SRE, DevOps, or Monitoring Support Engineer (L2+).

  • Good understanding of Linux-like operating systems (Debian-based).

  • Experience with containerization, virtualization, and orchestration (LXC/LXD, Docker, Kubernetes).

  • Development experience in any scripting language (Bash, Python, Go, etc.) and familiarity with REST APIs.

  • Knowledge of basic database concepts (experience with PostgreSQL is preferable), including transactions and WAL.

  • English proficiency at an Intermediate (B1) level or higher. It's crucial to understand technical terminology related to our specific tech stack and to be able to interpret technical documentation.

  • Practical interest in using AI-assisted tools for troubleshooting, automation, documentation, and operational efficiency:
    - Ability to critically evaluate AI-generated output and validate it before using it in production environments.
    - Understanding of the risks and limitations of AI usage in infrastructure and production operations.

Skills & Experience

Monitoring/observability tools (experience with at least two of the following)

  • Zabbix (familiarity with concepts such as LLD, prototypes, dependencies, and preprocessing)

  • Grafana (knowledge of data sources, dashboard creation, and query usage)

  • Prometheus/VictoriaMetrics/etc. (understanding of metrics collection and alerting)

  • ELK/Splunk/etc. (ability to use queries and filters for log analysis)

  • Site24x7/Pingdom/etc. (experience with web monitoring and performance metrics)

Linux-like operating systems

  • Strong understanding of key concepts, including:
    - File systems
    - Process management
    - Built-in monitoring tools
    - Networks
    - Scripting
    - Troubleshooting

Familiarity with

  • Kafka

  • RabbitMQ

  • GitLab

  • Nginx/Puma

  • ClickHouse

  • PostgreSQL

  • MongoDB

  • HashiCorp Vault

  • Microservices and orchestration (Kubernetes)

  • Any IaC / infrastructure automation: Provisioning tools (Terraform); Configuration management (Ansible, Salt, Puppet)

Our Benefits:

  • Full-time remote work opportunities and flexible working hours

  • Private insurance

  • One additional day off per calendar year

  • Sports program compensation

  • Comprehensive Mental Health Programme

  • Free online English lessons with a native speaker

  • Generous referral program

  • Training, internal workshops, and participation in international professional conferences and corporate events.
