Site Reliability Engineer (f/m/d) – Observability & Internal Tools

Key Facts

Remote From: 
Full time
Entry-level / graduate
English

Other Skills

  • Accountability
  • Systems Thinking
  • Teamwork
  • Technical Curiosity
  • Problem Solving

Roles & Responsibilities

  • Take full ownership of smartclip's internal utility and platform tooling; evolve observability, automation, and developer infrastructure; research cutting-edge open-source alternatives and implement them.
  • Operate and advance the observability stack (Prometheus, Grafana, Forgejo).
  • Engineer the platform by designing observability as a platform capability; define SLOs and create actionable alerting to stop incidents before they start.
  • Secure the stack by embedding security engineering into the delivery process and identifying vulnerabilities before pen tests.

Requirements:

  • Observability mindset with a clear strategy for metrics, logs, and traces; transform noisy alerts into actionable insights.
  • Ownership: live the 'you build it, you run it' philosophy; avoid ticket ping-pong and excuses.
  • Experience with GKE or EKS and Jenkins, Ansible, or Terraform; designing and operating production-grade setups on GCP or AWS.
  • Contributions to open-source projects or demonstrated open-source involvement.

Job description

Remote in our day-to-day work. On-site when it matters.
We work remotely by default – focused, efficient, and with full ownership. For larger features, architectural decisions, and real brainstorming sessions, we come together in Berlin or Cologne – fast, hands-on, and without unnecessary meeting overhead.

We use AI to accelerate – not to replace thinking.
We design the system, steer the output, and take responsibility for what we ship.
Fast where it makes sense. Careful where it matters.

Your Mission

Take full ownership of smartclip’s internal utility and platform tooling. Focus your energy on the intersection of observability, automation, and developer infrastructure. Don't just maintain existing systems – evolve them, research cutting-edge open-source alternatives, and implement them.

Forget expensive enterprise SaaS. Invest in deep in-house expertise. Understand our systems end-to-end, maintain total flexibility, and contribute back to the open-source ecosystem we depend on.

Face these challenges:

  • Build & Evolve: Operate and advance our observability stack (including Prometheus, Grafana, and Forgejo).

  • Go Open Source First: Replace "buy" decisions with robust "build & maintain" strategies.

  • Engineer the Platform: Design observability as a platform capability. Define SLOs and create actionable alerting to stop incidents before they start.

  • Secure the Stack: Embed security engineering into the delivery process. Find vulnerabilities before the pen tests do.

  • Master the Infrastructure: Navigate Linux systems and distributed tooling. Balance bold exploration with production stability.

Your Skills

Be motivated by systems thinking and deep technical curiosity. Stop being a consumer – start being a builder.

Must-haves:

  • Apply an Observability Mindset: Implement a clear strategy for metrics, logs, and traces. Transform "noisy alerts" into "actionable insights."

  • Embrace Ownership: Live the "you build it, you run it" philosophy. Stop the ticket ping-pong and end the excuses.

Nice-to-haves:

  • Bring experience with GKE or EKS and Jenkins, Ansible, or Terraform.

  • Design and evolve production-grade setups on GCP or AWS.

  • Show us your contributions to open-source projects.

  • Turn your passion for root-cause analysis into blameless post-mortems.

Why you’ll love working with us

  • Ownership over tickets: You’re trusted with real responsibility, not just tasks. No unnecessary bureaucracy, no micromanagement – we rely on you to take things forward.

  • Build > Talk: We test what works – not what sounds good. Fail fast, learn faster.

  • High standards, low ego: We take our work seriously, but not ourselves. Direct feedback, honest collaboration, no drama.

  • Stay sharp: Hackathons, conferences, community – we invest in your growth and keep you at the cutting edge.

  • Remote flexibility, in person when it matters: You work flexibly and remotely, with a connection to our Berlin or Cologne locations, where our TV Labs are based and where we experiment, build, and learn together.

  • And yes – the fundamentals are covered too: 30 days of vacation + Dec 24 & 31 off, Smart Fridays (4-day week possible), mobility (Germany ticket & JobRad), sports & health offerings, mental health support, corporate benefits, RTL+ access, and more.

Your CV is just the starting point.

What matters more to us than your resume: a portfolio, a side project, a demo repo – anything that shows you don’t just talk about code, you ship it. Production-ready. Thought through. Done.

smartclip is committed to creating a diverse and inclusive environment. All qualified applicants will receive consideration for employment without regard to race, ethnicity, nationality, age, gender, gender identity, religion, sexual orientation, disability, or any other protected characteristic.
