Key Facts

Remote From:

Germany

Category: Site Reliability Engineer (SRE)

Full time

Senior (5-10 years)

German, English

Hard Skills

Site Reliability Engineering, gRPC, Python (Programming Language), Go (Programming Language), CI/CD, Terraform, Incident Management, Continuous Monitoring, Internet Of Things (IoT), Climate Engineering

Other Skills

• Communication
• Problem Solving

Requirements

6+ years in SRE, DevOps, or Platform Engineering
Strong understanding of Site Reliability Engineering principles
Proficiency in programming/scripting languages such as Python, GoLang or TypeScript
Practical understanding of integrating LLMs into automated workflows

Roles & Responsibilities

Implement and improve monitoring, alerting, and incident response systems
Design, build, and maintain resilient, scalable infrastructure
Attend post-incident reviews, detect patterns and contribute to continuous improvement
Execute performance testing, analyze system bottlenecks, and formulate strategies for capacity planning

1KOMMA5°

Sustainable development

About 1KOMMA5°

1KOMMA5° was founded in July 2021 with the goal of making all buildings CO2-neutral. It does this by deploying photovoltaics, heat pumps, and charging infrastructure, combined with intelligent, individual, AI-based control of all systems, so that consumption follows the particularly low-price windows of solar and wind power directly on the electricity exchange. After three years, the company has already served 100,000 customers and, according to an EUPD study, has been regarded as the European market leader since 2024. The technological centerpiece is the software platform "Heartbeat AI". It connects heat pumps, charging stations, and battery storage directly to the electricity exchange and uses artificial intelligence to control each individual system according to its own schedule. This way, each household always receives the cheapest and cleanest electricity available, whether from its own roof or from the energy market. 1KOMMA5° currently operates more than 75 locations and master craft businesses with around 2,200 employees, from Europe to Australia. In addition, 1KOMMA5° continuously develops the Heartbeat AI platform further; more than 200 developers work on it at the Berlin development site alone. Already 30% of the households connected to Heartbeat AI earn more from selling electricity during high-price phases than they pay in electricity costs. In total, the system already controls more than 40,000 energy systems after only a short time.

Company type: Scaleup

Industry: Sustainable development

Founded: 2021

Company size: 1001 - 5000


Job description

1KOMMA5°

At 1KOMMA5°, we pursue a clear vision: Living on wind and sunlight forever for free. To make this a reality, we are building the energy system of the future with Heartbeat AI. Want to be part of it?

We bring together regional craftsmanship and scalable software: We don't think of solar, batteries, heat pumps, and e-mobility as isolated components, but control them as an intelligent, integrated overall system in our virtual power plant. It is directly connected to the electricity market, in real time and fully automated. This way, energy is used when it is available from renewables and particularly cost-effective. By 2030, our goal is to transition 1.5 million households to renewable energy. Over 3,000 people are working towards this every day, at more than 80 locations worldwide, from Finland to Australia.

Want to take responsibility and build solutions that truly matter? Apply now and help us shape the energy world of tomorrow.

Learn more about our Product & Tech team!

Your mission

1KOMMA5° is building Europe’s largest virtual power plant ("Heartbeat AI"). As a Senior SRE in our Platform team, you will bridge classic infrastructure with Agentic Engineering, specifically focusing on leveraging AI agents to eliminate developer friction, optimize CI/CD pipelines, and automate the resolution of code review and deployment bottlenecks.

Tech Stack

Cloud & Infra: GCP (Cloud Run, GKE), Terraform, Terramate
Reliability: Incident.io, Datadog (OpenTelemetry)
Agentic: Cursor
CI/CD & DevEx: GitHub Actions, Backstage
Languages: Python, GoLang, TypeScript

Key responsibilities include, but are not limited to:

Implement and improve monitoring, alerting, and incident response systems and processes to ensure high reliability for our customers and meet defined SLOs
Design, build, and maintain resilient, scalable infrastructure utilizing SRE principles and best practices
Attend post-incident reviews, detect patterns and contribute to continuous improvement efforts
Execute performance testing, analyze system bottlenecks, and formulate strategies for capacity planning to ensure our systems meet current and future demands effectively
Build systems where CI/CD test failures serve as immediate, real-time context for agents, enabling them to analyze logs, trace dependencies, and suggest or apply instant code fixes.

Your profile

6+ years in SRE, DevOps, or Platform Engineering
Strong understanding and practical application of Site Reliability Engineering (SRE) principles, methodologies, and best practices
Proficiency in programming/scripting languages such as Python, GoLang or TypeScript
Practical understanding of integrating LLMs into automated workflows. You know how to feed live system state (like a fresh CI test failure) into an agent as actionable context.
Prior experience in incident management, post-incident reviews, and implementing improvements to prevent future incidents
Ability to troubleshoot complex technical issues systematically and effectively
Good experience working with a public cloud provider, ideally Google Cloud Platform (GCP), and a solid understanding of its observability services
A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
Excellent communication skills to convey technical concepts and collaborate effectively with diverse teams
Very good knowledge of spoken and written English; German is a plus
Residency in Germany

Bonus points for:

Interest in climate tech industry
Prior experience with IoT applications
Having worked in a scaleup environment at a company of similar size

Benefits

You are part of an international, dynamic, and highly motivated team with a proven record of making things happen
With your work, you accelerate the "energy transition" and hence have a direct impact on our climate
Work with and learn from other super-smart colleagues
You will enjoy direct contact with core decision-makers
You will have an excellent opportunity to join one of Europe’s most thriving scaleups full-time
You work remotely (Germany-wide), with offices in Hamburg, Berlin or Munich
Create a healthy balance alongside your work and enjoy all the benefits of the EGYM Wellpass
Benefits and discounts are yours with Futurebens
Whether city bike or e-bike - be flexible with our job bike leasing and do something good for the environment at the same time