Offer summary

Qualifications:

5 years of experience as an SRE; experience with multi-region deployments; deep understanding of Kubernetes components; administration of distributed systems and Linux servers; experience writing Python automation scripts.

Key responsibilities:

Build and maintain a scalable SOC platform

Resolve performance and availability issues

Automate lifecycle of microservices on Kubernetes

Resolve production incidents in on-call rotation

Ensure SLA compliance through observability

Job description

We are looking for a talented and motivated Site Reliability Engineer (SRE) to join our team of five SREs.

If you are a curious problem-solver with a solid understanding of cloud technologies, Linux, and Kubernetes, you’ll fit right in!

📍 The position is available in Rennes, Paris or fully remote.

Your missions:

Build and maintain a scalable, highly available SOC platform across 4 regions,
Resolve performance and availability issues using your expertise,
Automate the lifecycle of dozens of microservices on Kubernetes,
Automate backups and restorations for our databases,
Resolve production incidents in a 2-level on-call rotation,
Ensure we meet our SLA by providing observability and resilience at all levels.

Our technical stack:

Kubernetes: k3s, Traefik, Cilium, Ceph, ArgoCD, Helm, Rancher
Observability: Thanos, Prometheus, Grafana, Loki
Tools: Python, Ansible, SaltStack, Terraform
Databases: Elasticsearch (> 300 nodes), Kafka (> 3M rps), ClickHouse (> 10 TB), Redis, KeyDB, PostgreSQL, ArangoDB
CI/CD: GitHub Actions, Harbor
Cloud providers: OVH, Akamai, Azure, Scaleway

🤩 Your profile:

5 years of experience in an SRE role working on a similar infrastructure,
Experience handling multi-region deployments with CI/CD solutions such as ArgoCD,
A deep understanding of Kubernetes and its components (CRD, CNI, CSI),
Experience administering distributed systems and a large number of Linux servers,
Experience developing several useful tools or automation scripts in Python,
Experience participating in a 24x7 on-call rotation,
A solid understanding of cloud networking, load balancing, and firewall configurations,
A drive to innovate and implement change.

We don’t expect you to know our whole stack from the start. We are looking for curious and passionate people who like to learn new things while having a positive impact on their colleagues.

💙 Why join Sekoia.io?

> Our values are simple and effective, and deeply rooted in our work habits: collaboration, benevolence and innovation. Whether in our daily teamwork or in our customer relations, these values drive constant progress and the desire to keep surpassing ourselves!

> 2 offices, in Paris and Rennes: just tell us where you'd like to be based!

> Remote work flexibility: 3 days of remote work per week, or fully remote

> Health insurance 100% covered by Sekoia.io

Required profile