
Site Reliability Engineer - Data & AI

EXTRA HOLIDAYS - WORK FROM ANYWHERE - FULLY FLEXIBLE
Remote: Full Remote
Contract:
Experience: Mid-level (2-5 years)
Work from: Texas (USA), United States

Offer summary

Qualifications:

  • Bachelor's degree in Computer Science or Engineering.
  • 3+ years' experience as a Site Reliability Engineer.
  • Experience with Kafka and Debezium.
  • Proficiency in Infrastructure as Code tools.
  • Familiarity with Kubernetes and Docker.

Key responsibilities:

  • Implement self-service data infrastructure solutions for 10+ business units.
  • Manage AWS infrastructure components using Terraform.
  • Enhance CI/CD pipelines for consistent software deployments.
  • Collaborate with teams to understand and implement solutions.
  • Document architecture, processes, and support continuous improvement.
Kraken Fintech: Finance + Technology SME https://kraken.com/
1001 - 5000 Employees

Job description


Your missions

Building the Future of Crypto 

Our Krakenites are a world-class team with crypto conviction, united by our desire to discover and unlock the potential of crypto and blockchain technology.

What makes us different?

Kraken is a mission-focused company rooted in crypto values. As a Krakenite, you’ll join us on our mission to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion. For over a decade, Kraken’s focus on our mission and crypto ethos has attracted many of the most talented crypto experts in the world.

Before you apply, please read the Kraken Culture page to learn more about our internal culture, values, and mission. We also expect candidates to familiarize themselves with the Kraken app. Learn how to create a Kraken account here.

As a fully remote company, we have Krakenites in 70+ countries who speak over 50 languages. Krakenites are industry pioneers who develop premium crypto products for experienced traders, institutions, and newcomers to the space. Kraken is committed to industry-leading security, crypto education, and world-class client support through our products like Kraken Pro, Kraken NFT, and Kraken Futures.

Become a Krakenite and build the future of crypto!

Proof of work

The team

Join our Data Infrastructure team and play a pivotal role in upholding the reliability, scalability, and efficiency of our robust Data platform. As a Senior Site Reliability Engineer (SRE) specialized in Data Infrastructure, you will collaborate closely with diverse cross-functional teams to conceive, execute, and oversee the foundational data infrastructure that empowers our array of applications and services.

As a key member of our Data Infrastructure team, you will be at the forefront of ensuring the unfaltering availability and performance of our platform. Your profound proficiency in cloud technologies, infrastructure as code, automation, monitoring/alerting, logging, user and machine AuthNZ, and certificate management will be instrumental in upholding the exceptional operational standards we set for our services.

This role is intended for candidates based in the Americas.

The opportunity
  • Implement self-service data infrastructure solutions that support the needs of 10+ business units and over 100 engineers and data analysts.

  • Utilize Infrastructure as Code (IaC) principles to design, provision, and manage both on-premises and cloud (AWS) infrastructure components using tools such as Terraform

  • Develop and maintain automation scripts using bash/shell scripting to automate operational tasks and deployments.

  • Enhance and manage CI/CD pipelines to facilitate consistent software deployments across the data infrastructure.

  • Implement robust data monitoring and alerting solutions to proactively detect anomalies and performance issues.

  • Manage and implement role-based access control (RBAC) and permissions for a multitude of user groups and machine workflows across different environments

  • Manage and maintain real-time streaming data architecture using technologies like Kafka and Debezium Change Data Capture (CDC).

  • Ensure the timely and accurate processing of streaming data, enabling data analysts and engineers to gain insights from up-to-date information.

  • Utilize Kubernetes to manage containerized applications within the data infrastructure, ensuring efficient deployment, scaling, and orchestration.

  • Implement effective incident response procedures and participate in on-call rotations.

  • Collaborate with data analysts, engineers, and cross-functional teams to understand requirements and implement appropriate solutions.

  • Document architecture, processes, and best practices to enable knowledge sharing and support continuous improvement.

  • Support AI/ML teams with their infrastructure requests.

Skills you should HODL
  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).

  • Proven experience (3+ years) working as a Site Reliability Engineer, Infrastructure Engineer, or in a similar role, with a focus on data infrastructure and security.

  • Experience with real-time data processing technologies, such as Kafka and Debezium

  • Working experience managing hybrid systems, particularly AWS (HashiCorp is a nice-to-have).

  • Experience with Infrastructure as Code tools such as Terraform and Atlantis (Terragrunt is a plus).

  • Experience with containerization and orchestration tools, particularly Kubernetes and Docker

  • Solid understanding of bash/shell scripting and proficiency in at least one programming language (preferably Python or Go).

  • Strong problem-solving skills and the ability to troubleshoot complex systems.

Nice to haves
  • Familiarity with CI/CD deployment pipelines and related tools.

  • Knowledge of HashiCorp products like Vault, Nomad, and Consul.

  • Experience with data-related technologies (databases, Airflow, data warehousing, data lakes).


This job is accepting ongoing applications and there is no application deadline.

Please note, applicants are permitted to redact or remove information on their resume that identifies age, date of birth, or dates of attendance at or graduation from an educational institution.

We consider qualified applicants with criminal histories for employment on our team, assessing candidates in a manner consistent with the requirements of the San Francisco Fair Chance Ordinance.

Kraken is powered by people from around the world and we celebrate all Krakenites for their diverse talents, backgrounds, contributions and unique perspectives. We hire strictly based on merit, meaning we seek out the candidates with the right abilities, knowledge, and skills considered the most suitable for the job. We encourage you to apply for roles where you don't fully meet the listed requirements, especially if you're passionate or knowledgeable about crypto!

As an equal opportunity employer, we don’t tolerate discrimination or harassment of any kind, whether based on race, ethnicity, age, gender identity, citizenship, religion, sexual orientation, disability, pregnancy, veteran status, or any other protected characteristic as outlined by federal, state, or local laws.


Required profile

Experience

Level of experience: Mid-level (2-5 years)
Industry: Fintech (Finance + Technology)
Spoken language(s):
Check out the description to know which languages are mandatory.

Soft Skills

  • Problem Solving
  • Collaboration
