Site Reliability Engineer - Storage - Regular/Remote

Remote: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Professional experience in Site Reliability, Development, or Systems Engineering
  • Familiarity with benchmarking tools like FIO
  • Experience in automation using Bash/Python
  • Knowledge of automation tools such as Terraform and Ansible
  • Troubleshooting experience with Linux systems

Key responsibilities:

  • Architecting highly available storage systems
  • Automating workflows with Bash/Python and SaltStack/Ansible
  • Engaging with Ceph developers and users
  • Identifying and improving bottlenecks in performance
  • Researching next-gen hardware with engineering teams
Akamai Technologies Computer Software / SaaS XLarge https://www.akamai.com/
5001 - 10000 Employees

Job description

Do you enjoy collaborating with teams to solve complex challenges?

Do you have a passion for cutting edge technologies and tackling system problems?

Join our highly skilled Storage team

We design, deploy, and manage applications and infrastructure that support Akamai's internal and customer-facing cloud storage platforms. We do this while maintaining Akamai's mission to make life better for billions of people every day.

Partner with the best

You'll collaborate with operations and development teams to build and manage our scalable storage platforms. You'll create tooling to automate the lifecycle of petabyte-scale storage systems. You'll work with open-source technologies, including Ceph, to ensure Akamai's storage systems are reliable, available, and performant.

As a Senior Site Reliability Engineer, you will be responsible for:

  • Architecting new highly available storage systems and infrastructure, supporting a variety of workloads from compute customers
  • Automating complex workflows and new deployments with Bash/Python and SaltStack/Ansible, increasing the reliability of our storage platforms
  • Engaging and networking with Ceph developers and users, contributing back to the open-source community
  • Identifying bottlenecks across the layers of the OSI model, improving performance and reliability in software and hardware wherever possible
  • Tuning Ceph, the Linux kernel, and server hardware, maximizing performance for our customers
  • Working closely with our hardware engineering teams to research, benchmark, and validate next-generation hardware builds

Do What You Love

To be successful in this role you will:

  • Have professional experience in a Site Reliability, Development, or Systems Engineering role, with large scale distributed systems
  • Be familiar with benchmarking tools like FIO, and concepts like IOPS, throughput, 99th percentile and tail latency
  • Have experience in automation using Bash/Python
  • Have experience with automation tools such as Terraform, Ansible, Jenkins, or SaltStack
  • Have experience troubleshooting Linux systems with tools like tcpdump, iostat, strace, iftop, netstat, and iotop
  • Have experience with designing, deploying, and running mission-critical Linux servers at scale

Work in a way that works for you

FlexBase, Akamai's Global Flexible Working Program, is based on the principles that are helping us create the best workplace in the world. When our colleagues said that flexible working was important to them, we listened. We also know flexible working is important to many of the incredible people considering joining Akamai. FlexBase gives 95% of employees the choice to work from their home, their office, or both (in the country advertised). This permanent workplace flexibility program is consistent and fair globally, to help us find incredible talent, virtually anywhere. We are happy to discuss working options for this role and encourage you to speak with your recruiter in more detail when you apply.

Learn what makes Akamai a great place to work

Connect with us on social and see what life at Akamai is like!

We power and protect life online, by solving the toughest challenges, together.

At Akamai, we're curious, innovative, collaborative and tenacious. We celebrate diversity of thought and we hold an unwavering belief that we can make a meaningful difference. Our teams use their global perspectives to put customers at the forefront of everything they do, so if you are people-centric, you'll thrive here.

Working for you

At Akamai, we will provide you with opportunities to grow, flourish, and achieve great things. Our benefit options are designed to meet your individual needs for today and in the future. We provide benefits surrounding all aspects of your life:

  • Your health
  • Your finances
  • Your family
  • Your time at work
  • Your time pursuing other endeavors

Our benefit plan options are designed to meet your individual needs and budget, both today and in the future.

About Us

Akamai powers and protects life online. Leading companies worldwide choose Akamai to build, deliver, and secure their digital experiences, helping billions of people live, work, and play every day. With the world's most distributed compute platform, from cloud to edge, we make it easy for customers to develop and run applications, while we keep experiences closer to users and threats farther away.

Join us

Are you seeking an opportunity to make a real difference in a company with a global reach and exciting services and clients? Come join us and grow with a team of people who will energize and inspire you!

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Computer Software / SaaS
Spoken language(s): English
Check out the description to know which languages are mandatory.

Other Skills

  • Collaboration
  • Problem Solving
