
Linux Site Reliability Consultant

Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

  • Experience with Google and AWS Clouds
  • Scripting using Python and Scala (required)
  • Understanding of microservices architecture
  • Comprehensive systems hardware troubleshooting experience
  • Familiarity with DevOps tools and culture

Key responsibilities:

  • Operate and maintain customer infrastructure solutions
  • Plan maintenance activities and document designs
  • Provide Root Cause Analysis for incidents
  • Identify opportunities to improve resiliency
  • Act as a technology leader for clients
Pythian Information Technology & Services SME https://www.pythian.com/
201 - 500 Employees

Job description

Site Reliability Consultant
Costa Rica | Remote | #LI-Remote | Work from Home

One available position for the following time zone: NZST

Why you?

Do you thrive on solving tough problems, even under pressure? Are you motivated by fast-paced environments with continuous learning opportunities? Do you enjoy collaborating with a team of peers who push you to constantly up your game? At Pythian, we are building a next-generation Site Reliability Engineering team. We need motivated and talented individuals on our teams, and we want you! You’ll act as a technology leader, advisor for our clients, and mentor for other team members. Projects include infrastructure architecture, automation, and intelligent monitoring systems from design through implementation. If you Love Your Data and want to Love Your Career, this could be the job for you!

What will you be doing?
  • Operate, maintain, and administer solutions contributing to customer infrastructure's operational efficiency, availability, and visibility.
  • Plan maintenance activities and produce design documentation and standard procedures.
  • Provide Root Cause Analysis reports for outages/incidents (ITIL - Problem Management)
  • Observe and provide feedback on the current state of the client’s infrastructure, and identify opportunities to improve resiliency, reduce incident occurrence, and automate repetitive administrative and operational tasks.
  • Contribute to, improve, and maintain team documentation about client systems and infrastructure, procedures, policies, and schedules.
  • Gather and document information about client environments through audit activities, and analyze the information to identify opportunities for improvement and application of best practices.
  • Work collaboratively with teammates to contribute to the continuous improvement of our working culture.
  • Act as a technology leader for clients, as well as drive client discussions on technology road maps.
  • Participate in an on-call rotation in an escalation capacity.


What do we need from you?
  • Experience working with Google and AWS Clouds (including infrastructure as code deployment with CloudFormation, Terraform, OpsWorks, etc.)
  • Scripting and automation of administrative tasks using Python and Scala is a must
  • Solid understanding of microservices architecture and container technologies (Kubernetes is a must; Docker, LXC, etc.)
  • Clear understanding of software development lifecycles and best practices from an infrastructure point of view (PRs, merge, rebase, etc.)
  • Understanding of the end-to-end operations of a ‘Business System’ versus its individual components
  • Comprehensive systems hardware and network troubleshooting experience
  • Common Linux distribution platform installation, configuration, performance tuning, and cloud migration.
  • TCP/IP networking, NIC bonding, and network services configuration (DNS, NTP, DHCP, SMTP, etc.)
  • Operation and administration of virtual infrastructure, including experience with at least one hypervisor (VMware, Hyper-V, KVM, etc.)
  • Ability to describe IaaS, PaaS, SaaS, pros and cons of each, use cases for virtualization and cloud
  • Administration of web servers and supporting technologies, including network load balancers
  • Experience with the design, development, and deployment of Puppet
  • System and application error investigation, troubleshooting of access/availability issues including deep multi-system root cause analysis
  • Experience managing networking devices, such as switches and firewalls from a variety of vendors
  • Solid understanding of DevOps tools, processes, and culture
  • Ability to pick up new technologies quickly
  • Ability to provide accurate work scheduling and task estimations for work delivery


What do you get in return?
  • Love your career: Competitive total rewards package. Blog during work hours; take a day off and volunteer for your favorite charity.
  • Love your work/life balance: Work flexibly and remotely from your home, with no daily travel to an office! All you need is a stable internet connection.
  • Love your coworkers: Collaborate with some of the best and brightest in the industry!
  • Love your development: Hone your skills or learn new ones with our substantial training allowance; participate in professional development days, attend training, become certified, whatever you like! 
  • Love your workspace: We give you all the equipment you need to work from home including a laptop with your choice of OS, and an annual budget to personalize your work environment!  
  • Love yourself: Pythian cares about the health and well-being of our team. You will have an annual wellness budget to make yourself a priority (use it on gym memberships, massages, fitness and more). Additionally, you will receive a generous amount of paid vacation and sick days, as well as a day off to volunteer for your favorite charity.
Why Pythian

    Pythian excels at helping businesses use their data and cloud to transform how they compete and win in this ever-changing environment by delivering advanced on-prem, hybrid, cloud and multi-cloud solutions to solve the toughest data challenges faster and better than anyone else. Founded in 1997 and headquartered in Ottawa, Canada, Pythian now has more than 300 employees located around the globe and over 350 clients spanning industries including SaaS, media, gaming, financial services, e-commerce and more. Pythian is known for its technology-enabled data expertise covering everything from ETL to ML. We pride ourselves on our ability to deliver innovative solutions that meet the specific data goals of each client, and we have built meaningful partnerships with major cloud vendors AWS, Google and Microsoft. The powerful combination of our extensive expertise in data and cloud and our ability to keep on top of the latest bleeding-edge technologies makes us the perfect partner to help mid- and large-sized businesses transform to stay ahead in today’s rapidly changing digital economy.

    Disclaimer
    For this job, an equivalent combination of education and experience that results in a demonstrated ability to apply skills will also be considered.
    The successful applicant will need to fulfill the requirements necessary to obtain a background check.
    Accommodations are available upon request for candidates taking part in all aspects of the selection process.

    Required profile

    Experience

    Level of experience: Mid-level (2-5 years)
    Industry: Information Technology & Services
    Spoken language(s): English

    Other Skills

    • Collaboration
    • Task Planning
    • Problem Solving
