Site Reliability Engineer, SaaS

Work set-up: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • At least 3 years of experience in SaaS or cloud service operations.
  • Proficiency with Azure IaaS and PaaS solutions.
  • Experience with infrastructure automation, monitoring tools, and CI/CD practices.
  • Strong problem-solving skills and system programming knowledge in languages like Python, PowerShell, Bash, or Go.

Key responsibilities:

  • Design, implement, and maintain scalable cloud infrastructure solutions.
  • Automate deployment and maintenance of the SaaS platform to ensure reliability and security.
  • Develop monitoring and alerting systems for production environments.
  • Participate in incident response and uphold security and compliance standards.

Kasten by Veeam | #1 Kubernetes Backup Startup https://www.kasten.io/
11 - 50 Employees

Job description

Veeam, the #1 global market leader in data resilience, believes businesses should control all their data whenever and wherever they need it. Veeam provides data resilience through data backup, data recovery, data portability, data security, and data intelligence. Based in Seattle, Veeam protects over 550,000 customers worldwide who trust Veeam to keep their businesses running. Join us as we move forward together, growing, learning, and making a real impact for some of the world’s biggest brands. The future of data resilience is here - go fearlessly forward with us.

***Preferred working hours EST/CT***

We are looking for an experienced Site Reliability Engineer to join the Veeam Data Cloud (VDC) engineering team. You will work with a global team to build the world’s next modern data protection platform for Veeam. This is an excellent opportunity for someone with SaaS experience to work with a cutting-edge technology stack based on containers, serverless infrastructure, Golang, and public cloud services in the SaaS domain.

Your tasks will include: 
  • Design, implementation, and maintenance of scalable and reliable infrastructure solutions on Microsoft Azure and, in the future, additional cloud platforms
  • Automation of deployments and maintenance of a resilient, secure, and efficient SaaS application platform to meet established service levels
  • Upkeep and support of delivery and release pipelines 
  • Continuous evaluation and improvement of the reliability, performance, and scalability of our systems
  • Development of comprehensive monitoring and alerting solutions
  • Incident response for distributed applications in production environments, including mandatory participation in on-call rotations
  • Proactive adherence to information security and compliance standards, such as ISO (International Organization for Standardization), SOX (Sarbanes-Oxley), and SSAE (Statement on Standards for Attestation Engagements) 16
  • Stewardship of the definition, documentation, and improvement of our internal standards for style and maintainability
Technologies we work with:   
  • Microsoft TFS, Azure DevOps, Git, Bitbucket
  • Azure (Entra ID, API Management, Cosmos DB, Storage services, Azure Functions, static website hosting, Azure security, etc.)
  • IaC tools (Azure ARM templates, AWS CloudFormation, Terraform, the Serverless Framework, etc.) 
  • Observability (Azure Monitor, AppInsights, Elastic Stack) 
What we expect from you: 
  • 3+ years of experience in 24x7 production operations for a SaaS (Software as a Service) or cloud service provider 
  • Experience with implementation and maintenance of leading infrastructure and application monitoring tools (Azure Monitor, AppInsights, Elastic Cloud) 
  • Experience managing Azure IaaS (Infrastructure as a Service) and PaaS (Platform as a Service) solutions 
  • Strong problem-solving skills and the ability to troubleshoot complex issues in distributed, multi-tenant environments
  • Experience with container orchestration and management platforms 
  • System programming skills in Python, PowerShell, Bash, Go, etc.
  • Experience with implementation, maintenance, and support of CI/CD practices and tools (Azure DevOps or similar) 
  • Experience with distributed, event-based messaging architectures (Azure Event Hub, Azure Service Bus, Kafka, etc.)
  • English proficiency level sufficient to communicate with international teams  
The following will be an advantage:
  • Industry-recognized certifications in the relevant field (e.g. AZ-400, AWS Certified DevOps Engineer, DCA) 
  • Experience with migrating and adapting on-premises products to cloud infrastructure 
  • Experience with AWS (ECS, RDS, DynamoDB, VPCs, Step Functions, Lambda, IAM, EC2, S3, etc.)
  • Experience with C# and .NET  
We offer: 
  • Unlimited PTO
  • Medical, dental, and vision benefits that start on day one
  • Flexible spending accounts
  • Life insurance and short-term and long-term disability coverage
  • Family planning support benefits, along with 100% paid maternity and parental leave
  • 401k match
  • Veeam Care Days – additional 24 hours for your volunteering activities
  • Professional training and education, including courses and workshops, internal meetups, and unlimited access to our online learning platforms (Percipio, Athena, O’Reilly) and mentoring through our MentorLab program. 

Please note: If the applicant is permanently located outside of the United States, Veeam reserves the right to decline the application for the position. Remote work is only possible for employees located in the United States.

The salary range posted is On Target Earnings (OTE), which is inclusive of base and variable pay. When making an offer of employment, Veeam will take into consideration the candidate’s expectations, experience, education, scope of responsibility for the role, and the current market demands.

United States of America Pay Range
$136,500 - $195,000 USD

Veeam Software is an equal opportunity employer and does not tolerate discrimination in any form on the basis of race, color, religion, gender, age, national origin, citizenship, disability, veteran status or any other classification protected by federal, state or local law. All your information will be kept confidential.

Please note that any personal data collected from you during the recruitment process will be processed in accordance with our Recruiting Privacy Notice.  

The Privacy Notice sets out the basis on which the personal data collected from you, or that you provide to us, will be processed by us in connection with our recruitment processes. 

By applying for this position, you consent to the processing of your personal data in accordance with our Recruiting Privacy Notice.

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s): English

Other Skills

  • Problem Solving
