Site Reliability Engineer (Automation & Virtualization)

Requirements

  • 5+ years in SRE, DevOps, or Platform Engineering roles
  • Strong scripting in PowerCLI, Python, or Go
  • Experience with VMware ESXi, vCenter, NSX, and UCS Manager
  • Proficiency in Terraform, Ansible, and CI/CD pipeline tools

Roles & Responsibilities

  • Hypervisor infrastructure management: Deploy, configure, and patch ESXi hosts using VMware Update Manager, iDRAC, and UCS Central; validate host readiness and enforce consistency across environments
  • Automation and IaC: Build and maintain automation pipelines using PowerCLI, Python, Terraform, and Ansible; develop Infrastructure-as-Code templates for scalable provisioning
  • NSX network integration: Administer NSX-T/V for logical switching, routing, and micro-segmentation; troubleshoot endpoint tagging and network performance issues between NSX and ESXi
  • Monitoring and observability: Implement observability stacks using Prometheus, Grafana, Splunk, and Dynatrace; define and track SLOs, SLIs, and error budgets

Job description

Our Purpose

Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we’re helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.

Title and Summary

Site Reliability Engineer (Automation & Virtualization)

About the Role
We’re looking for a passionate and skilled Site Reliability Engineer (SRE) to join our Platform Engineering team. This role is pivotal in automating and managing VMware ESXi hypervisors across Dell and Cisco UCS platforms, ensuring high reliability, scalability, and performance of our infrastructure.

You’ll work at the intersection of infrastructure and software, driving automation, observability, and operational excellence across our virtualization stack.

---

Key Responsibilities

Hypervisor & Infrastructure Management
- Deploy, configure, and patch ESXi hosts using tools like VMware Update Manager, iDRAC, and UCS Central.
- Validate host readiness and enforce consistency across environments.

Automation & Infrastructure as Code
- Build and maintain automation pipelines using PowerCLI, Python, Terraform, and Ansible.
- Develop Infrastructure-as-Code (IaC) templates for scalable provisioning.

NSX & Network Integration
- Administer NSX-T/V for logical switching, routing, and micro-segmentation.
- Troubleshoot endpoint tagging and network performance issues between NSX and ESXi.

Monitoring & Observability
- Implement observability stacks using Prometheus, Grafana, Splunk, and Dynatrace.
- Define and track SLOs, SLIs, and error budgets.

Security & Compliance

Planning & Optimization
- Lead modernization efforts including UCS blade decommissioning and Dell R760 upgrades.
- Optimize cluster and VM sizing for performance and cost efficiency.

Collaboration & Stakeholder Engagement
- Partner with application, storage, and network teams to align infrastructure with workload needs.
- Communicate upgrade plans and maintenance schedules across teams.

Documentation & Knowledge Sharing
- Maintain build guides, validation checklists, and operational runbooks.
- Contribute to internal wikis and onboarding materials.


Required Skills
- 5+ years in SRE, DevOps, or Platform Engineering roles.
- Strong scripting in PowerCLI, Python, or Go.
- Experience with VMware ESXi, vCenter, NSX, and UCS Manager.
- Proficiency in Terraform, Ansible, and CI/CD pipeline tools.
- Familiarity with observability platforms and incident response workflows.


Preferred Qualifications
- Experience with REST API integration for ESXi and vCenter.
- Knowledge of GitOps, AIOps, and chaos engineering practices.
- Certifications: VMware VCP, CKA/CKAD, or equivalent.

Corporate Security Responsibility


All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization. Every person working for, or on behalf of, Mastercard is therefore responsible for information security and must:

  • Abide by Mastercard’s security policies and practices;

  • Ensure the confidentiality and integrity of the information being accessed;

  • Report any suspected information security violation or breach; and

  • Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines.



