Principal Platform Engineer

Key Facts

Remote From: 
Full time
Expert & Leadership (>10 years)
English

Other Skills

  • Delivery Focused
  • Communication
  • Time Management
  • Problem Solving

Requirements

  • 8-10+ years in DevOps/Platform Engineering, including at least 2 years operating and maintaining production ML workloads.
  • Deep, hands-on experience with Google Cloud Platform (GCP), including VPC-SC, IAM, Organization Policies, and GKE (cluster topology, Helm, Kustomize, ArgoCD).
  • High proficiency with the Istio service mesh and API gateways (Kong).
  • Expert-level Terraform skills with Atlantis/GitOps workflows across a large, multi-hundred-file infrastructure estate.

Roles & Responsibilities

  • Design, deploy, and maintain elastically scaling cloud infrastructure on GCP, containerized with Kubernetes, for high-performance ML workloads.
  • Build automated CI/CD pipelines for training, testing, and deploying ML models using Jenkins, GitHub Actions, or Airflow.
  • Implement model and system observability to monitor drift, accuracy, latency, and performance in production.
  • Foster collaboration across data engineers, ML engineers, and backend and frontend teams, and empower each team to monitor its own workloads; participate in the on-call rotation and help manage posture for compliance with standards such as SOC.

Job description

Point Wild helps customers monitor, manage, and protect against the risks associated with their identities and personal information in a digital world. Backed by WndrCo, Warburg Pincus and General Catalyst, Point Wild is dedicated to creating the world’s most comprehensive portfolio of industry-leading cybersecurity solutions. Our vision is to become THE go-to resource for every cyber protection need individuals may face - today and in the future. 

Join us for the ride!

We’re looking for a Principal Platform Engineer to architect and lead the infrastructure strategy for our next-generation Production ML platform on Google Cloud. In this role, you will be the backbone of our high-performance machine learning workloads, ensuring our systems are elastic, secure, and resilient. You won’t just maintain the status quo; you’ll build the "paved road" for our engineers, automating everything from model deployment to complex networking perimeters. We are a high-trust, outcome-focused team that moves quickly to solve some of the most challenging problems in the ML space.

Core Responsibilities:

  • Infrastructure Management: Design, deploy, and maintain elastically scaling cloud infrastructure on GCP, along with containerization tools such as Kubernetes, for high-performance ML workloads.
  • CI/CD Pipeline Development & Maintenance: Build automated pipelines for training, testing, and deploying machine learning models using tools like Jenkins, GitHub Actions, or Airflow.
  • Model Monitoring & Maintenance: Implement observability tools to track model drift, accuracy, latency, and performance degradation in production.
  • Collaboration: Bridge the gap between data engineers, ML engineers, and backend and frontend engineers to ensure smooth production operation.
  • ML Observability: Implement comprehensive monitoring for system health (latency/uptime) alongside ML-specific metrics, such as feature drift, prediction accuracy, and data distribution shifts, to ensure long-term model reliability. Also monitor non-ML workloads and production metrics.
  • Deploy tools that empower individual teams to monitor their workloads.
  • Participate in the on-call rotation and help manage posture to ensure compliance with standards such as SOC.

What you bring to the table:

  • Senior Expertise: 8-10+ years in DevOps/Platform Engineering, with at least 2 years of experience specifically operating and maintaining production ML workloads.
  • GCP & K8s Mastery: Deep, hands-on experience with GCP (VPC-SC, IAM, Organization Policies) and GKE (cluster topology, Helm, Kustomize, and in-cluster operators like ArgoCD).
  • Service Mesh Excellence: High proficiency with Istio (VirtualServices, mTLS, sidecar injection) and API Gateways (specifically Kong).
  • Infrastructure as Code: Expert-level Terraform skills, specifically using an Atlantis/GitOps workflow across a massive, multi-hundred-file estate.
  • Secrets & Identity: Experience managing enterprise-grade identity and secrets (Auth0, Dex, ESO, or SOPS).
  • Data/ML Tooling: Experience operating Airflow in production and an ML-serving stack (e.g., Triton, vLLM, MLflow).
  • Database Management: Comfortable managing Cloud SQL (PostgreSQL), BigQuery, and in-cluster datastores like Elasticsearch or ClickHouse.
  • At least an upper-intermediate level of spoken and written English.

It would be great if you also had:

  • ML Observability: Past experience with continuous monitoring of model accuracy and detecting data/concept drift.
  • Automation Savvy: Experience with Ansible for cluster bootstrap and recovery.
  • Advanced Certifications: Kubernetes (CKA/CKS) or GCP Professional Cloud Architect/Security Engineer certifications.
  • Modern Stack Exposure: Familiarity with Loki, Grafana, or managing ClickHouse at scale.

 

As part of Point Wild, you will:

Solve real customer problems. Point Wild’s point solutions allow consumers to address their immediate cyber protection needs. Our mandate is to continuously anticipate our customers’ evolving digital security needs to create best-in-class solutions aimed at keeping them safe.

See your impact. We are a scrappy, nimble organization where individual contributions are needed and valued. You will see your impact every day.

Accelerate your career. As we expand, you will have the opportunity to learn new technologies, products, and markets in a fast-paced, growth-oriented environment.

Most importantly, you’ll get to work with other talented people at a company where people matter. If you want to put your fingerprint on an organization and leapfrog your growth, this is the place for you.

In keeping with our beliefs and goals, no employee or applicant will face discrimination or harassment based on race, color, ancestry, national origin, religion, age, gender, marital or domestic partner status, sexual orientation, gender identity, disability status, or veteran status. Above and beyond discrimination or harassment based on “protected categories,” Point Wild is committed to being an inclusive community where all feel welcome. Whether blatant or hidden, barriers to success have no place at Point Wild.

Important privacy information for United States based job applicants can be found here.

 
