Principal Kubernetes Platform Engineer

Requirements:

  • 7+ years in infrastructure/platform engineering, including 4+ years of hands-on Kubernetes administration in production; TS/SCI clearance required.
  • Proven ability to operate Kubernetes across multiple clouds (AKS, EKS, and/or GKE), with Azure Government experience strongly preferred.
  • Deep knowledge of Kubernetes internals and subsystems (scheduling, CNI networking, DNS, ingress, CSI storage, RBAC, admission control, upgrades) and a strong security hardening background (policy controls, least privilege, network policies, secrets, supply chain security).
  • Proficiency with Infrastructure-as-Code and automation (Terraform, Ansible) plus scripting (Bash, Python, Go); experience with observability tooling, incident response, and producing audit-ready evidence for FedRAMP/DoD environments; relevant certifications (CKA/CKS, Azure/AWS Architect, Security+, CISSP) preferred.

Roles & Responsibilities

  • Own day-to-day and strategic administration of Kubernetes clusters across AKS/EKS/GKE (and Azure Government enclaves) to ensure high availability and upgrade-safe operations.
  • Design, build, secure, and operate multi-cloud Kubernetes platform architectures with standards for namespaces, RBAC, Pod Security, admission control, network segmentation, and workload isolation.
  • Implement and maintain end-to-end platform security controls (image provenance, vulnerability management, runtime protection, secrets management, certificate lifecycle) and mature GitOps/CI/CD patterns (Flux/Argo) with strong auditability.
  • Define observability for clusters and workloads (logging, metrics, traces, alerting, SLOs/SLIs), drive root-cause analysis, coordinate incident response, manage backups/redundancy for etcd and storage, maintain FedRAMP/DoD compliance, and provide 2nd/3rd line support as needed.

Job description

Overview:

Ask Sage (a BigBear.ai company) is the leading Generative AI platform that augments the velocity of government and commercial teams with dozens of use cases, from coding to cybersecurity to acquisition to data analysis and much more. Our cutting-edge technology, accredited at FedRAMP High and DoD IL5, enables teams to focus on strategic initiatives while we take care of the heavy lifting.

We are seeking a highly skilled and experienced Principal Kubernetes Platform Engineer. This critical role involves privileged access to our cloud instances, Kubernetes clusters, and supporting platform services, including environments operating under FedRAMP High and Department of Defense requirements. The Principal Kubernetes Platform Engineer (Multi-Cloud) will be accountable for the reliability, security, scalability, and operational excellence of our Kubernetes estate across Azure Government (AKS), AWS (EKS), and Google Cloud (GKE) as needed.

You will serve as the organization’s technical authority for Kubernetes administration and platform engineering, setting standards for cluster architecture, network policy, identity and access, workload isolation, secrets management, observability, release engineering, and incident response. The ideal candidate combines deep hands-on Kubernetes expertise with disciplined operational execution, strong security instincts, and the ability to automate everything (GitOps/Infrastructure-as-Code) while partnering effectively with security, engineering, and leadership. As a key member of our team, you will improve platform uptime, reduce operational toil, accelerate delivery, and ensure our container platform remains compliant and defensible under audit.

What you will do:

Key Responsibilities

  • Own day-to-day and strategic administration of Kubernetes clusters across multiple cloud environments (AKS/EKS/GKE), including Azure Government enclaves where applicable. 
  • Design, build, secure, and operate highly available Kubernetes platform architectures (multi-zone, upgrade-safe, disaster recovery-ready). 
  • Establish and enforce cluster standards: namespaces/tenancy, RBAC, Pod Security standards, admission control, network segmentation, and workload isolation. 
  • Implement and maintain end-to-end platform security controls: image provenance, vulnerability management, runtime protection, secrets management, and certificate lifecycle. 
  • Build and mature GitOps/CI/CD patterns for Kubernetes (e.g., Flux/Argo), ensuring reliable, repeatable deployments with strong auditability. 
  • Manage Kubernetes lifecycle operations: version upgrades, node pool strategy, capacity planning, add-on management, and cluster hardening. 
  • Define and operate observability for clusters and workloads: logging, metrics, traces, alerting, SLOs/SLIs, and actionable runbooks. 
  • Proactively ensure the highest levels of platform availability and performance; lead root-cause analysis and drive permanent corrective actions. 
  • Maintain security, backup, and redundancy strategies for etcd (where applicable), persistent storage, cluster state, and critical platform services. 
  • Secure and maintain the stack by remediating cybersecurity vulnerabilities (CVEs), misconfigurations, and supply-chain risks; coordinate remediation timelines with stakeholders.
  • Provide 2nd and 3rd level support for Kubernetes and containerized workloads, including incident response participation and on-call support as required. 
  • Partner with application teams to set best practices for containerization, resource requests/limits, health probes, service discovery, ingress, and release safety. 
  • Develop and maintain automation to reduce manual intervention (IaC, policy-as-code, auto-remediation, self-service workflows, and automated compliance evidence). 
  • Liaise with cloud vendors and internal stakeholders for platform problem resolution and architectural guidance. 
  • Maintain our environment to comply with FedRAMP High requirements and support regular reporting and audit evidence collection. 
  • Uphold and enforce Ask Sage’s compliance, privacy, and security policies, ensuring adherence to all relevant regulations and standards. 
  • Conduct regular audits of Kubernetes configurations and platform controls; recommend and implement enhancements aligned to benchmarks and risk posture.
What you need to have:
  • Minimum of 7 years of experience in infrastructure/platform engineering, including at least 4 years of deep, hands-on Kubernetes administration in production. 
  • Clearance: TS/SCI required
  • Demonstrated expertise operating Kubernetes across multiple cloud providers (AKS + EKS and/or GKE). 
  • Strong knowledge of Kubernetes internals and critical subsystems: scheduling, networking (CNI), DNS, ingress, storage (CSI), RBAC, admission control, and upgrades. 
  • Strong security background in container and Kubernetes hardening (e.g., policy controls, least privilege, network policies, secrets handling, supply chain security). 
  • Proficiency with Infrastructure-as-Code and automation (e.g., Terraform, Ansible) and scripting (e.g., Bash, Python, Go). 
  • Experience with observability tooling and operational maturity (monitoring/alerting, incident response, SLOs). 
  • Familiarity with compliance-driven environments and producing audit-ready evidence (FedRAMP/DoD environments a plus). 
  • Relevant certifications preferred (one or more): CKA/CKS, Azure Solutions Architect, AWS Solutions Architect, Security+, CISSP. 
What we'd like you to have:
  • Experience operating Kubernetes in Azure Government strongly preferred.
About BigBear.ai:

BigBear.ai is a leading provider of AI-powered decision intelligence solutions for national security, supply chain management, and digital identity. Customers and partners rely on BigBear.ai’s predictive analytics capabilities in highly complex, distributed, mission-based operating environments. Headquartered in McLean, Virginia, BigBear.ai is a public company traded on the NYSE under the symbol BBAI. For more information, visit https://bigbear.ai/ and follow BigBear.ai on LinkedIn: @BigBear.ai and X: @BigBearai.

 

BigBear.ai is an equal opportunity employer for all protected groups, including protected veterans and individuals with disabilities.

 
