Infrastructure Architect (GCP)

Work set-up: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Bachelor's or Master's degree in Computer Science or related field.
  • 8–15 years of enterprise infrastructure architecture experience.
  • Proven expertise in designing and deploying AI/ML or GenAI infrastructure.
  • Deep knowledge of cloud platforms (preferably GCP), on-prem virtualization, storage, networking, and container orchestration.

Key responsibilities:

  • Design and implement hybrid infrastructure solutions across on-prem and cloud environments.
  • Manage provisioning, automation, and orchestration of virtual machines, containers, and Kubernetes clusters.
  • Collaborate with AI/ML engineers, data scientists, and DevOps teams to deploy and scale AI models and GenAI agents.
  • Optimize infrastructure performance and cost-efficiency for AI workloads.

Quantiphi | Information Technology & Services | 1001 - 5000 Employees | https://www.quantiphi.com/

Job description

While technology is the heart of our business, a global and diverse culture is the heart of our success. We love our people and take pride in fostering a culture built on transparency, diversity, integrity, learning and growth.


If an environment that encourages you to innovate and excel, not just professionally but personally, sounds interesting to you, you will enjoy your career with Quantiphi!

About Quantiphi:
Quantiphi is an award-winning Applied AI and Big Data software and services company, driven by a deep desire to solve transformational problems at the heart of businesses. Our signature approach combines groundbreaking machine learning research with disciplined cloud and data-engineering practices to create breakthrough impact at unprecedented speed.

Company Highlights:
Quantiphi has seen 2.5x year-over-year growth since its inception in 2013; we don’t just innovate, we lead. Headquartered in Boston, we have 4,000+ Quantiphi professionals across the globe. As an Elite/Premier Partner for Google Cloud, AWS, NVIDIA, Snowflake, and others, we’ve been recognized with:

  • 17x Google Cloud Partner of the Year awards in the last 8 years.
  • 3x AWS AI/ML award wins.
  • 3x NVIDIA Partner of the Year titles.
  • 2x Snowflake Partner of the Year awards.
  • We have also garnered top analyst recognitions from Gartner, ISG, and Everest Group.
  • We offer first-in-class industry solutions across Healthcare, Financial Services, Consumer Goods, Manufacturing, and more, powered by cutting-edge Generative AI and Agentic AI accelerators.
  • We have been certified as a Great Place to Work for the third year in a row: 2021, 2022, and 2023.

Be part of a trailblazing team that’s shaping the future of AI, ML, and cloud innovation. Your next big opportunity starts here!

For more details, visit our Website or LinkedIn Page.

Work Location: Dallas preferred, but anywhere in the US works.

Role Overview:

  • We are seeking a seasoned Infrastructure Architect with deep expertise in both cloud platforms and on-premises infrastructure to design, implement, and manage robust hybrid environments that can support high-compute AI and GenAI workloads.
  • You will work onsite with one of our key enterprise clients to assess existing infrastructure, define scalable architectures, and ensure optimal performance for AI/ML and GenAI solutions.
  • You’ll play a critical role in bridging infrastructure, DevOps, and AI solution delivery, ensuring our client has the right foundational stack to scale advanced AI workloads across their enterprise.

Key Responsibilities:

Hybrid Infrastructure Design & Deployment:

  • Architect and implement secure, scalable, and cost-effective infrastructure solutions across on-prem and cloud (GCP, AWS, Azure) environments.
  • Evaluate existing systems and define migration or integration strategies for deploying AI/GenAI workloads in hybrid setups.
  • Design infrastructure supporting GPU-intensive workloads, distributed training, inferencing, and vector database storage.

Cloud & On-Prem Operations:

  • Manage provisioning, automation, and orchestration across virtual machines, containers, and Kubernetes clusters.
  • Implement and monitor high-availability, low-latency, and disaster recovery strategies.
  • Optimize infrastructure for latency-sensitive applications, including real-time GenAI agentic workflows.

Collaboration & Enablement:

  • Work closely with AI/ML engineers, data scientists, solution architects, and DevOps to ensure smooth deployment and scaling of models and GenAI agents.
  • Recommend best practices on hybrid infrastructure for LLM fine-tuning, RAG architecture, and multi-agent orchestration platforms.
  • Guide teams on infrastructure security, IAM policies, and governance frameworks for GenAI applications.

Performance & Cost Optimization:

  • Continuously benchmark, profile, and optimize infrastructure for performance and efficiency.
  • Monitor resource utilization and propose capacity planning strategies for AI workload peaks.

Key Qualifications & Experience:

  • Bachelor’s or Master’s degree in Computer Science, Information Systems, or related field.
  • 8–15 years of experience in enterprise infrastructure architecture, with significant experience in both on-prem and cloud-native environments.
  • Proven track record in designing and deploying AI/ML or GenAI-supporting infrastructure (e.g., GPU clusters, Kubernetes for ML workloads, hybrid vector databases).
  • Deep knowledge of cloud services (GCP preferred; AWS or Azure acceptable), on-prem virtualization, storage, networking, and container orchestration.
  • Experience supporting multi-agent GenAI frameworks, including task orchestration, distributed agents, and workflow automation.
  • Hands-on experience in DevOps and IaC tools (Terraform, Helm, Ansible, CI/CD).
  • Familiarity with AI governance, data security, and compliance in hybrid environments.

Required Skills:

GCP Infrastructure Design & Deployment
Deep hands-on expertise in architecting and managing solutions on Google Cloud Platform, including:

  • VPC design, subnetting, firewall rules, Private Service Connect, and Cloud Interconnect for secure hybrid networking.
  • Identity & Access Management (IAM), Workload Identity Federation, and service accounts for secure access control across services.
  • Cloud Load Balancing, Cloud NAT, and Cloud Armor for high-availability, secure ingress/egress management.
  • Resource hierarchy and organization policies to manage large-scale enterprise GCP environments.

AI/GenAI-Centric Compute & Storage Architecture
Strong understanding of compute services tailored to GenAI:

  • Compute Engine for custom VM/GPU provisioning (A100/H100, T4); a brief provisioning sketch follows this list.
  • GKE (Google Kubernetes Engine) for containerized model deployments, including support for GPU workloads and node auto-provisioning.
  • Vertex AI and Vertex AI Workbench for managing ML pipelines, training, model registry, and deployments.
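
As a rough, self-contained illustration of the provisioning work referenced above, the sketch below uses the google-cloud-compute Python client to request a single T4-backed VM. The project, zone, image, and machine-type values are placeholders chosen for this example, and a real environment would more likely express this through Terraform or another IaC layer.

```python
from google.cloud import compute_v1

def create_gpu_vm(project: str, zone: str, name: str) -> None:
    """Provision a single T4-backed VM (all values illustrative)."""
    instance = compute_v1.Instance(
        name=name,
        machine_type=f"zones/{zone}/machineTypes/n1-standard-8",
        disks=[
            compute_v1.AttachedDisk(
                boot=True,
                auto_delete=True,
                initialize_params=compute_v1.AttachedDiskInitializeParams(
                    source_image="projects/debian-cloud/global/images/family/debian-12",
                    disk_size_gb=100,
                ),
            )
        ],
        network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
        guest_accelerators=[
            compute_v1.AcceleratorConfig(
                accelerator_type=f"zones/{zone}/acceleratorTypes/nvidia-tesla-t4",
                accelerator_count=1,
            )
        ],
        # GPU VMs cannot live-migrate, so host maintenance must terminate the instance.
        scheduling=compute_v1.Scheduling(on_host_maintenance="TERMINATE", automatic_restart=True),
    )
    operation = compute_v1.InstancesClient().insert(
        project=project, zone=zone, instance_resource=instance
    )
    operation.result()  # block until the create operation finishes

if __name__ == "__main__":
    create_gpu_vm("my-project", "us-central1-a", "genai-dev-gpu-1")  # placeholder identifiers
```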

Storage architecture experience with:

  • Cloud Storage (standard, nearline, coldline) for unstructured datasets.
  • Filestore, Local SSDs, and Persistent Disks for high-throughput model training and inferencing.
  • Integration with BigQuery and Spanner for structured data workloads supporting GenAI applications.

Containerization, Orchestration & IaC on GCP:

  • Advanced experience with GKE: cluster autoscaling, workload identity, taints/tolerations for GPU scheduling, Helm-based deployments, and integration with Artifact Registry (a minimal GPU-scheduling sketch follows this list).
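
To make the GPU-scheduling point concrete, here is a minimal sketch (using the official Kubernetes Python client) of a pod that requests one GPU and tolerates the taint GKE applies to GPU node pools. The image path and names are placeholders; in practice the same spec would usually ship as part of a Helm chart.

```python
from kubernetes import client, config

def gpu_inference_pod(name: str = "llm-inference-0") -> client.V1Pod:
    """Pod spec requesting one NVIDIA GPU and tolerating GKE's GPU node taint."""
    container = client.V1Container(
        name="inference",
        image="us-docker.pkg.dev/my-project/models/llm-server:latest",  # placeholder image
        resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "1"}),
    )
    spec = client.V1PodSpec(
        containers=[container],
        # GKE taints GPU node pools with nvidia.com/gpu:NoSchedule by default.
        tolerations=[
            client.V1Toleration(key="nvidia.com/gpu", operator="Exists", effect="NoSchedule")
        ],
        # Pin to a specific accelerator type via GKE's node label.
        node_selector={"cloud.google.com/gke-accelerator": "nvidia-tesla-t4"},
    )
    return client.V1Pod(
        api_version="v1",
        kind="Pod",
        metadata=client.V1ObjectMeta(name=name, labels={"app": "llm-inference"}),
        spec=spec,
    )

if __name__ == "__main__":
    config.load_kube_config()  # assumes kubeconfig already points at the GKE cluster
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=gpu_inference_pod())
```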

Proficient in Infrastructure as Code using:

  • Terraform (with GCP provider modules) for declarative infrastructure deployment; an illustrative Python-based IaC sketch follows this list.
  • Cloud Build, Cloud Deploy, or integration with GitHub Actions for CI/CD pipelines.
  • Ability to automate infrastructure provisioning, policy enforcement, and environment standardization.
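
The Terraform requirement itself is HCL-based; purely as an illustration, and keeping to Python like the other sketches here, the snippet below expresses the same declarative idea with Pulumi's GCP provider, standing up a custom-mode VPC and subnet. Resource names, the CIDR range, and the region are placeholder values.

```python
import pulumi
import pulumi_gcp as gcp

# Custom-mode VPC and a regional subnet for AI workloads (names and CIDR are illustrative).
network = gcp.compute.Network("ai-vpc", auto_create_subnetworks=False)

subnet = gcp.compute.Subnetwork(
    "ai-training-subnet",
    network=network.id,
    ip_cidr_range="10.10.0.0/24",
    region="us-central1",
    private_ip_google_access=True,  # lets VMs without external IPs reach Google APIs
)

pulumi.export("subnet_self_link", subnet.self_link)
```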

Support for GenAI Architectures:

Experience deploying and optimizing infrastructure for:

  • LLM hosting using Triton Inference Server, vLLM, or Text Generation Inference on GKE or Compute Engine.
  • Vector database integrations (Weaviate, ChromaDB, FAISS) with GCS and BigQuery.
  • RAG pipeline infrastructure, including document ingestion (e.g., via Pub/Sub, Cloud Functions) and scalable retrieval (a minimal retrieval sketch follows this list).
  • Multi-agent frameworks like LangGraph, CrewAI, or AutoGen, with secure multi-service orchestration across GCP services.
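
As a small, self-contained illustration of the retrieval step in a RAG pipeline, the sketch below builds an in-memory FAISS index over placeholder embeddings and runs a top-k similarity query. A real deployment would generate embeddings with an actual model, persist or shard the index (or use a managed vector store such as Weaviate), and pass the retrieved chunks to the LLM, all of which this sketch deliberately omits.

```python
import numpy as np
import faiss

# Placeholder corpus embeddings: 1,000 documents, 384-dimensional float32 vectors.
dim = 384
rng = np.random.default_rng(0)
doc_embeddings = rng.random((1000, dim), dtype=np.float32)

# Exact (flat) inner-product index; large deployments typically use IVF/HNSW variants.
faiss.normalize_L2(doc_embeddings)   # normalize so inner product approximates cosine similarity
index = faiss.IndexFlatIP(dim)
index.add(doc_embeddings)

# Embed the user query with the same (placeholder) pipeline and retrieve the top-5 chunks.
query = rng.random((1, dim), dtype=np.float32)
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)
print("retrieved doc ids:", ids[0], "scores:", scores[0])
```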

Observability, Security, and Governance
Monitoring & observability stack:

  • Cloud Monitoring, Cloud Logging, Cloud Trace, Profiler, and Error Reporting for full-stack visibility.
  • Experience setting up custom dashboards, alerts, and uptime checks; a minimal custom-metric sketch follows below.
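
Beyond the built-in telemetry, AI infrastructure often needs workload-specific signals such as inference queue depth. Below is a minimal sketch that writes one data point of a hypothetical custom metric with the google-cloud-monitoring Python client; the metric name and project ID are placeholders.

```python
import time
from google.cloud import monitoring_v3

def write_queue_depth(project_id: str, depth: float) -> None:
    """Write a single data point for a hypothetical inference queue-depth metric."""
    client = monitoring_v3.MetricServiceClient()
    series = monitoring_v3.TimeSeries()
    series.metric.type = "custom.googleapis.com/genai/inference_queue_depth"  # hypothetical name
    series.resource.type = "global"
    now = time.time()
    interval = monitoring_v3.TimeInterval(
        {"end_time": {"seconds": int(now), "nanos": int((now % 1) * 1e9)}}
    )
    point = monitoring_v3.Point({"interval": interval, "value": {"double_value": depth}})
    series.points = [point]
    client.create_time_series(name=f"projects/{project_id}", time_series=[series])

if __name__ == "__main__":
    write_queue_depth("my-project", 17.0)  # placeholder project and value
```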

Security and compliance capabilities:

  • VPC Service Controls, Shielded VMs, Confidential Computing, and data encryption strategies (at rest and in transit).
  • Experience with cloud security posture management (CSPM) and compliance frameworks (e.g., HIPAA, SOC 2, FedRAMP).

Governance:

  • Experience setting up Organization Policies, Folder Hierarchies, and Cloud Asset Inventory for enterprise governance.

Cost Optimization & Resource Efficiency:
Proven ability to:

  • Monitor and optimize spend using Billing Reports, Cost Table Reports, Budgets, and Recommendations Hub.
  • Implement rightsizing recommendations, sustained use discounts, and committed use discounts (CUDs) for GPU workloads.
  • Design cost-aware architecture balancing performance, latency, and throughput for GenAI use cases.

Soft Skills & Personality Traits:

  • Strong problem-solving and debugging skills.
  • Ability to communicate technical concepts clearly to non-technical stakeholders.
  • Collaborative mindset with ability to work cross-functionally across AI, DevOps, and business teams.
  • Detail-oriented, with a focus on reliability, scalability, and security.

Preferred:

  • GCP Professional Cloud Architect, AWS Solutions Architect, or similar certifications.
  • Familiarity with GPUs (NVIDIA A100, H100), inference acceleration, and edge deployments.
  • Familiarity with AI/ML governance, compliance, and ethical AI frameworks.

What is in it for you:

  • Be part of a team and company that has won NVIDIA's AI Services Partner of the Year three times in a row with an unparalleled track record of building production AI applications on DGX and Cloud GPUs.
  • Strong peer learning that will accelerate your learning curve across Applied AI, GPU computing, and softer skills such as technical communication.
  • Exposure to working with highly experienced AI leaders at Fortune 500 companies and innovative market disruptors looking to transform their business with Generative AI.
  • Access to state-of-the-art GPU infrastructure in the cloud and on-premises.
  • Be part of the fastest-growing AI-first digital transformation and engineering company in the world.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Information Technology & Services
Spoken language(s): English

Other Skills

  • Detail Oriented
  • Collaboration
  • Communication
  • Problem Solving
