While technology is the heart of our business, a global and diverse culture is the heart of our success. We love our people and take pride in offering them a culture built on transparency, diversity, integrity, learning, and growth.

If working in an environment that encourages you to innovate and excel, not just in your professional life but in your personal life too, interests you, you would enjoy your career with Quantiphi!

About Quantiphi:
Quantiphi is an award-winning Applied AI and Big Data software and services company, driven by a deep desire to solve transformational problems at the heart of businesses. Our signature approach combines groundbreaking machine learning research with disciplined cloud and data-engineering practices to create breakthrough impact at unprecedented speed.

Company Highlights:
Quantiphi has seen 2.5x growth YoY since its inception in 2013. We don’t just innovate, we lead. We are headquartered in Boston, with 4,000+ Quantiphi professionals across the globe. As an Elite/Premier Partner for Google Cloud, AWS, NVIDIA, Snowflake, and others, we’ve been recognized with:

17x Google Cloud Partner of the Year awards in the last 8 years.
3x AWS AI/ML award wins.
3x NVIDIA Partner of the Year titles.
2x Snowflake Partner of the Year awards.
We have also garnered top analyst recognitions from Gartner, ISG, and Everest Group.
We offer first-in-class industry solutions across Healthcare, Financial Services, Consumer Goods, Manufacturing, and more, powered by cutting-edge Generative AI and Agentic AI accelerators.
We have been certified as a Great Place to Work for the third year in a row: 2021, 2022, and 2023.

Be part of a trailblazing team that’s shaping the future of AI, ML, and cloud innovation. Your next big opportunity starts here!

For more details, visit our website or LinkedIn page.

Work Location: Dallas (preferred), but anywhere in the US works.

Role Overview:

We are seeking a seasoned Infrastructure Architect with deep expertise in both cloud platforms and on-premise infrastructure to design, implement, and manage robust hybrid environments that can support high-compute AI and GenAI workloads.
You will work onsite with one of our key enterprise clients to assess existing infrastructure, define scalable architectures, and ensure optimal performance for AI/ML and GenAI solutions.
You’ll play a critical role in bridging infrastructure, DevOps, and AI solution delivery, ensuring our client has the right foundational stack to scale advanced AI workloads across their enterprise.

Key Responsibilities:

Hybrid Infrastructure Design & Deployment:

Architect and implement secure, scalable, and cost-effective infrastructure solutions across on-prem and cloud (GCP, AWS, Azure) environments.
Evaluate existing systems and define migration or integration strategies for deploying AI/GenAI workloads in hybrid setups.
Design infrastructure supporting GPU-intensive workloads, distributed training, inferencing, and vector database storage.

Cloud & On-Prem Operations:

Manage provisioning, automation, and orchestration across virtual machines, containers, and Kubernetes clusters.
Implement and monitor high-availability, low-latency, and disaster recovery strategies.
Optimize infrastructure for latency-sensitive applications, including real-time GenAI agentic workflows.

Collaboration & Enablement:

Work closely with AI/ML engineers, data scientists, solution architects, and DevOps to ensure smooth deployment and scaling of models and GenAI agents.
Recommend best practices on hybrid infrastructure for LLM fine-tuning, RAG architecture, and multi-agent orchestration platforms.
Guide teams on infrastructure security, IAM policies, and governance frameworks for GenAI applications.

Performance & Cost Optimization:

Continuously benchmark, profile, and optimize infrastructure for performance and efficiency.
Monitor resource utilization and propose capacity planning strategies for AI workload peaks.

Key Qualifications & Experience:

Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field.
8–15 years of experience in enterprise infrastructure architecture, with significant experience in both on-prem and cloud-native environments.
Proven track record in designing and deploying AI/ML or GenAI-supporting infrastructure (e.g., GPU clusters, Kubernetes for ML workloads, hybrid vector databases).
Deep knowledge of cloud services (GCP preferred; AWS or Azure acceptable), on-prem virtualization, storage, networking, and container orchestration.
Experience supporting multi-agentic GenAI frameworks, including task orchestration, distributed agents, and workflow automation.
Hands-on experience in DevOps and IaC tools (Terraform, Helm, Ansible, CI/CD).
Familiarity with AI governance, data security, and compliance in hybrid environments.

Required Skills:

GCP Infrastructure Design & Deployment
Deep hands-on expertise in architecting and managing solutions on Google Cloud Platform, including:

VPC design, subnetting, firewall rules, Private Service Connect, and Cloud Interconnect for secure hybrid networking.
Identity & Access Management (IAM), Workload Identity Federation, and service accounts for secure access control across services.
Cloud Load Balancing, Cloud NAT, and Cloud Armor for high-availability, secure ingress/egress management.
Resource hierarchy and organization policies to manage large-scale enterprise GCP environments.

AI/GenAI-Centric Compute & Storage Architecture
Strong understanding of compute services tailored to GenAI:

Compute Engine for custom VM/GPU provisioning (A100/H100, T4).
GKE (Google Kubernetes Engine) for containerized model deployments, including support for GPU workloads and node auto-provisioning.
Vertex AI and Vertex AI Workbench for managing ML pipelines, training, model registry, and deployments.

Storage architecture experience with:

Cloud Storage (Standard, Nearline, Coldline) for unstructured datasets.
Filestore, Local SSDs, and Persistent Disks for high-throughput model training and inferencing.
Integration with BigQuery and Spanner for structured data workloads supporting GenAI applications.

Containerization, Orchestration & IaC on GCP:

Advanced experience with GKE:
Cluster autoscaling, workload identity, taints/tolerations for GPU scheduling.
Helm-based deployments and integration with Artifact Registry.

Proficient in Infrastructure as Code using:

Terraform (with GCP provider modules) for declarative infrastructure deployment.
Cloud Build, Cloud Deploy, or integration with GitHub Actions for CI/CD pipelines.
Ability to automate infrastructure provisioning, policy enforcement, and environment standardization.

Support for GenAI Architectures:

Experience deploying and optimizing infrastructure for:
LLM hosting using Triton Inference Server, vLLM, or Text Generation Inference on GKE or Compute Engine.
Vector database integrations (Weaviate, ChromaDB, FAISS) with GCS and BigQuery.
RAG pipeline infrastructure including document ingestion (e.g., via Pub/Sub, Cloud Functions) and scalable retrieval.
Multi-agent frameworks like LangGraph, CrewAI, or AutoGen, with secure multi-service orchestration across GCP services.

Observability, Security, and Governance
Monitoring & observability stack:

Cloud Monitoring, Cloud Logging, Cloud Trace, Profiler, and Error Reporting for full-stack visibility.
Experience setting up custom dashboards, alerts, and uptime checks.

Security and compliance capabilities:

VPC Service Controls, Shielded VMs, Confidential Computing, and data encryption strategies (at rest and in transit).
Experience with cloud security posture management (CSPM) and compliance frameworks (e.g., HIPAA, SOC 2, FedRAMP).

Governance:

Experience setting up Organization Policies, Folder Hierarchies, and Cloud Asset Inventory for enterprise governance.

Cost Optimization & Resource Efficiency:
Proven ability to:

Monitor and optimize spend using Billing Reports, Cost Table Reports, Budgets, and Recommendations Hub.
Implement rightsizing recommendations, sustained use discounts, and committed use discounts (CUDs) for GPU workloads.
Design cost-aware architecture balancing performance, latency, and throughput for GenAI use cases.

Soft Skills & Personality Traits:

Strong problem-solving and debugging skills.
Ability to communicate technical concepts clearly to non-technical stakeholders.
Collaborative mindset with ability to work cross-functionally across AI, DevOps, and business teams.
Detail-oriented, with a focus on reliability, scalability, and security.

Preferred:

GCP Professional Cloud Architect, AWS Solutions Architect, or similar certifications.
Familiarity with GPUs (NVIDIA A100, H100), inference acceleration, and edge deployments.
Familiarity with AI/ML governance, compliance, and ethical AI frameworks.

What is in it for you:

Be part of a team and company that has won NVIDIA's AI Services Partner of the Year three times in a row, with an unparalleled track record of building production AI applications on DGX and Cloud GPUs.
Strong peer learning that will accelerate your learning curve across Applied AI, GPU Computing, and softer skills such as technical communication.
Exposure to working with highly experienced AI leaders at Fortune 500 companies and innovative market disruptors looking to transform their business with Generative AI.
Access to state-of-the-art GPU infrastructure on the cloud and on-premise.
Be part of the fastest-growing AI-first digital transformation and engineering company in the world.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!

Infrastructure Architect (GCP)
