While technology is the heart of our business, a global and diverse culture is the heart of our success. We love our people and take pride in offering them a culture built on transparency, diversity, integrity, learning and growth.
If working in an environment that encourages you to innovate and excel, not just in your professional life but in your personal life too, interests you, you would enjoy your career with Quantiphi!
About Quantiphi:
Quantiphi is an award-winning Applied AI and Big Data software and services company, driven by a deep desire to solve transformational problems at the heart of businesses. Our signature approach combines groundbreaking machine learning research with disciplined cloud and data-engineering practices to create breakthrough impact at unprecedented speed.
Company Highlights:
Quantiphi has seen 2.5x growth YoY since its inception in 2013. We don’t just innovate; we lead. We are headquartered in Boston, with 4,000+ Quantiphi professionals across the globe. As an Elite/Premier Partner for Google Cloud, AWS, NVIDIA, Snowflake, and others, we’ve been recognized with:
- 17x Google Cloud Partner of the Year awards in the last 8 years.
- 3x AWS AI/ML award wins.
- 3x NVIDIA Partner of the Year titles.
- 2x Snowflake Partner of the Year awards.
- We have also garnered top analyst recognitions from Gartner, ISG, and Everest Group.
- We offer first-in-class industry solutions across Healthcare, Financial Services, Consumer Goods, Manufacturing, and more, powered by cutting-edge Generative AI and Agentic AI accelerators.
- We have been certified as a Great Place to Work for the third year in a row: 2021, 2022, and 2023.
Be part of a trailblazing team that’s shaping the future of AI, ML, and cloud innovation. Your next big opportunity starts here!
For more details, visit: Website or LinkedIn Page.
Work Location: Dallas (preferred), but anywhere in the US works.
Role Overview:
- We are seeking a seasoned Infrastructure Architect with deep expertise in both cloud platforms and on-premise infrastructure to design, implement, and manage robust hybrid environments that can support high-compute AI and GenAI workloads.
- You will work onsite with one of our key enterprise clients to assess existing infrastructure, define scalable architectures, and ensure optimal performance for AI/ML and GenAI solutions.
- You’ll play a critical role in bridging infrastructure, DevOps, and AI solution delivery, ensuring our client has the right foundational stack to scale advanced AI workloads across their enterprise.
Key Responsibilities:
Hybrid Infrastructure Design & Deployment:
- Architect and implement secure, scalable, and cost-effective infrastructure solutions across on-prem and cloud (GCP, AWS, Azure) environments.
- Evaluate existing systems and define migration or integration strategies for deploying AI/GenAI workloads in hybrid setups.
- Design infrastructure supporting GPU-intensive workloads, distributed training, inferencing, and vector database storage.
Cloud & On-Prem Operations:
- Manage provisioning, automation, and orchestration across virtual machines, containers, and Kubernetes clusters.
- Implement and monitor high-availability, low-latency, and disaster recovery strategies.
- Optimize infrastructure for latency-sensitive applications, including real-time GenAI agentic workflows.
Collaboration & Enablement:
- Work closely with AI/ML engineers, data scientists, solution architects, and DevOps to ensure smooth deployment and scaling of models and GenAI agents.
- Recommend best practices on hybrid infrastructure for LLM fine-tuning, RAG architecture, and multi-agent orchestration platforms.
- Guide teams on infrastructure security, IAM policies, and governance frameworks for GenAI applications.
Performance & Cost Optimization:
- Continuously benchmark, profile, and optimize infrastructure for performance and efficiency.
- Monitor resource utilization and propose capacity planning strategies for AI workload peaks.
Key Qualifications & Experience:
- Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field.
- 8–15 years of experience in enterprise infrastructure architecture, with significant experience in both on-prem and cloud-native environments.
- Proven track record in designing and deploying AI/ML or GenAI-supporting infrastructure (e.g., GPU clusters, Kubernetes for ML workloads, hybrid vector databases).
- Deep knowledge of cloud services (GCP preferred; AWS or Azure acceptable), on-prem virtualization, storage, networking, and container orchestration.
- Experience supporting multi-agentic GenAI frameworks, including task orchestration, distributed agents, and workflow automation.
- Hands-on experience in DevOps and IaC tools (Terraform, Helm, Ansible, CI/CD).
- Familiarity with AI governance, data security, and compliance in hybrid environments.
Required Skills:
GCP Infrastructure Design & Deployment:
Deep hands-on expertise in architecting and managing solutions on Google Cloud Platform, including:
- VPC design, subnetting, firewall rules, Private Service Connect, and Cloud Interconnect for secure hybrid networking.
- Identity & Access Management (IAM), Workload Identity Federation, and service accounts for secure access control across services.
- Cloud Load Balancing, Cloud NAT, and Cloud Armor for high-availability, secure ingress/egress management.
- Resource hierarchy and organization policies to manage large-scale enterprise GCP environments.
AI/GenAI-Centric Compute & Storage Architecture:
Strong understanding of compute services tailored to GenAI:
- Compute Engine for custom VM/GPU provisioning (A100/H100, T4).
- GKE (Google Kubernetes Engine) for containerized model deployments, including support for GPU workloads and node auto-provisioning.
- Vertex AI and Vertex AI Workbench for managing ML pipelines, training, model registry, and deployments.
Storage architecture experience with:
- Cloud Storage (standard, nearline, coldline) for unstructured datasets.
- Filestore, Local SSDs, and Persistent Disks for high-throughput model training and inferencing.
- Integration with BigQuery and Spanner for structured data workloads supporting GenAI applications.
Containerization, Orchestration & IaC on GCP:
Advanced experience with GKE:
- Cluster autoscaling, workload identity, taints/tolerations for GPU scheduling.
- Helm-based deployments and integration with Artifact Registry.
Proficient in Infrastructure as Code using:
- Terraform (with GCP provider modules) for declarative infrastructure deployment.
- Cloud Build, Cloud Deploy, or integration with GitHub Actions for CI/CD pipelines.
- Ability to automate infrastructure provisioning, policy enforcement, and environment standardization.
Support for GenAI Architectures:
Experience deploying and optimizing infrastructure for:
- LLM hosting using Triton Inference Server, vLLM, or Text Generation Inference on GKE or Compute Engine.
- Vector database integrations (Weaviate, ChromaDB, FAISS) with GCS and BigQuery.
- RAG pipeline infrastructure including document ingestion (e.g., via Pub/Sub, Cloud Functions) and scalable retrieval.
- Multi-agent frameworks like LangGraph, CrewAI, or AutoGen, with secure multi-service orchestration across GCP services.
Observability, Security, and Governance:
Monitoring & observability stack:
- Cloud Monitoring, Cloud Logging, Cloud Trace, Profiler, and Error Reporting for full-stack visibility.
- Experience setting up custom dashboards, alerts, and uptime checks.
Security and compliance capabilities:
- VPC Service Controls, Shielded VMs, Confidential Computing, and data encryption strategies (at rest and in transit).
- Experience with cloud security posture management (CSPM) and compliance frameworks (e.g., HIPAA, SOC 2, FedRAMP).
Governance:
- Experience setting up Organization Policies, Folder Hierarchies, and Cloud Asset Inventory for enterprise governance.
Cost Optimization & Resource Efficiency:
Proven ability to:
- Monitor and optimize spend using Billing Reports, Cost Table Reports, Budgets, and Recommendations Hub.
- Implement rightsizing recommendations, sustained use discounts, and committed use discounts (CUDs) for GPU workloads.
- Design cost-aware architecture balancing performance, latency, and throughput for GenAI use cases.
Soft Skills & Personality Traits:
- Strong problem-solving and debugging skills.
- Ability to communicate technical concepts clearly to non-technical stakeholders.
- Collaborative mindset with ability to work cross-functionally across AI, DevOps, and business teams.
- Detail-oriented, with a focus on reliability, scalability, and security.
Preferred:
- GCP Professional Cloud Architect, AWS Solutions Architect, or similar certifications.
- Familiarity with GPUs (NVIDIA A100, H100), inference acceleration, and edge deployments.
- Familiarity with AI/ML governance, compliance, and ethical AI frameworks.
What is in it for you:
- Be part of a team and company that has won NVIDIA's AI Services Partner of the Year three times in a row, with an unparalleled track record of building production AI applications on DGX and Cloud GPUs.
- Strong peer learning that will accelerate your learning curve across Applied AI, GPU Computing, and softer skills such as technical communication.
- Exposure to working with highly experienced AI leaders at Fortune 500 companies and innovative market disruptors looking to transform their business with Generative AI.
- Access to state-of-the-art GPU infrastructure on the cloud and on-premise.
- Be part of the fastest-growing AI-first digital transformation and engineering company in the world.
If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!