CUDA Developer (AI/LLM & GPU Optimization)

Key Facts

Remote From: 
Fixed term
Mid-level (2-5 years)
English

Other Skills

  • Collaboration
  • Communication
  • Problem Solving

Roles & Responsibilities

  • Solve advanced CUDA and GPU programming problems involving parallel computing and performance optimization
  • Review, evaluate, and improve AI-generated CUDA, C++, and Python code
  • Optimize GPU kernels for throughput, latency, memory efficiency, and resource utilization
  • Work with CUDA libraries and frameworks such as Thrust, cuBLAS, and cuDNN

Requirements:

  • 5+ years of professional software development experience with a strong focus on CUDA development
  • Strong proficiency in C/C++
  • Strong hands-on experience with Python and scientific computing ecosystems
  • Experience with CUDA 12.3 or newer

Job description

Gramian Consultancy is a boutique consultancy specializing in IT professional services and engineering talent solutions. With a strong background in software engineering and leadership, we help companies build high-performing teams by matching them with professionals who truly fit their needs.

Role Overview

We are looking for experienced CUDA Developers to work on advanced AI and machine learning initiatives focused on improving the capabilities of large language models (LLMs). In this role, you will solve complex GPU programming challenges, optimize high-performance CUDA workloads, review AI-generated code, and contribute to the development of more capable AI systems.

Duration: 3 months

Commitment: 40h/week, 4h/day overlap with PST

Model: Contract, time and material

Location: 100% Remote: Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Pakistan, Indonesia, Kenya, Nigeria, Turkey, Vietnam

Interview: 1 technical interview

Key Responsibilities

  • Solve advanced CUDA and GPU programming problems involving parallel computing and performance optimization
  • Review, evaluate, and improve AI-generated CUDA, C++, and Python code
  • Optimize GPU kernels for throughput, latency, memory efficiency, and resource utilization
  • Work with CUDA libraries and frameworks such as Thrust, cuBLAS, and cuDNN
  • Debug and resolve issues related to CUDA kernels, synchronization, and memory management
  • Develop high-quality technical prompts, solutions, explanations, and evaluations for AI model training
  • Collaborate with AI researchers, engineers, and evaluation teams
  • Stay up to date with the latest developments in CUDA, GPU architectures, and performance optimization techniques

Requirements

  • 5+ years of professional software development experience with a strong focus on CUDA development
  • Strong proficiency in C/C++
  • Strong hands-on experience with Python and scientific computing ecosystems
  • Experience working with PyTorch and NumPy
  • Experience with CUDA 12.3 or newer
  • Strong understanding of GPU programming, parallel computing, and performance optimization
  • Experience optimizing workloads for high-performance execution and efficient resource utilization
  • Experience with CUDA libraries such as Thrust, cuBLAS, and cuDNN
