Cloudester Software LLC

Senior AI/ML Engineer

Job description

This is a remote position.

We are seeking an AI/ML Engineer to lead the architecture and deployment of large-scale production AI systems. This is not a "Prompt Engineering" or "API Integration" role. You will be expected to go "under the hood" of Large Language Models (LLMs): modifying model weights, optimizing inference kernels, and building autonomous agentic workflows.

The ideal candidate understands the mathematical mechanics of transformers and can balance deep research-level engineering with production-grade MLOps.

Location: Remote (India-based)

Time Zone: Thailand Standard Time (ICT, UTC+7)

Type: 2-Year Contract (Reviewed every 3 months)

Experience (Senior AI Engineer): 5-7 years
Experience (AI Lead Engineer): 8-10 years


Job role:

1. Generative AI & LLM Engineering
Deep Fine-tuning: Specialize LLMs (Llama 3, Mistral, Qwen) using PEFT techniques including LoRA, QLoRA, and IA3.

Internals & Optimization: Hands-on management of KV-cache, attention mechanisms, and prefill/decoding optimization to reduce latency and VRAM usage.

Alignment Research: Implement and iterate on alignment strategies such as DPO (Direct Preference Optimization), PPO, and RLHF.

Advanced RAG & Agents: Build robust RAG systems and autonomous agentic workflows using LangGraph/MCP for complex tasks like web and video trend extraction.

2. Search, Ranking & Personalization
Ranking Systems: Design and implement ranking algorithms and search relevance modules to improve information retrieval.

Recommendation Engines: Apply collaborative filtering and personalization solutions to large-scale user datasets.

Traditional ML Mastery: Utilize XGBoost, LightGBM, CNNs, and clustering techniques where LLMs are not the optimal tool.

3. Scalable MLOps & Data
Production Deployment: Own the end-to-end lifecycle on SageMaker, Ray, or Vertex AI, including CI/CD for ML systems.

Inference Platforms: Implement quantization (GGUF, AWQ) and scaling strategies to support production-grade AI solutions.

Data Pipelines: Build and maintain high-throughput data pipelines, perform feature engineering, and write optimized SQL.


Requirements

Programming Mastery: Expert-level Python (PyTorch, JAX, or TensorFlow) and a strong understanding of C++ for performance-critical components.

Deep Learning: Mastery of transformer architectures, embeddings, and traditional ML (CNNs, clustering).

System Design: Proven track record of operating production ML systems that serve large numbers of concurrent users.

Data Proficiency: Strong SQL and data engineering skills to support large-scale feature stores.

Education: Bachelor’s or Master’s degree in Computer Science.
