This is a remote position.
We are seeking an AI/ML Engineer to lead the architecture and deployment of large-scale production AI systems. This is not a "Prompt Engineering" or "API Integration" role. You will be expected to go "under the hood" of Large Language Models (LLMs): modifying model weights, optimizing inference kernels, and building autonomous agentic workflows.
The ideal candidate understands the mathematical mechanics of transformers and can balance deep research-level engineering with production-grade MLOps.
Location: Remote (India-based)
Time Zone: Thailand Standard Time (ICT, UTC+7)
Type: 2-Year Contract (Reviewed every 3 months)
Experience (Senior AI Engineer): 5-7 Years
Experience (AI Lead Engineer): 8-10 Years
Job role:
1. Generative AI & LLM Engineering
Deep Fine-tuning: Specialize LLMs (Llama 3, Mistral, Qwen) using PEFT techniques including LoRA, QLoRA, and IA3.
Internals & Optimization: Hands-on management of KV-cache, attention mechanisms, and prefill/decoding optimization to reduce latency and VRAM usage.
Alignment Research: Implement and iterate on alignment strategies such as DPO (Direct Preference Optimization), PPO, and RLHF.
Advanced RAG & Agents: Build robust RAG systems and autonomous agentic workflows using LangGraph/MCP for complex tasks like web and video trend extraction.
2. Search, Ranking & Personalization
Ranking Systems: Design and implement ranking algorithms and search relevance modules to improve information retrieval.
Recommendation Engines: Apply collaborative filtering and personalization solutions to large-scale user datasets.
Traditional ML Mastery: Utilize XGBoost, LightGBM, CNNs, and clustering techniques where LLMs are not the optimal tool.
3. Scalable MLOps & Data
Production Deployment: Own the end-to-end lifecycle on SageMaker, Ray, or Vertex AI, including CI/CD for ML systems.
Inference Platforms: Implement quantization (GGUF, AWQ) and scaling strategies to support production-grade AI solutions.
Data Pipelines: Build and maintain high-throughput data pipelines, perform feature engineering, and write optimized SQL.
Requirements
Programming Mastery: Expert-level Python (PyTorch, JAX, or TensorFlow) and a strong understanding of C++ for performance-critical components.
Deep Learning: Mastery of transformer architectures, embeddings, and traditional ML (CNNs, clustering).
System Design: Proven track record of operating production ML systems that serve a high volume of concurrent users.
Data Proficiency: Strong SQL and data engineering skills to support large-scale feature stores.
Bachelor’s or Master’s degree in Computer Science.