We are seeking a skilled LLM Engineer who is proficient in Python and experienced in developing, deploying, and optimizing large language models (LLMs). The ideal candidate will have hands-on experience with the FastAPI or Flask frameworks, LangChain, and building Retrieval-Augmented Generation (RAG) pipelines. You will play a key role in integrating cutting-edge AI technologies to solve complex business problems, focusing on vector stores and retrievers while deploying scalable solutions on AWS.
Key Responsibilities
1. Python Development:
a. Design, develop, and maintain scalable web services using FastAPI or Flask frameworks.
b. Write efficient, reusable, and modular Python code to support API-driven LLM applications.
2. LangChain & Supporting Frameworks:
a. Use LangChain to build custom pipelines for document indexing, retrieval, and summarization.
b. Integrate LangChain's RAG capabilities with other components, such as vector stores and retrievers, to support real-time querying and document processing.
3. RAG Pipelines:
a. Architect and deploy Retrieval-Augmented Generation (RAG) systems for chatbots, knowledge systems, and other generative AI applications.
b. Optimize RAG systems for speed, accuracy, and scalability across multiple use cases.
4. Vector Stores & Retrievers:
a. Work with vector databases like Pinecone, Chroma, FAISS, or Milvus to store and manage embeddings.
b. Implement retrievers and re-rankers to improve query efficiency, ensuring high-quality and relevant outputs for users.
5. AWS Cloud Deployment:
a. Deploy and manage LLM-based applications on AWS, leveraging services such as Lambda, EC2, S3, EKS, and RDS.
b. Ensure the scalability, availability, and reliability of deployed applications.
6. Dashboards and Monitoring (Optional):
a. Create monitoring dashboards using tools like Grafana or Tableau for real-time system monitoring, analytics, and performance insights.
7. Experimentation with Generative AI:
a. Research and integrate the latest advancements in generative AI technologies.
b. Experiment with fine-tuning and adapting large language models (such as GPT or BERT) for new, innovative use cases.
Required Technical Skills
· Python proficiency, especially with web frameworks like FastAPI or Flask.
· Strong experience with LangChain and associated libraries.
· Proven expertise in building and optimizing RAG pipelines.
· Proficiency in using vector databases (e.g., Pinecone, FAISS).
· Experience with retrievers and re-rankers.
· Solid understanding of AWS services (Lambda, EC2, RDS, etc.).
· Knowledge of SQL and NoSQL databases.
· Familiarity with dashboarding tools such as Grafana and Tableau.
Soft Skills
· Problem-solving: Ability to handle complex and dynamic challenges with AI solutions.
· Collaboration: Experience working in multidisciplinary teams (data scientists, DevOps, etc.).
· Adaptability: Eagerness and passion to keep up with the latest AI advancements and incorporate them into solutions.
· Communication: Excellent verbal and written communication skills to convey technical information to both technical and non-technical stakeholders.
This role is ideal for engineers who are passionate about pushing the boundaries of generative AI and have the technical skills to create cutting-edge, deployable solutions.
