Key Facts

Remote From: Brazil

Full time

English

Other Skills

• Reliability
• Collaboration
• Communication
• Adaptability
• Teamwork
• Problem Solving

Requirements

Proficiency in Python with FastAPI or Flask
Strong experience with LangChain and building RAG pipelines
Experience with vector stores and retrievers (e.g., Pinecone, FAISS, Chroma, Milvus)
Experience deploying LLM-based applications on AWS (Lambda, EC2, S3, EKS, RDS) and familiarity with SQL/NoSQL

Roles & Responsibilities:

Design, develop, and maintain scalable web services with FastAPI or Flask; write modular Python code for API-driven LLM apps.
Implement LangChain to build custom pipelines for document indexing, retrieval, and summarization; integrate RAG with vector stores and retrievers for real-time querying.
Architect and deploy Retrieval-Augmented Generation (RAG) systems for chatbots and knowledge bases; optimize for speed, accuracy, and scalability.
Work with vector stores and retrievers; implement retrievers and re-rankers to improve query efficiency and result relevance.

Sky Systems, Inc. (SkySys)

Information Technology & Services

About Sky Systems, Inc. (SkySys)

Sky Systems, Inc. (DBA SkySys) is a technology consulting firm based in Research Triangle Park, North Carolina, United States. SkySys specializes in recruitment and staffing, 24/7 on-site and remote services, managed services (MSP), contact center solutions (Cisco, Avaya, Genesys), web solutions, and other services, and is a Cisco Select Certified Partner and Dell Technology Partner. SkySys currently works with clients across the United States and Canada. Our clients include top Fortune 500 companies in various industries: Financial Services, Banking, Pharmaceutical, IT Service Providers, Healthcare, Oil & Gas, Government, Consulting and Outsourcing, Telecommunications, Insurance, Aerospace, Semiconductors, and many more.

Company type: Startup

Industry: Information Technology & Services

Founded: 2018

Company size: 11 - 50


Job description

Role: LLM Engineer
Position Type: Full-Time Contract (40 hrs/week)
Contract Duration: 1 year+
Work Schedule: 8 hours/day (Mon-Fri)
Work Time: US Time
Location: 100% Remote (Candidates can work from anywhere in LATAM)

We are seeking a skilled LLM Engineer proficient in Python programming and experienced in developing, deploying, and optimizing large language models (LLMs). The ideal candidate will have hands-on experience with the FastAPI or Flask frameworks, LangChain implementation, and building Retrieval-Augmented Generation (RAG) pipelines. You will play a key role in integrating cutting-edge AI technologies to solve complex business problems, focusing on vector stores and retrievers while deploying scalable solutions on AWS.

Key Responsibilities

1. Python Development:
a. Design, develop, and maintain scalable web services using FastAPI or Flask frameworks.
b. Write efficient, reusable, and modular Python code to support API-driven LLM applications.

2. LangChain & Supporting Frameworks:
a. Implement LangChain to build custom pipelines for document indexing, retrieval, and summarization.
b. Integrate LangChain's RAG capabilities with other components like vector stores and retrievers to support real-time querying and document processing.

3. RAG Pipelines:
a. Architect and deploy Retrieval-Augmented Generation (RAG) systems for chatbots, knowledge systems, and other generative AI applications.
b. Optimize RAG systems for speed, accuracy, and scalability across multiple use cases.

4. Vector Stores & Retrievers:
a. Work with vector databases like Pinecone, Chroma, FAISS, or Milvus to store and manage embeddings.
b. Implement retrievers and re-rankers to improve query efficiency, ensuring high-quality and relevant outputs for users.

5. AWS Cloud Deployment:
a. Deploy and manage LLM-based applications on AWS, leveraging services such as Lambda, EC2, S3, EKS, and RDS.
b. Ensure the scalability, availability, and reliability of deployed applications.

6. Dashboards and Monitoring (Optional):
a. Create monitoring dashboards using tools like Grafana or Tableau for real-time system monitoring, analytics, and performance insights.

7. Experimentation with Generative AI:
a. Research and integrate the latest advancements in generative AI technologies.
b. Experiment with fine-tuning and adapting large language models (like GPT, BERT) for new, innovative use cases.

Required Technical Skills

· Python proficiency, especially with web frameworks like FastAPI or Flask.
· Strong experience with LangChain and associated libraries.
· Proven expertise in building and optimizing RAG pipelines.
· Proficiency in using vector databases (e.g., Pinecone, FAISS).
· Experience with retrievers and re-rankers.
· Solid understanding of AWS services (Lambda, EC2, RDS, etc.).
· Knowledge of SQL and NoSQL databases.
· Familiarity with dashboarding tools such as Grafana and Tableau.

Soft Skills

· Problem-solving: Ability to handle complex and dynamic challenges with AI solutions.
· Collaboration: Experience working in multidisciplinary teams (data scientists, DevOps, etc.).
· Adaptability: Eagerness and passion to keep up with the latest AI advancements and incorporate them into solutions.
· Communication: Excellent verbal and written communication skills to convey technical information to both technical and non-technical stakeholders.

This role is ideal for engineers who are passionate about pushing the boundaries of generative AI and have the technical skills to create cutting-edge, deployable solutions.
