Easily implement RAG workflows with pre-built modules and context expansion.
easy_rag_llm
CAUTION
- The easy-rag-llm==1.0.* releases are testing versions and are generally not usable.
Introduction
- easy_rag_llm is a lightweight RAG-based service that supports both OpenAI and DeepSeek models. It is designed to seamlessly integrate RAG-based LLM functionalities into your service.
- As of 2025-01-15 (v1.1.0), the supported resource format for training is PDF.
Usage
Install (https://pypi.org/project/easy-rag-llm/)
pip install easy_rag_llm
How to integrate it into your service
from easy_rag import RagService
# Basic initialization
rs = RagService(
    embedding_model="text-embedding-3-small",  # Fixed to OpenAI's embedding model
    response_model="deepseek-chat",  # "deepseek-chat" or an OpenAI chat model such as "gpt-3.5-turbo"
    open_api_key="your_openai_api_key_here",
    deepseek_api_key="your_deepseek_api_key_here",
    deepseek_base_url="https://api.deepseek.com",
    context_expansion=False,  # Enable/disable context expansion
    expansion_window=1  # Number of chunks to include before and after each retrieved chunk
)
# Example with an OpenAI chat model
rs2 = RagService(
    embedding_model="text-embedding-3-small",
    response_model="gpt-3.5-turbo",
    open_api_key="your_openai_api_key_here",
)
# Resource Loading Parameters
resource = rs.rsc(
    "./rscFiles",
    force_update=False,  # Force rebuild index
    chunkers=10,  # Number of parallel chunking workers
    embedders=10,  # Number of parallel embedding workers
    ef_construction=200,  # HNSW index construction parameter
    ef_search=100,  # HNSW search parameter
    M=48  # HNSW graph parameter
)
# Generate Response with Context Expansion
query = "Explain what is taught in the third week's lecture."
response, top_evidence = rs.generate_response(
    resource,
    query,
    evidence_num=5,  # Number of evidence chunks to retrieve (default: 3)
    context_expansion=True,  # Enable context expansion for this query
    expansion_window=2  # Include 2 chunks before and after
)
print(response)
# Change Context Expansion Settings
rs.set_context_expansion(enable=True, window_size=2)
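generate_response also returns the evidence it retrieved. The exact shape of top_evidence is not documented here; the execution-flow diagram below indicates that evidence is formatted as JSON, so the following is a minimal sketch, assuming top_evidence is either a JSON string or an already-parsed Python object, that simply prints it for inspection.

import json

# Minimal sketch for inspecting the returned evidence (illustrative only).
# Assumption: top_evidence is either a JSON string or a plain Python list/dict;
# the actual shape may differ between versions.
def show_evidence(evidence):
    if isinstance(evidence, str):
        try:
            evidence = json.loads(evidence)  # parse if it is a JSON string
        except json.JSONDecodeError:
            pass  # not JSON: keep the raw string
    print(evidence)

show_evidence(top_evidence)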
Note
- Ensure that your PDFs have clear titles. Titles extracted from the PDFs are stored in the metadata and are used when generating evidence-based responses.
- Running rs.rsc("./folder") generates faiss_index.bin and metadata.json. Subsequent calls reuse these existing .bin and .json files to generate responses. To pick up files you have added to or removed from the folder, set force_update=True to force a rebuild of the index.
- The chunkers parameter controls parallel processing for PDF chunking, while embedders controls parallel processing for embedding generation. Both default to 10 and should be tuned to your CPU core count and API rate limits.
- Context expansion includes the chunks surrounding each retrieved chunk for better context understanding, as sketched below. Use expansion_window to control how many chunks to include before and after.
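Conceptually, context expansion takes each retrieved chunk and also pulls in its neighbours from the ordered chunk list. The snippet below is a minimal sketch of that idea rather than the library's actual implementation; chunks, hit_indices, and expansion_window are illustrative names.

# Minimal sketch of context expansion (illustrative, not the library's code).
# chunks: all document chunks in their original order
# hit_indices: indices of the chunks returned by the similarity search
def expand_context(chunks, hit_indices, expansion_window=1):
    selected = set()
    for i in hit_indices:
        start = max(0, i - expansion_window)
        end = min(len(chunks), i + expansion_window + 1)
        selected.update(range(start, end))
    # Preserve the original document order so the expanded context reads naturally.
    return [chunks[i] for i in sorted(selected)]

# Example: with expansion_window=2, a hit at chunk 10 brings in chunks 8 through 12.
print(expand_context([f"chunk {i}" for i in range(20)], [10], expansion_window=2))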
Advanced Parameters
Resource Loading (rs.rsc)
- force_update: Force rebuild of the index (default: False)
- chunkers: Number of parallel PDF chunking workers (default: 10)
- embedders: Number of parallel embedding workers (default: 10)
- ef_construction: HNSW index construction parameter (default: 200)
- ef_search: HNSW search parameter (default: 100)
- M: HNSW graph parameter (default: 48); see the sketch below
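These correspond to the standard FAISS HNSW knobs: M controls graph connectivity, ef_construction trades index build time for index quality, and ef_search trades query latency for recall. Assuming the package builds a faiss.IndexHNSWFlat under the hood (an assumption, since the index type is not documented here), the defaults above map roughly onto the following sketch:

import faiss
import numpy as np

dim = 1536  # embedding dimension of text-embedding-3-small
vectors = np.random.rand(1000, dim).astype("float32")  # placeholder chunk embeddings

# Hedged sketch of an HNSW index configured with the defaults listed above.
index = faiss.IndexHNSWFlat(dim, 48)   # M = 48
index.hnsw.efConstruction = 200        # ef_construction
index.hnsw.efSearch = 100              # ef_search
index.add(vectors)

# Retrieve the 3 nearest chunks for a single query vector.
distances, ids = index.search(vectors[:1], 3)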
Response Generation (generate_response)
- evidence_num: Number of evidence chunks to retrieve (default: 3)
- context_expansion: Enable/disable context expansion for the query
- expansion_window: Number of chunks to include before and after
Release versions
- 1.0.12: Supported, but the embedding model and chat model are fixed to OpenAI's text-embedding-3-small and deepseek-chat, respectively, and the thread pool is fixed at 10 workers, which may cause errors in some environments.
- 1.1.5 and later: Recommended.
UML
Execution flow
sequenceDiagram
actor User
participant Agent
participant OpenAI
participant DeepSeek
participant VectorDB
User->>Agent: Input Query
rect rgb(200, 220, 255)
note over Agent: Embedding Generation Phase
Agent->>OpenAI: Request Embedding (real_query_embedding_fn)
OpenAI-->>Agent: Return Embedding Vector
end
rect rgb(220, 240, 220)
note over Agent: Document Retrieval Phase
Agent->>VectorDB: Similarity Search (index.search)
VectorDB-->>Agent: Return Relevant Document Indices
Agent->>Agent: Extract Documents from Metadata
Agent->>Agent: Format Evidence as JSON
end
rect rgb(255, 220, 220)
note over Agent: Response Generation Phase
alt Using OpenAI Model
Agent->>OpenAI: Request Chat Completion
OpenAI-->>Agent: Generate Response
else Using DeepSeek Model
Agent->>DeepSeek: Request Chat Completion
DeepSeek-->>Agent: Generate Response
end
end
Agent-->>User: Return Final Response and Evidence
TODO
- Improve how chunks are split.
- Support more input formats beyond PDF. (v1.2.0 ~)
What can you do with this?
https://github.com/Aiden-Kwak/ClimateJudgeLLM
Release Message
v1.1.5
- Changed the vector search method to HNSW.
- Vector embedding is dramatically faster, cutting time by about 90% (roughly 10x faster than before; about 10 seconds for a 500-page PDF).
v1.1.6
- Added a context expansion feature for better document understanding.
- Added version requirements for dependencies.
- Improved package metadata and documentation.
Author Information
- Aiden Kwak (https://github.com/Aiden-Kwak)