
Easily implement RAG workflows with pre-built modules and context expansion.

Project description

easy_rag_llm

CAUTION

  • easy-rag-llm==1.0.* releases are test versions and are usually invalid.


Introduction

  • easy_rag_llm is a lightweight RAG-based service that supports both OpenAI and DeepSeek models. It is designed to seamlessly integrate RAG-based LLM functionalities into your service.
  • As of 2025-01-15 (v1.1.0), the only supported input format is PDF.

Usage

Install (https://pypi.org/project/easy-rag-llm/)

pip install easy_rag_llm

How to integrate it into your service

from easy_rag import RagService

# Basic initialization
rs = RagService(
    embedding_model="text-embedding-3-small",  # Fixed to this OpenAI model
    response_model="deepseek-chat",  # e.g. "gpt-3.5-turbo" or "deepseek-chat"
    open_api_key="your_openai_api_key_here",
    deepseek_api_key="your_deepseek_api_key_here",
    deepseek_base_url="https://api.deepseek.com",
    context_expansion=False,  # Enable/disable context expansion
    expansion_window=1  # Number of chunks to include before and after
)

# Example with OpenAI chat model
rs2 = RagService(
    embedding_model="text-embedding-3-small",
    response_model="gpt-3.5-turbo",
    open_api_key="your_openai_api_key_here",
)

# Resource Loading Parameters
resource = rs.rsc(
    "./rscFiles",
    force_update=False,  # Force rebuild index
    chunkers=10,  # Number of parallel chunking workers
    embedders=10,  # Number of parallel embedding workers
    ef_construction=200,  # HNSW index construction parameter
    ef_search=100,  # HNSW search parameter
    M=48  # HNSW graph parameter
)

# Generate Response with Context Expansion
query = "Explain what is taught in the third week's lecture."
response, top_evidence = rs.generate_response(
    resource,
    query,
    evidence_num=5,  # Number of evidence chunks to retrieve (default: 3)
    context_expansion=True,  # Enable context expansion for this query
    expansion_window=2  # Include 2 chunks before and after
)

print(response)

# Change Context Expansion Settings
rs.set_context_expansion(enable=True, window_size=2)


Note

  • Ensure that your PDFs have clear titles. Titles extracted from the PDF metadata are stored alongside the chunks during indexing and are used when citing evidence in responses.
  • Running rs.rsc("./folder") generates faiss_index.bin and metadata.json files. Subsequently, the system uses the existing .bin and .json files to generate responses. If you want to reflect changes by adding or removing files in the folder, you can enable forced updates by setting force_update=True.
  • The chunkers parameter controls parallel processing for PDF chunking, while embedders controls parallel processing for embedding generation. Both default to 10 and should be adjusted based on CPU cores and API rate limits.
  • The context expansion feature allows including surrounding chunks for better context understanding. Use expansion_window to control how many chunks to include before and after.
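The expansion step described in the last bullet can be sketched in plain Python. This is a simplified illustration, not the library's internal code; `expand_context`, `chunks`, and `hit_index` are hypothetical names:

```python
def expand_context(chunks, hit_index, window):
    """Return the retrieved chunk plus up to `window` neighbors on each side."""
    start = max(0, hit_index - window)
    end = min(len(chunks), hit_index + window + 1)
    return chunks[start:end]

chunks = ["c0", "c1", "c2", "c3", "c4"]
# With expansion_window=1, a hit on chunk 2 also pulls in chunks 1 and 3.
print(expand_context(chunks, 2, 1))  # ['c1', 'c2', 'c3']
```

Clamping at the list boundaries means a hit near the start or end of a document simply returns fewer neighbors rather than failing.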

Advanced Parameters

Resource Loading (rs.rsc)

  • force_update: Force rebuild of index (default: False)
  • chunkers: Number of parallel PDF chunking workers (default: 10)
  • embedders: Number of parallel embedding workers (default: 10)
  • ef_construction: HNSW index construction parameter (default: 200)
  • ef_search: HNSW search parameter (default: 100)
  • M: HNSW graph parameter (default: 48)

Response Generation (generate_response)

  • evidence_num: Number of evidence chunks to retrieve (default: 3)
  • context_expansion: Enable/disable context expansion for the query
  • expansion_window: Number of chunks to include before and after
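Conceptually, `evidence_num` is the k of a standard top-k similarity search. A minimal self-contained sketch of the idea in pure Python (the library itself uses a FAISS HNSW index, not this brute-force loop):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, chunk_vecs, k=3):
    """Return indices of the k chunks most similar to the query."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]

vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(top_k([1.0, 0.0], vecs, k=2))  # [0, 1]
```

Raising `evidence_num` widens the evidence pool at the cost of a longer prompt to the chat model.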

Release versions

  • 1.0.12: Works, but the embedding and chat models are fixed to OpenAI's text-embedding-3-small and deepseek-chat, respectively, and the thread pool is fixed at 10 workers, which may cause errors in some environments.
  • 1.1.5: Recommended.

UML

Execution flow

sequenceDiagram
    actor User
    participant Agent
    participant OpenAI
    participant DeepSeek
    participant VectorDB

    User->>Agent: Input Query
    
    rect rgb(200, 220, 255)
        note over Agent: Embedding Generation Phase
        Agent->>OpenAI: Request Embedding (real_query_embedding_fn)
        OpenAI-->>Agent: Return Embedding Vector
    end

    rect rgb(220, 240, 220)
        note over Agent: Document Retrieval Phase
        Agent->>VectorDB: Similarity Search (index.search)
        VectorDB-->>Agent: Return Relevant Document Indices
        Agent->>Agent: Extract Documents from Metadata
        Agent->>Agent: Format Evidence as JSON
    end

    rect rgb(255, 220, 220)
        note over Agent: Response Generation Phase
        alt Using OpenAI Model
            Agent->>OpenAI: Request Chat Completion
            OpenAI-->>Agent: Generate Response
        else Using DeepSeek Model
            Agent->>DeepSeek: Request Chat Completion
            DeepSeek-->>Agent: Generate Response
        end
    end

    Agent-->>User: Return Final Response and Evidence
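The three phases in the diagram can be mocked end to end in a few lines. The functions below are hypothetical stand-ins for the OpenAI embedding call, the FAISS `index.search`, and the chat-completion request, just to show the control flow:

```python
def embed(query):
    # Stand-in for the OpenAI embedding request.
    return [float(len(query))]

def search(vector, metadata, k=3):
    # Stand-in for index.search: pretend the first k records are most similar.
    return metadata[:k]

def complete(query, evidence):
    # Stand-in for the chat-completion request (OpenAI or DeepSeek).
    return f"Answer to {query!r} based on {len(evidence)} chunks"

def answer(query, metadata):
    vec = embed(query)                        # embedding phase
    evidence = search(vec, metadata)          # retrieval phase
    return complete(query, evidence), evidence  # generation phase

resp, ev = answer("week 3 lecture?", ["chunk-a", "chunk-b", "chunk-c", "chunk-d"])
print(resp)
```

Like `generate_response`, the sketch returns both the final answer and the evidence it was grounded on.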

TODO

  • Improve the chunking strategy.
  • Support more input formats besides PDF. (v1.2.0 ~)

What can you do with this?

https://github.com/Aiden-Kwak/ClimateJudgeLLM

Release Message

v1.1.5
: Vector searching method changed to HNSW.
: Vector embedding is dramatically faster, cutting time by 90% (10x faster than before; about 10 sec for a 500-page PDF).

v1.1.6
: Added context expansion feature for better document understanding.
: Added version requirements for dependencies.
: Improved package metadata and documentation.
