Vietnamese Retrieval-Augmented Generation (RAG) Framework

Vi-RAG Framework

Vietnamese Retrieval-Augmented Generation Framework

A comprehensive RAG framework designed specifically for Vietnamese. It processes PDF, TXT, and DOCX documents and provides hierarchical chunking and semantic search.

🌟 Key Features

✅ Multi-format support: PDF, TXT, DOCX
✅ Smart chunking: Hierarchical parent-child chunks with overlap
✅ Vector search: Qdrant integration for semantic search
✅ Gemini integration: Uses the Gemini API for embedding and generation
✅ In-memory caching: Caches DocumentNode objects to speed up processing
✅ Auto-chunking: Loads and chunks documents in a single step
✅ Native Vietnamese: Designed and optimized for Vietnamese

📁 Project Structure

Vi-RAG/
├── src/
│   └── vi_rag/                  # Main package
│       ├── __init__.py
│       ├── core.py              # Core RAG functionality
│       ├── utils.py             # Utility functions
│       └── py.typed             # Type hints marker
├── pyproject.toml               # Project configuration
├── README.md                    # This file
├── LICENSE                      # MIT License
└── .gitignore

🚀 Quick Installation

0. Install Vi-RAG

pip install vi-rag

1. Clone Repository

git clone https://github.com/NOT-erorr/PBL_2025_Vi-RAG_framework.git
cd PBL_2025_Vi-RAG_framework

2. Create a Virtual Environment

python -m venv venv

# Windows
venv\Scripts\activate

# Linux/Mac
source venv/bin/activate

3. Configure Environment Variables

# Copy .env.example to .env
cp .env.example .env

# Or on Windows
copy .env.example .env

Then open the .env file and fill in your API keys:

# Required: Google Gemini API Key
GEMINI_API_KEY=your_gemini_api_key_here

# Required: Qdrant Vector Database Configuration  
QDRANT_API_KEY=your_qdrant_api_key_here
QDRANT_URL=your_qdrant_url_here

Note: The .env file contains sensitive information and is already listed in .gitignore. Do not commit it to Git!

How to get API keys:

Gemini API Key: Google AI Studio
Qdrant: Qdrant Cloud

💡 Basic Usage

Example 1: Load and Chunk a Document Automatically

from vi_rag import DocumentLoader

# Auto-chunking (recommended)
loader = DocumentLoader(
    "document.pdf",
    auto_chunk=True,
    parent_size=2000,
    child_size=400,
    overlap=50
)

# Load and chunk in one step
document, parents, children = loader.load_and_chunk()

print(f"Loaded: {document.title}")
print(f"Parent chunks: {len(parents)}")
print(f"Child chunks: {len(children)}")

Example 2: Complete RAG Workflow

from vi_rag.ingestion import DocumentLoader
from vi_rag.models import GeminiEmbeddingModel, GeminiLLMClient
from vi_rag.retrieval import QdrantVectorStore
from vi_rag.config import settings
import uuid

# 1. Load and chunk the document
loader = DocumentLoader("document.pdf", auto_chunk=True)
document, parents, children = loader.load_and_chunk()

# 2. Set up the models (using settings from .env)
embedding_model = GeminiEmbeddingModel(
    settings.GEMINI_API_KEY, 
    output_dimensionality=settings.EMBEDDING_DIM
)
llm = GeminiLLMClient(settings.GEMINI_API_KEY, model_name="gemini-2.0-flash-exp")

# 3. Generate embeddings
child_texts = [child['text'] for child in children]
vectors = embedding_model.embed_documents(child_texts)

# 4. Set up the vector store and index the vectors
vector_store = QdrantVectorStore(
    api_key=settings.QDRANT_API_KEY, 
    url=settings.QDRANT_URL
)
vector_store.connect()
vector_store.ensure_collection()

# Add IDs
for child in children:
    child['id'] = str(uuid.uuid4())

vector_store.add_vectors(
    vectors=vectors,
    payloads=children,
    ids=[c['id'] for c in children]
)

# 5. Query and generate an answer
question = "Tài liệu này nói về gì?"  # "What is this document about?"
query_vector = embedding_model.embed_query(question)
results = vector_store.search(query_vector, top_k=settings.VECTOR_TOP_K)
context = "\n\n".join([r['text'] for r in results])

answer = llm.generate(query=question, context=context)
print(f"Question: {question}")
print(f"Answer: {answer}")

Example 3: Working with the Document Cache

from vi_rag.ingestion import DocumentLoader

loader = DocumentLoader("document.pdf")

# Check the cache before loading
cached = loader.check_document_loaded()
if cached:
    print("Document was already loaded!")
    document = cached
else:
    print("Loading new document...")
    document = loader.load()

Example 4: Processing Multiple Documents

from vi_rag.ingestion import DocumentLoader
import uuid

documents = ["doc1.pdf", "doc2.txt", "doc3.docx"]
all_children = []

# Load all documents
for doc_path in documents:
    loader = DocumentLoader(doc_path, auto_chunk=True)
    doc, parents, children = loader.load_and_chunk()
    
    # Add source metadata
    for child in children:
        child['id'] = str(uuid.uuid4())
        child['source_file'] = doc_path
    
    all_children.extend(children)

print(f"Total chunks from all documents: {len(all_children)}")

# Embed and index everything
# (embedding_model and vector_store are the objects created in Example 2)
texts = [c['text'] for c in all_children]
vectors = embedding_model.embed_documents(texts)
vector_store.add_vectors(vectors, all_children, [c['id'] for c in all_children])

Example 5: Load a Document Without Auto-Chunking

from vi_rag.ingestion import DocumentLoader, HierarchicalChunker

# Load document only
loader = DocumentLoader("document.pdf", auto_chunk=False)
document, _, _ = loader.load_and_chunk()  # Empty lists returned

# Chunk manually afterwards
chunker = HierarchicalChunker(
    parent_size=3000,  # Custom size
    child_size=500,
    overlap=100
)
parents, children = chunker.build_chunks(document)

Example 6: Query with Filtering

from qdrant_client.models import Filter, FieldCondition, MatchValue

# Search with a filter on the source file
# (vector_store and query_vector come from Example 2)
results = vector_store.client.search(
    collection_name=vector_store.collection_name,
    query_vector=query_vector,
    limit=5,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="source_file",
                match=MatchValue(value="important_doc.pdf")
            )
        ]
    )
)

Example 7: Multilingual - Vietnamese

from vi_rag.ingestion import DocumentLoader
from vi_rag.models import GeminiLLMClient
from vi_rag.config import settings

# Load Vietnamese document
loader = DocumentLoader("tai_lieu_tieng_viet.pdf", auto_chunk=True)
document, parents, children = loader.load_and_chunk()

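# Query in Vietnamese ("What is the main content of the document?")
# embedding_model and vector_store are the objects created in Example 2
question = "Nội dung chính của tài liệu là gì?"
query_vector = embedding_model.embed_query(question)
results = vector_store.search(query_vector, top_k=settings.VECTOR_TOP_K)
context = "\n\n".join([r['text'] for r in results])

# Generate an answer to the Vietnamese question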
llm = GeminiLLMClient(settings.GEMINI_API_KEY)
answer = llm.generate(
    query=question,
    context=context
)

print(f"Answer: {answer}")

Example 8: Batch Processing with Retry

from vi_rag.models import GeminiEmbeddingModel
from vi_rag.config import settings
import time

embedding_model = GeminiEmbeddingModel(settings.GEMINI_API_KEY)

def embed_with_retry(texts, max_retries=3):
    """Embed texts, retrying with exponential backoff on failure."""
    for attempt in range(max_retries):
        try:
            return embedding_model.embed_documents(texts)
        except Exception:
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s, ...
                print(f"Retry {attempt + 1}/{max_retries} in {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise  # Out of retries: re-raise with the original traceback

# Batch processing (child_texts is the list of chunk texts built in Example 2)
batch_size = 100
all_vectors = []

for i in range(0, len(child_texts), batch_size):
    batch = child_texts[i:i + batch_size]
    vectors = embed_with_retry(batch)
    all_vectors.extend(vectors)
    print(f"Processed {i + len(batch)}/{len(child_texts)}")
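The retry loop above waits a fixed 1s, 2s, 4s. When many workers hit a rate limit at the same moment, a common refinement is to add random jitter so they do not all retry together. The following is a minimal sketch of that idea, not part of Vi-RAG: `retry`, `base_delay`, `sleep`, and `rng` are hypothetical names, and the `sleep` and `rng` parameters exist only so the backoff schedule can be checked without actually waiting.

```python
import random
import time


def retry(func, max_retries=3, base_delay=1.0, sleep=time.sleep, rng=random.random):
    """Call func() up to max_retries times.

    After a failed attempt, wait base_delay * 2**attempt plus up to one
    base_delay of random jitter, then try again. The last error is re-raised.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: propagate the original error
            sleep(base_delay * (2 ** attempt) + rng() * base_delay)
```

With this helper, a batch call could be wrapped as `retry(lambda: embedding_model.embed_documents(batch))`, assuming `embedding_model` is set up as in the example above.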

📖 Complete Examples

The examples/ directory contains full examples of how to use Vi-RAG:

Quick Start

python examples/quick_start.py

Complete Workflow

python examples/complete_example.py

Advanced Examples

python examples/advanced_examples.py

See examples/README.md for details.

🏗️ System Architecture

📊 Key Components

1. Document Loading

PDFLoader: Processes PDFs with PyPDF or PyMuPDF
TXTLoader: Supports multiple encodings
DOCXLoader: Processes Word documents
MD5 Caching: Automatically detects duplicate documents
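MD5-based duplicate detection can be pictured as hashing a file's bytes and using the digest as a cache key, so the same content is recognized even under a different file name. The sketch below illustrates the idea with the standard library only. It is an assumption about how such a cache could work, not Vi-RAG's implementation; `file_md5`, `load_once`, and `_cache` are hypothetical names.

```python
import hashlib
from pathlib import Path

# Hypothetical in-memory cache: MD5 digest of the file bytes -> loaded text.
_cache: dict[str, str] = {}


def file_md5(path: Path, block_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in blocks to bound memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def load_once(path: Path) -> tuple[str, bool]:
    """Return (text, was_cached).

    Files with identical content share one cache entry, regardless of name.
    """
    key = file_md5(path)
    if key in _cache:
        return _cache[key], True
    text = path.read_text(encoding="utf-8")
    _cache[key] = text
    return text, False
```

MD5 is used here only as a content fingerprint, which is what the feature list describes; it is not suitable as a security measure.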

2. Chunking

HierarchicalChunker: Creates parent-child chunks
Configurable: Customizable size and overlap
Context Preservation: Preserves context through overlap
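To make the parent-child idea concrete, here is a minimal sketch of hierarchical chunking with overlap. It assumes sizes are measured in characters and that chunks are plain sliding windows. The real HierarchicalChunker may split on sentences or tokens instead, and `split_with_overlap`, `build_hierarchy`, and the `parent_id` field are hypothetical names used only for illustration.

```python
def split_with_overlap(text: str, size: int, overlap: int) -> list[str]:
    """Split text into windows of at most `size` characters, where each
    window starts `size - overlap` characters after the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


def build_hierarchy(text, parent_size=2000, child_size=400, overlap=50):
    """Return (parents, children); each child records its parent's index."""
    parents = split_with_overlap(text, parent_size, overlap)
    children = []
    for parent_id, parent in enumerate(parents):
        for piece in split_with_overlap(parent, child_size, overlap):
            children.append({"parent_id": parent_id, "text": piece})
    return parents, children
```

The usual motivation for this layout is that small child chunks give precise matches at search time, while the recorded parent index lets a caller fetch the larger surrounding passage as context for generation.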

3. Embedding

GeminiEmbeddingModel: Uses Gemini embedding-001
768 dimensions: Optimized for Vietnamese
Batch processing: Efficient bulk processing

4. Vector Storage

QdrantVectorStore: Integration with Qdrant Cloud/Local
COSINE similarity: Measures semantic similarity
Metadata storage: Stores additional information
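Cosine similarity scores two vectors by the angle between them, ignoring their lengths. The sketch below shows the metric and a brute-force top-k search over stored vectors and payloads. It is an in-memory stand-in for illustration only: a vector database such as Qdrant uses an index rather than a linear scan, and `cosine_similarity` and `top_k` are hypothetical helper names, not Vi-RAG functions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, vectors, payloads, k=5):
    """Score every stored vector against the query and return the k best
    (score, payload) pairs in descending score order."""
    scored = [(cosine_similarity(query, v), p) for v, p in zip(vectors, payloads)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```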

5. Generation

GeminiLLMClient: Multi-model support
PromptBuilder: Template-based prompts
Context-aware: Generates answers based on the retrieved context
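A template-based prompt for context-aware generation can be as simple as a fixed instruction with slots for the retrieved chunks and the question. The sketch below is one possible shape, assuming a character budget for the context. The template wording, `RAG_TEMPLATE`, `build_prompt`, and `max_chars` are hypothetical and do not reflect PromptBuilder's actual prompts or interface.

```python
from string import Template

# Hypothetical template; the wording is illustrative, not Vi-RAG's prompt.
RAG_TEMPLATE = Template(
    "Answer the question using only the context below.\n"
    "If the context is insufficient, say so.\n\n"
    "Context:\n$context\n\n"
    "Question: $question\n"
    "Answer:"
)


def build_prompt(question: str, chunks: list[str], max_chars: int = 4000) -> str:
    """Join retrieved chunks (in rank order) until the character budget
    would be exceeded, then fill the template."""
    selected, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > max_chars:
            break
        selected.append(chunk)
        used += len(chunk)
    return RAG_TEMPLATE.substitute(
        context="\n\n".join(selected), question=question
    )
```

Taking chunks in rank order and stopping at the budget keeps the highest-scoring passages when the retrieved context is too long to send in full.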

🧪 Testing

Run Basic Tests

# Test document loading
python -m testing.code.demo.example_usage

# Test complete workflow
python -m testing.code.demo.complete_example

Run Unit Tests (if available)

pytest tests/

📚 Documentation

QUICKSTART.md: Quick start guide
SYSTEM_LOGIC.md: Detailed architecture
EVALUATION.md: Evaluation with RAGAS
Workflow: Full workflow

🔧 Configuration

Environment Variables

| Variable | Description | Default |
|---|---|---|
| `GEMINI_API_KEY` | Gemini API key | Required |
| `QDRANT_API_KEY` | Qdrant API key | Required |
| `QDRANT_URL` | Qdrant server URL | Required |
| `QDRANT_COLLECTION_NAME` | Collection name | `rag_documents` |
| `EMBEDDING_DIM` | Embedding dimension | `768` |
| `QDRANT_VECTOR_DIM` | Vector dimension | `768` |
| `VECTOR_TOP_K` | Top K results | `5` |
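The variables above could be read into a typed settings object along the following lines. This is a sketch under stated assumptions, not the code behind `vi_rag.config.settings`: the `Settings` class and `load_settings` function are hypothetical, the defaults are copied from the table, and the check that the two dimensions match assumes `QDRANT_VECTOR_DIM` is the vector size of the Qdrant collection. The sketch reads from `os.environ` and does not parse the .env file itself.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    GEMINI_API_KEY: str
    QDRANT_API_KEY: str
    QDRANT_URL: str
    QDRANT_COLLECTION_NAME: str = "rag_documents"
    EMBEDDING_DIM: int = 768
    QDRANT_VECTOR_DIM: int = 768
    VECTOR_TOP_K: int = 5


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (default: os.environ), failing early
    with a clear message when a required variable is missing."""
    env = os.environ if env is None else env
    required = ("GEMINI_API_KEY", "QDRANT_API_KEY", "QDRANT_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = Settings(
        GEMINI_API_KEY=env["GEMINI_API_KEY"],
        QDRANT_API_KEY=env["QDRANT_API_KEY"],
        QDRANT_URL=env["QDRANT_URL"],
        QDRANT_COLLECTION_NAME=env.get("QDRANT_COLLECTION_NAME", "rag_documents"),
        EMBEDDING_DIM=int(env.get("EMBEDDING_DIM", 768)),
        QDRANT_VECTOR_DIM=int(env.get("QDRANT_VECTOR_DIM", 768)),
        VECTOR_TOP_K=int(env.get("VECTOR_TOP_K", 5)),
    )
    # Assumption: the collection's vector size must equal the embedding size.
    if settings.EMBEDDING_DIM != settings.QDRANT_VECTOR_DIM:
        raise RuntimeError("EMBEDDING_DIM and QDRANT_VECTOR_DIM must match")
    return settings
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests without touching real credentials.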

Chunking Parameters

DocumentLoader(
    file_path="document.pdf",
    auto_chunk=True,
    parent_size=2000,    # Parent chunk size
    child_size=400,      # Child chunk size
    overlap=50           # Overlap between chunks
)
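When tuning these parameters it helps to estimate how many chunks a given setting produces. The sketch below works through the arithmetic for a simple sliding window, where each chunk starts `size - overlap` units after the previous one. It assumes sizes are counted in characters, since the unit is not stated here, and `chunk_count` is a hypothetical helper rather than a Vi-RAG function.

```python
import math


def chunk_count(text_len: int, size: int, overlap: int) -> int:
    """Number of sliding windows of length `size` (stride `size - overlap`)
    needed to cover `text_len` units of text."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if text_len <= 0:
        return 0
    if text_len <= size:
        return 1
    # The first window covers `size` units; each further window adds `stride`.
    return math.ceil((text_len - size) / (size - overlap)) + 1
```

Under these assumptions, a larger overlap shrinks the stride and therefore increases the number of chunks to embed and store, which is the main cost to weigh against the extra context that overlap preserves.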

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📧 Contact

Author: Quoc Long
GitHub: NOT-erorr/PBL_2025_Vi-RAG_framework

🙏 Acknowledgments

Google Gemini API for embeddings and generation
Qdrant for vector storage
Contributors and testers

Made with ❤️ for Vietnamese NLP community

Release history

0.1.4 (this version): Jan 31, 2026
0.1.3: Jan 30, 2026
0.1.2: Jan 30, 2026
0.1.1: Jan 30, 2026
0.1.0: Jan 30, 2026

Download files

Source Distribution

vi_rag-0.1.4.tar.gz (28.0 kB)

Uploaded Jan 31, 2026 Source

Built Distribution

vi_rag-0.1.4-py3-none-any.whl (27.5 kB)

Uploaded Jan 31, 2026 Python 3

File details

Details for the file vi_rag-0.1.4.tar.gz.

File metadata

Download URL: vi_rag-0.1.4.tar.gz
Upload date: Jan 31, 2026
Size: 28.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for vi_rag-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`77c5deb4aaeb28d0eac95b93a2413c4a23ca795122e025f583185d5279315182`
MD5	`ccafc409c31dd96118ac849e1a1f0c7d`
BLAKE2b-256	`f933a19211aa74a6f368894228b9c3b9ffc92626f2a36fe40477672375ffbcfb`

File details

Details for the file vi_rag-0.1.4-py3-none-any.whl.

File metadata

Download URL: vi_rag-0.1.4-py3-none-any.whl
Upload date: Jan 31, 2026
Size: 27.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for vi_rag-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`eed5899708902801b8b90502c1c8aef10bbd21024d88cf14b50664d8dd10aa0c`
MD5	`b814ebb8e6bf02cc90b86e2adcb9e35a`
BLAKE2b-256	`c30f31609fa776f2934d117a2eb18c7f6e6e41d99f66fadcdedfcb6cf00f48f7`
