A utility package for RAG operations

WhiskerRAG

WhiskerRAG is a RAG (Retrieval-Augmented Generation) toolkit developed for the PeterCat and Whisker projects. It provides a complete set of RAG-related type definitions and method implementations.

Features

  • Domain-model types for general-purpose RAG: Task, Knowledge, Chunk, Tenant, and Space (a knowledge-base space).
  • Interface descriptions for Whisker RAG plugins.
  • Resource managers for GitHub repositories and S3.
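
To make the relationships between these domain types concrete, here is a minimal sketch using plain dataclasses. Every class, field, and the `run_split_task` helper below is a hypothetical illustration; none of them is the actual definition shipped in whiskerrag_types.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Tenant:
    """Hypothetical tenant: the owner of one or more spaces."""
    tenant_id: str
    name: str


@dataclass
class Chunk:
    """Hypothetical chunk: a retrievable segment of a knowledge item."""
    chunk_id: str
    knowledge_id: str
    text: str


@dataclass
class Knowledge:
    """Hypothetical knowledge item: a source text registered in a space."""
    knowledge_id: str
    space_id: str
    source: str
    chunks: List[Chunk] = field(default_factory=list)


@dataclass
class Space:
    """Hypothetical knowledge-base space owned by a single tenant."""
    space_id: str
    tenant_id: str
    knowledge: Dict[str, Knowledge] = field(default_factory=dict)


@dataclass
class Task:
    """Hypothetical task: a unit of work applied to one knowledge item."""
    task_id: str
    knowledge_id: str
    status: str = "pending"


def run_split_task(task: Task, space: Space, size: int) -> Task:
    """Split the knowledge text into fixed-size chunks and mark the task done."""
    item = space.knowledge[task.knowledge_id]
    item.chunks = [
        Chunk(f"{item.knowledge_id}-{i}", item.knowledge_id, item.source[start:start + size])
        for i, start in enumerate(range(0, len(item.source), size))
    ]
    task.status = "success"
    return task
```

In this sketch a Tenant owns Spaces, a Space holds Knowledge items, a Task processes one Knowledge item, and the result of that processing is a list of Chunks.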

Installation

Install with pip:

pip install whiskerrag
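
To confirm that the installation succeeded, one option is to query the installed distribution's metadata from the standard library. This is a small sketch; `installed_version` is a hypothetical helper name, not part of whiskerrag.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string when whiskerrag is installed, otherwise None.
print(installed_version("whiskerrag"))
```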

Quick start

The whiskerrag package contains three submodules: whiskerrag_utils, whiskerrag_client, and whiskerrag_types. Each serves a different purpose:

whiskerrag_utils

Common methods for building a RAG system:

from whiskerrag_utils import loader, embedding, retriever
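
The module names suggest the usual load, embed, retrieve pipeline. The sketch below illustrates that flow with toy stand-ins written in plain Python; the `load`, `embed`, and `retrieve` functions here are hypothetical and do not call or mirror the real whiskerrag_utils API, and the bag-of-words "embedding" is a deliberately simple substitute for a real embedding model.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def load(text: str) -> List[str]:
    """Loader stand-in: one chunk per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def embed(text: str) -> Dict[str, float]:
    """Embedding stand-in: L2-normalised bag-of-words counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()} if norm else {}


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[Tuple[float, str]]:
    """Retriever stand-in: rank chunks by cosine similarity to the query."""
    q = embed(query)
    scored = []
    for chunk in chunks:
        vec = embed(chunk)
        # Both vectors are unit length, so the dot product is the cosine similarity.
        score = sum(weight * vec.get(word, 0.0) for word, weight in q.items())
        scored.append((score, chunk))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


chunks = load("cats purr loudly\n\ndogs bark at night\n")
best = retrieve("why do dogs bark", chunks)
```

A production system would swap each stand-in for a real document loader, an embedding model, and a vector store, but the data flow stays the same.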

whiskerrag_client

Exposes the RAG system's services as a Python SDK.

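import asyncio

from whiskerrag_client import APIClient
# Assumption: the request models are importable from whiskerrag_types.model;
# adjust the import path if your installed version places them elsewhere.
from whiskerrag_types.model import (
    RetrievalByKnowledgeRequest,
    RetrievalBySpaceRequest,
)


async def main() -> None:
    api_client = APIClient(
        base_url="https://api.example.com",
        token="your_token_here",
    )

    # Retrieve the chunks that belong to a single knowledge item.
    knowledge_chunks = await api_client.retrieval.retrieve_knowledge_content(
        RetrievalByKnowledgeRequest(knowledge_id="your knowledge uuid here")
    )

    # Retrieve the chunks stored in a space.
    space_chunks = await api_client.retrieval.retrieve_space_content(
        RetrievalBySpaceRequest(space_id="your space id here")
    )

    # List chunks page by page, filtered by status.
    chunk_list = await api_client.chunk.get_chunk_list(
        page=1,
        size=10,
        filters={"status": "active"},
    )

    # List tasks, then fetch the details of one task.
    task_list = await api_client.task.get_task_list(page=1, size=10)
    task_detail = await api_client.task.get_task_detail("task_id_here")


# The calls above are coroutines, so they must run inside an event loop.
asyncio.run(main())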
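
The list endpoints above take `page` and `size`, so a caller that wants every record has to loop over pages. The following is a sketch of one way to do that, not part of the SDK: `iter_pages` is a hypothetical helper, `fake_fetch` is a stand-in for a real paginated call, and the code assumes the fetch function returns a plain list, which may differ from the shape the real client returns.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable, List


async def iter_pages(
    fetch_page: Callable[[int, int], Awaitable[List[Any]]],
    size: int = 10,
) -> AsyncIterator[Any]:
    """Yield items page by page until a short or empty page is returned."""
    page = 1
    while True:
        items = await fetch_page(page, size)
        for item in items:
            yield item
        if len(items) < size:
            break
        page += 1


async def fake_fetch(page: int, size: int) -> List[int]:
    """Hypothetical stand-in for a paginated endpoint holding 25 records."""
    records = list(range(25))
    start = (page - 1) * size
    return records[start:start + size]


async def collect_all() -> List[int]:
    """Drain every page from the fake endpoint into one list."""
    return [item async for item in iter_pages(fake_fetch, size=10)]
```

To use it against a real service, pass a small wrapper coroutine that calls the client method and unwraps whatever envelope it returns into a list.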

whiskerrag_types

Type hints and interfaces that support development:

from whiskerrag_types.interface import DBPluginInterface, TaskEngineInterface
from whiskerrag_types.model import Knowledge, Task, Tenant, PageParams, PageResponse
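
As an illustration of how a plugin interface and a paged response type can work together, here is a self-contained sketch. `KnowledgeStore`, `InMemoryStore`, and `Page` are hypothetical names invented for this example; they are not the definitions of `DBPluginInterface`, `PageParams`, or `PageResponse`, whose actual methods and fields should be checked in the package source.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class Page(Generic[T]):
    """Hypothetical page envelope: the items plus paging metadata."""
    items: List[T]
    total: int
    page: int
    size: int


class KnowledgeStore(ABC):
    """Hypothetical storage plugin contract that a backend must implement."""

    @abstractmethod
    def save(self, knowledge_id: str, text: str) -> None:
        ...

    @abstractmethod
    def list_ids(self, page: int, size: int) -> Page[str]:
        ...


class InMemoryStore(KnowledgeStore):
    """Toy backend that keeps everything in a dict, useful for tests."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def save(self, knowledge_id: str, text: str) -> None:
        self._rows[knowledge_id] = text

    def list_ids(self, page: int, size: int) -> Page[str]:
        ids = sorted(self._rows)
        start = (page - 1) * size  # pages are 1-indexed
        return Page(items=ids[start:start + size], total=len(ids), page=page, size=size)
```

Coding against the abstract class keeps the rest of the system independent of the storage backend, which is the usual motivation for a plugin interface.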

Developer guide

Environment setup

  1. Clone the project
git clone https://github.com/petercat-ai/whiskerrag_toolkit.git
cd whiskerrag_toolkit
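  2. Create and activate a virtual environment
# Show the current Poetry configuration
poetry config --list

# Configure Poetry to create the virtual environment inside the project
poetry config virtualenvs.create true
poetry config virtualenvs.in-project true

poetry env use python3.10

# Activate the virtual environment
source .venv/bin/activate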
  3. Install dependencies
# Install the project dependencies
poetry install
# Install the pre-commit hooks
pre-commit install
  4. Run the tests
# Run all tests
poetry run pytest
# Run a single test file
poetry run pytest tests/test_loader.py
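  5. Common Poetry commands
# Install dependencies
poetry install

# Add a new dependency
poetry add package_name

# Add a new dev dependency (--group dev replaces the deprecated --dev flag)
poetry add --group dev package_name

# Update dependencies
poetry update

# Show environment information
poetry env info

# List installed packages
poetry show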

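Development workflow

  1. Create a new branch.
  2. Develop the feature and add unit tests to keep code quality high. Unit test coverage must not fall below 80%.
  3. Commit the code and open a Pull Request.
  4. Wait for code review and revise according to the feedback.
  5. Merge the Pull Request.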
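
As a sketch of what step 2 might look like in practice, the example below pairs a small function with pytest-style tests that cover its normal path, an edge case, and its error path. `split_text` is a hypothetical function written for this illustration, not code from the repository.

```python
from typing import List


def split_text(text: str, size: int) -> List[str]:
    """Hypothetical function under test: fixed-size character chunks."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]


def test_split_text_covers_whole_input() -> None:
    chunks = split_text("abcdefg", 3)
    assert chunks == ["abc", "def", "g"]
    assert "".join(chunks) == "abcdefg"


def test_split_text_empty_input() -> None:
    assert split_text("", 3) == []


def test_split_text_rejects_non_positive_size() -> None:
    try:
        split_text("abc", 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

pytest collects functions whose names start with `test_`. If the pytest-cov plugin is available in the environment (an assumption, since the project's test configuration is not shown here), `poetry run pytest --cov=src --cov-fail-under=80` would fail the run when coverage drops below the 80% threshold.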

Project structure

whiskerrag_toolkit/
├── src/
│   ├── whiskerrag_utils/
│   ├── whiskerrag_types/
│   └── whiskerrag_client/
└── pyproject.toml

Contributing

  1. Fork this repository.
  2. Create a feature branch (make branch name=feature/amazing-feature).
  3. Commit your changes (git commit -m 'Add some amazing feature').
  4. Push to the branch (git push origin feature/amazing-feature).
  5. Open a Pull Request.

License

This project is released under the MIT License. See the LICENSE file for details.

Contact

Project maintainer: @petercat-ai

Project link: https://github.com/petercat-ai/whiskerrag_toolkit
