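(v0.1.0: the three most-used functions are now asynchronous, and three common data-structure functions have been added.) A HippoRAG2 variant with torch, vllm, and even openai removed; all functionality runs through the SiliconFlow API and the local CPU.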

Project description

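HippoRAG Lite

Many applications need a lightweight module, and an API plus a local CPU can already deliver good results, so we made a number of modifications to the HippoRAG project. We also added in-depth support for the Chinese community (SiliconFlow API). The package is still far from complete, but it is usable.

  • v0.0.2: for modularity, the dependency on environment variables was removed; settings are now passed in explicitly as parameters.
  • v0.0.3: the prompts were translated into Chinese.
  • v0.1.0: the three most-used functions (index, delete, retrieve) are now asynchronous; three common data-structure functions (save, size, clear) were added (asynchronous; in the example below, save and clear are awaited while size is called directly); an iterator was also defined (it outputs a dictionary of the stored documents and their hash values).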
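Since v0.0.2 the library no longer reads environment variables itself, so a caller who still keeps the key in the environment has to read it and pass it in. The following is a minimal sketch of that step, not part of hipporag-lite: the helper name `hipporag_kwargs_from_env` and the variable name `SILICONFLOW_API_KEY` are hypothetical, the keyword names and URLs are copied from the quick-start example, and the `Bearer ` prefix is assumed only because the example passes a key in that form.

```python
import os


def hipporag_kwargs_from_env(env=None, save_dir="outputs"):
    """Build constructor keyword arguments from an environment mapping.

    Hypothetical helper: SILICONFLOW_API_KEY is an assumed variable name,
    not something hipporag-lite defines or reads.
    """
    env = os.environ if env is None else env
    key = env.get("SILICONFLOW_API_KEY", "").strip()
    if not key:
        raise ValueError("SILICONFLOW_API_KEY is not set")
    # Assumption: the quick-start example passes the key as "Bearer sk-...",
    # so the prefix is added here when it is missing.
    if not key.startswith("Bearer "):
        key = "Bearer " + key
    return {
        "api_key": key,
        "save_dir": save_dir,
        "llm_model_name": "Pro/deepseek-ai/DeepSeek-V3",
        "embedding_model_name": "Qwen/Qwen3-Embedding-8B",
        "llm_base_url": "https://api.siliconflow.cn/v1/chat/completions",
        "embedding_base_url": "https://api.siliconflow.cn/v1/embeddings",
    }
```

With this in place, the instance could be created as `HippoRAG(**hipporag_kwargs_from_env())`, which keeps the secret out of the source file while still passing every setting explicitly.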

Quick start

conda create -n hipporag python=3.10

conda activate hipporag

pip install hipporag-lite

Example:

import multiprocessing
import asyncio

# Define an async main function that performs all operations
async def main():
    from hipporag_lite import HippoRAG

    # Prepare the dataset
    docs = [
        "Oliver Badman is a politician.",
        "George Rankin is a politician.",
        "Thomas Marwick is a politician.",
        "Cinderella attended the royal ball.",
        "The prince used the lost glass slipper to search the kingdom.",
        "When the slipper fit perfectly, Cinderella was reunited with the prince.",
        "Erik Hort's birthplace is Montebello.",
        "Marina is born in Minsk.",
        "Montebello is a part of Rockland County."
    ]

    save_dir = 'outputs'
    llm_model_name = 'Pro/deepseek-ai/DeepSeek-V3'
    embedding_model_name = 'Qwen/Qwen3-Embedding-8B'
    llm_base_url = 'https://api.siliconflow.cn/v1/chat/completions'
    embedding_base_url = 'https://api.siliconflow.cn/v1/embeddings'

    try:
        hipporag = HippoRAG(
            api_key="Bearer sk-...", # your SiliconFlow api_key
            save_dir=save_dir,
            llm_model_name=llm_model_name,
            embedding_model_name=embedding_model_name,
            llm_base_url=llm_base_url,
            embedding_base_url=embedding_base_url
        )
        print("HippoRAG instance created")
        print(f"Initial index size: {hipporag.size()} documents")
    except Exception as e:
        print(f"Failed to create HippoRAG instance: {e}")
        return

    # Index the documents asynchronously
    try:
        await hipporag.index(docs=docs)  # await the async method
        print(f"Indexing finished, current size: {hipporag.size()} documents")
    except Exception as e:
        print(f"Indexing failed: {e}")

    try:
        await hipporag.save()  # async save
        print("System state saved")
    except Exception as e:
        print(f"Saving failed: {e}")

    # Run the queries
    queries = [
        "What is George Rankin's occupation?",
        "How did Cinderella reach her happy ending?",
        "What county is Erik Hort's birthplace a part of?"
    ]

    try:
        retrieval_results = await hipporag.retrieve(queries=queries, num_to_retrieve=2)
        print(f"Retrieval finished: processed {len(retrieval_results)} queries")
    except Exception as e:
        print(f"Retrieval failed: {e}")

    # Delete documents
    docs_to_delete = [
        "Oliver Badman is a politician.",
        "Thomas Marwick is a politician."
    ]
    
    try:
        print(f"Index size before deletion: {hipporag.size()} documents")
        await hipporag.delete(docs_to_delete=docs_to_delete)  # async delete
        print(f"Deletion finished: removed {len(docs_to_delete)} documents, current size: {hipporag.size()} documents")
    except Exception as e:
        print(f"Deletion failed: {e}")

    # Print the stored documents (iterator)
    try:
        for text, hash_id in hipporag:
            print(text)
        print("Finished printing documents")
    except Exception as e:
        print(f"Printing failed: {e}")

    # Clear the system
    try:
        print(f"Index size before clearing: {hipporag.size()} documents")
        await hipporag.clear()  # async clear
        print(f"System cleared, current size: {hipporag.size()} documents")
    except Exception as e:
        print(f"Clearing failed: {e}")
    
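    # Verify the final state
    try:
        if hipporag.size() == 0:
            print("Check: document store is empty")
        else:
            print(f"Warning: {hipporag.size()} documents remain after clearing")

        if hipporag.graph.vcount() == 0:
            print("Check: knowledge graph has been reset")
    except Exception as e:
        print(f"State verification failed: {e}")

    print("All operations finished")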

if __name__ == '__main__':
    multiprocessing.freeze_support()
    # Run the async main function in the main thread
    asyncio.run(main())
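The example wraps each awaited call in a single try/except. Because the package relies on a remote API, a call may fail or hang for transient reasons, and one possible refinement is to bound each call with a timeout and retry it a few times. The helper below is a sketch under that assumption; `call_with_retries` and its parameters are hypothetical names, not part of hipporag-lite, and this README does not state whether repeating a partially completed index or delete call is safe.

```python
import asyncio


async def call_with_retries(make_call, attempts=3, timeout=30.0, delay=1.0):
    """Await make_call() up to `attempts` times, each bounded by `timeout` seconds.

    make_call is a zero-argument callable returning a fresh awaitable,
    because an awaitable cannot be awaited a second time.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return await asyncio.wait_for(make_call(), timeout=timeout)
        except Exception as exc:  # also covers the timeout raised by wait_for
            last_error = exc
            if attempt < attempts:
                await asyncio.sleep(delay)  # fixed pause before the next try
    raise last_error
```

Inside `main()` this could be used as `await call_with_retries(lambda: hipporag.retrieve(queries=queries, num_to_retrieve=2))`. The lambda matters: it creates a new awaitable for every attempt instead of reusing one that has already been consumed.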

Original project homepage: https://github.com/OSU-NLP-Group/HippoRAG


Download files

Source Distribution

hipporag_lite-0.1.0rc1.tar.gz (65.5 kB)

Uploaded Source

Built Distribution

hipporag_lite-0.1.0rc1-py3-none-any.whl (80.2 kB)

Uploaded Python 3

File details

Details for the file hipporag_lite-0.1.0rc1.tar.gz.

File metadata

  • Download URL: hipporag_lite-0.1.0rc1.tar.gz
  • Upload date:
  • Size: 65.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.4 CPython/3.10.18 Linux/6.11.0-1018-azure

File hashes

Hashes for hipporag_lite-0.1.0rc1.tar.gz
Algorithm      Hash digest
SHA256         c71afabe686be79f0ce19fe2c9e15884e64bd743c19d3494ffdf496f5cd7dd9d
MD5            0dc5a95ef2275a27503144ca944240e9
BLAKE2b-256    5b1c94ac66753aa2890f8cb739f81678fd90ae65019468c96a6a3d8749fd5aee

File details

Details for the file hipporag_lite-0.1.0rc1-py3-none-any.whl.

File metadata

  • Download URL: hipporag_lite-0.1.0rc1-py3-none-any.whl
  • Upload date:
  • Size: 80.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.4 CPython/3.10.18 Linux/6.11.0-1018-azure

File hashes

Hashes for hipporag_lite-0.1.0rc1-py3-none-any.whl
Algorithm      Hash digest
SHA256         c308f7893de7dfec18015a44ae6d104430d17a3f70ced81ee674b7980dcc38ec
MD5            e9c061bcd48f5c9a4a664a33e14317be
BLAKE2b-256    2b16c0051d29720789dfee252e9383b9b4493054e0dbcd0feec1c4f227c83abb
