Overcoming Group Chat Scenarios with LLM-based Technical Assistance

🎚️ Upgrade

HuixiangDou2(ACL25) is a GraphRAG solution whose effectiveness has been demonstrated in the plant-science domain and that contributed to the cover paper in Cell Molecular Plant. If you work outside computer science, give the new release a try.


HuixiangDou1 is an LLM-based professional knowledge assistant.

Advantages:

  1. A three-stage pipeline of preprocessing, rejection, and response
  2. No training required; CPU-only, 2GB, and 10GB configurations are available
  3. A complete suite of Web, Android, and pipeline source code that is industrial-grade and commercially viable
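
To make the three-stage design in item 1 more concrete, here is a minimal sketch of a preprocess, rejection, response flow. It is not HuixiangDou's implementation: the names `Reply`, `preprocess`, `should_reject`, and `run_pipeline` are hypothetical, and a keyword match stands in for a real relevance check.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Reply:
    state: str            # "rejected" or "success"
    text: Optional[str]   # None when the query was rejected


def preprocess(query: str) -> str:
    """Stage 1: normalize the incoming message (here: collapse whitespace)."""
    return " ".join(query.split())


def should_reject(query: str, keywords: List[str]) -> bool:
    """Stage 2: reject queries unrelated to the knowledge base.

    Hypothetical stand-in: a keyword match replaces a real relevance score.
    """
    lowered = query.lower()
    return not any(keyword.lower() in lowered for keyword in keywords)


def run_pipeline(query: str, keywords: List[str],
                 respond: Callable[[str], str]) -> Reply:
    """Run preprocess -> rejection -> response and return the outcome."""
    cleaned = preprocess(query)
    if should_reject(cleaned, keywords):
        return Reply(state="rejected", text=None)
    # Stage 3: only queries that pass the rejection stage reach the responder.
    return Reply(state="success", text=respond(cleaned))


if __name__ == "__main__":
    topics = ["garden"]
    answer = lambda q: f"(an LLM answer to: {q})"
    print(run_pipeline("What is in the   Hundred-Plant Garden?", topics, answer))
    print(run_pipeline("How is the weather today?", topics, answer))
```

The point of the ordering is that the responder, which is the expensive step, is never called for messages the rejection stage filters out.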

See where HuixiangDou is running and the current status of its public services:

  • readthedocs ChatWithAI (cpu-only) is available
  • OpenXLab is using GPU and under continuous maintenance
  • The WeChat bot: WeChat integration has an associated cost. All code has been verified to work for one year. Please deploy it yourself, using either the free or the commercial version.

If this helps you, please give it a star ⭐

🔆 New Features

Our Web version has been released to OpenXLab, where you can create a knowledge base, update positive and negative examples, turn on web search, test chat, and integrate it into Feishu/WeChat groups. See BiliBili and YouTube!

The Web version's API for Android also supports other devices. See the Python sample code.

📖 Support Status

LLM File Format Retrieval Method Integration Preprocessing
  • excel
  • html
  • markdown
  • pdf
  • ppt
  • txt
  • word

📦 Hardware Requirements

The following are the GPU memory requirements for different features. The configurations differ only in which options are turned on.

Configuration Example | GPU mem Requirements | Description | Verified on Linux
config-cpu.ini | - | Use siliconcloud API for text only |
[Standard Edition] config.ini | 2GB | Use openai API (such as kimi, deepseek and stepfun) to search for text only |
config-multimodal.ini | 10GB | Use openai API for LLM, image and text retrieval |

🔥 Running the Standard Edition

We use the Standard Edition (locally running LLM, text retrieval) as the introductory example. The other editions differ only in their configuration options.

I. Download and install dependencies

Agree to the BCE model agreement, then log in to Hugging Face:

huggingface-cli login

Install dependencies

# parsing `word` format requirements
apt update
apt install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libpulse-dev
# python requirements
pip install -r requirements.txt
# For python3.8, install faiss-gpu instead of faiss

II. Create knowledge base

We use some novels to build the knowledge base and the filtering questions. If you have your own documents, just put them under repodir.

Copy and execute all the following commands (including the '#' symbol).

# Download the knowledge base. We only take some documents as an example; you can put any of your own documents under `repodir`
cd HuixiangDou
mkdir repodir
cp -rf resource/data* repodir/

# Build knowledge base, this will save the features of repodir to workdir, and update the positive and negative example thresholds into `config.ini`
mkdir workdir
python3 -m huixiangdou.services.store

# You can also build knowledge base from QA pairs (CSV or JSON format)
# CSV: First column is key (question), second column is value (answer)
# JSON: {"question1": "answer1", "question2": "answer2", ...}
# python3 -m huixiangdou.services.store --qa-pair resource/data/qa_pair.csv
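
The two QA-pair layouts described in the comments above carry the same information, so one can be converted to the other. The following sketch turns CSV text into the JSON mapping. It assumes the CSV has no header row, since the source only says the first column is the question and the second is the answer. The function name `qa_csv_to_dict` is hypothetical and not part of HuixiangDou.

```python
import csv
import io
import json


def qa_csv_to_dict(csv_text: str) -> dict:
    """Map the first CSV column (question) to the second (answer)."""
    pairs = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete rows
        pairs[row[0].strip()] = row[1].strip()
    return pairs


if __name__ == "__main__":
    sample = 'question1,answer1\n"question2, with comma",answer2\n'
    # ensure_ascii=False keeps non-English questions readable in the output.
    print(json.dumps(qa_csv_to_dict(sample), ensure_ascii=False, indent=2))
```

Using the `csv` module rather than splitting on commas keeps quoted questions that contain commas intact.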

III. Set up LLM API and test

Set the model and api-key in config.ini. If running LLM locally, we recommend using vllm.

vllm serve /path/to/Qwen-2.5-7B-Instruct --enable-prefix-caching --served-model-name Qwen2.5-7B-Instruct

Here is an example of the configured config.ini:

[llm.server]
remote_type = "kimi"
remote_api_key = "sk-dp3GriuhhLXnYo0KUuWbFUWWKOXXXXXXXXXX"
remote_llm_model = "auto"

# remote_type = "step"
# remote_api_key = "5CpPyYNPhQMkIzs5SYfcdbTHXq3a72H5XXXXXXXXXXXXX"
# remote_llm_model = "auto"

# remote_type = "deepseek"
# remote_api_key = "sk-86db9a205aa9422XXXXXXXXXXXXXX"
# remote_llm_model = "deepseek-chat"

# remote_type = "vllm"
# remote_api_key = "EMPTY"
# remote_llm_model = "Qwen2.5-7B-Instruct"

# remote_type = "siliconcloud"
# remote_api_key = "sk-xxxxxxxxxxxxx"
# remote_llm_model = "alibaba/Qwen1.5-110B-Chat"

# remote_type = "ppio"
# remote_api_key = "sk-xxxxxxxxxxxxx"
# remote_llm_model = "thudm/glm-4-9b-chat"
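
Before running the test, it can help to check that exactly one provider block is active and that no key is empty. The sketch below does this with the standard-library `configparser`, stripping the surrounding quotes from each value. This is only an illustration: HuixiangDou may read the file with a different parser, and `load_llm_settings` is a hypothetical helper.

```python
import configparser

# A shortened copy of the example above; the key is a placeholder.
SAMPLE = '''
[llm.server]
remote_type = "kimi"
remote_api_key = "sk-XXXXXXXX"
remote_llm_model = "auto"

# remote_type = "deepseek"
'''

REQUIRED_KEYS = ("remote_type", "remote_api_key", "remote_llm_model")


def load_llm_settings(text: str) -> dict:
    """Return the three [llm.server] settings with quotes removed."""
    # interpolation=None so a '%' inside an API key is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["llm.server"]
    settings = {key: section.get(key, fallback="").strip('"')
                for key in REQUIRED_KEYS}
    missing = [key for key, value in settings.items() if not value]
    if missing:
        raise ValueError(f"missing or empty settings: {missing}")
    return settings


if __name__ == "__main__":
    print(load_llm_settings(SAMPLE))
```

With its default strict mode, `configparser` also raises `DuplicateOptionError` if two provider blocks are left uncommented at once, which catches a common editing mistake.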

Then run the test:

# Respond to questions related to the Hundred-Plant Garden (related to the knowledge base), but do not respond to weather questions.
python3 -m huixiangdou.main

+-----------------------+---------+--------------------------------+-----------------+
|         Query         |  State  |         Reply                  |   References    |
+=======================+=========+================================+=================+
| What is in the Hundred-Plant Garden? | success | The Hundred-Plant Garden has a rich variety of natural landscapes and life... | installation.md |
--------------------------------------------------------------------------------------
| How is the weather today?         | Init state| ..                           |                 |
+-----------------------+---------+--------------------------------+-----------------+
🔆 Input your question here, type `bye` for exit:
..

💡 You can also run a simple web UI with Gradio:

python3 -m huixiangdou.gradio_ui

Or run a server that listens on port 23333. The default pipeline is chat_with_repo:

python3 -m huixiangdou.api_server

# Test the async API
curl -X POST http://127.0.0.1:23333/huixiangdou_stream -H "Content-Type: application/json" -d '{"text": "how to install mmpose","image": ""}'
# Test the sync API
curl -X POST http://127.0.0.1:23333/huixiangdou_inference -H "Content-Type: application/json" -d '{"text": "how to install mmpose","image": ""}'
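
The sync call can also be made from Python. This sketch uses only the standard library and mirrors the curl command: same URL, header, and JSON body. The helper names `build_request` and `ask` are hypothetical. The response is returned as raw text because its format is not described here.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:23333"  # default address from the curl examples


def build_request(text: str, image: str = "",
                  endpoint: str = "huixiangdou_inference") -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    body = json.dumps({"text": text, "image": image}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(text: str, timeout: float = 60.0) -> str:
    """Send one question to the sync endpoint and return the raw response body."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the server from `python3 -m huixiangdou.api_server` running, `ask("how to install mmpose")` sends the same question as the curl example.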

Please update the repodir documents, good_questions and bad_questions, and try your own domain knowledge (medical, financial, power, etc.).
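
To see why the good and bad questions matter, consider how a rejection threshold could be derived from them. The sketch below is an assumption for illustration only, not what `huixiangdou.services.store` does: it supposes each question has a similarity score against the knowledge base and picks the cut-off that classifies the most examples correctly. The scores and function names are hypothetical.

```python
from typing import Sequence


def accuracy_at(threshold: float, good: Sequence[float],
                bad: Sequence[float]) -> float:
    """Fraction of examples handled correctly if scores >= threshold are answered."""
    answered_good = sum(score >= threshold for score in good)
    rejected_bad = sum(score < threshold for score in bad)
    return (answered_good + rejected_bad) / (len(good) + len(bad))


def pick_threshold(good: Sequence[float], bad: Sequence[float]) -> float:
    """Try every observed score as a cut-off and keep the most accurate one.

    Ties go to the lowest candidate, which answers more questions.
    """
    if not good or not bad:
        raise ValueError("need at least one good and one bad example")
    candidates = sorted(set(good) | set(bad))
    return max(candidates, key=lambda t: accuracy_at(t, good, bad))


if __name__ == "__main__":
    good_scores = [0.62, 0.55, 0.48]   # hypothetical scores of good questions
    bad_scores = [0.21, 0.30, 0.41]    # hypothetical scores of bad questions
    print(pick_threshold(good_scores, bad_scores))
```

With these sample scores the chosen cut-off is 0.48, the lowest score among the good questions, because every bad question scores below it. Examples that resemble real traffic therefore move the threshold toward the behavior you want.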

IV. Integration

To Feishu, WeChat group

To web front and backend

We provide TypeScript front-end and Python back-end source code:

  • Multi-tenant management supported
  • Zero programming access to Feishu and WeChat
  • k8s friendly

This is the same as the OpenXLab app. Please read the web deployment document.

To readthedocs.io

Try the button at the bottom right of the page, and see the document.

🍴 Other Configurations

CPU-only Edition

If there is no GPU available, model inference can be completed using the siliconcloud API.

Taking docker miniconda+Python3.11 as an example, install CPU dependencies and run:

# Start container
docker run -v /path/to/huixiangdou:/huixiangdou -p 7860:7860 -p 23333:23333 -it continuumio/miniconda3 /bin/bash
# Install dependencies
apt update
apt install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libpulse-dev
python3 -m pip install -r requirements-cpu.txt
# Build the knowledge base
python3 -m huixiangdou.services.store --config_path config-cpu.ini
# Q&A test
python3 -m huixiangdou.main --config_path config-cpu.ini
# gradio UI
python3 -m huixiangdou.gradio_ui --config_path config-cpu.ini

If you find the installation too slow, a pre-installed image is provided on Docker Hub. Simply use it in place of the base image when starting the container.

10G Multimodal Edition

If you have 10GB of GPU memory, you can also enable image and text retrieval. Just modify the models used in config.ini.

# config-multimodal.ini
# !!! Download `https://huggingface.co/BAAI/bge-visualized/blob/main/Visualized_m3.pth` to `bge-m3` folder !!!
embedding_model_path = "BAAI/bge-m3"
reranker_model_path = "BAAI/bge-reranker-v2-minicpm-layerwise"

Note:

Run Gradio to test. See the image and text retrieval results here.

python3 tests/test_query_gradio.py

Furthermore

Please read the following topics:

🛠️ FAQ

  1. What if the bot is too cold or too chatty?

    • Add the questions that should be answered in real scenarios to resource/good_questions.json, and the ones that should be rejected to resource/bad_questions.json.
    • Adjust the topic content in repodir so that the markdown documents in the main library contain no irrelevant content.

    Then re-run feature_store to update the thresholds and feature libraries.

    ⚠️ You can also modify reject_throttle in config.ini directly. Generally speaking, 0.5 is a high value and 0.2 is too low.

  2. Launch is normal, but it runs out of memory at runtime?

    Long-text inference with transformers-based LLMs requires more memory. In this case, apply kv cache quantization to the model (see the lmdeploy quantization description), then use docker to deploy the Hybrid LLM Service independently.

  3. No module named 'faiss.swigfaiss_avx2'

    Locate the installed faiss package:

    import faiss
    print(faiss.__file__)
    # /root/.conda/envs/InternLM2_Huixiangdou/lib/python3.10/site-packages/faiss/__init__.py
    

    Add a soft link:

    # cd your_python_path/site-packages/faiss
    cd /root/.conda/envs/InternLM2_Huixiangdou/lib/python3.10/site-packages/faiss/
    ln -s swigfaiss.py swigfaiss_avx2.py
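
The two manual steps for FAQ item 3 can also be done from Python without knowing the environment path in advance. This is a sketch under the same assumption as the shell commands, namely that `swigfaiss.py` exists in the installed package. The helper names `find_faiss_dir` and `link_avx2_module` are hypothetical.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_faiss_dir() -> Optional[Path]:
    """Return the installed faiss package directory, or None if it is absent."""
    spec = importlib.util.find_spec("faiss")
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at faiss/__init__.py, like faiss.__file__.
    return Path(spec.origin).parent


def link_avx2_module(package_dir: Path) -> Path:
    """Create swigfaiss_avx2.py as a relative symlink to swigfaiss.py."""
    source = package_dir / "swigfaiss.py"
    link = package_dir / "swigfaiss_avx2.py"
    if not source.exists():
        raise FileNotFoundError(f"{source} not found")
    if not (link.is_symlink() or link.exists()):
        link.symlink_to(source.name)  # relative target, like `ln -s`
    return link
```

Calling `link_avx2_module(find_faiss_dir())` in the affected environment performs the same link as the `ln -s` command. The call is safe to repeat because an existing link is left untouched.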
    

🍀 Acknowledgements

📝 Citation

@misc{kong2024huixiangdou,
      title={HuiXiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance},
      author={Huanjun Kong and Songyang Zhang and Jiaying Li and Min Xiao and Jun Xu and Kai Chen},
      year={2024},
      eprint={2401.08772},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{kong2024labelingsupervisedfinetuningdata,
      title={Labeling supervised fine-tuning data with the scaling law}, 
      author={Huanjun Kong},
      year={2024},
      eprint={2405.02817},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2405.02817}, 
}

@misc{kong2025huixiangdou2robustlyoptimizedgraphrag,
      title={HuixiangDou2: A Robustly Optimized GraphRAG Approach}, 
      author={Huanjun Kong and Zhefan Wang and Chenyang Wang and Zhe Ma and Nanqing Dong},
      year={2025},
      eprint={2503.06474},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2503.06474}, 
}
