Overcoming Group Chat Scenarios with LLM-based Technical Assistance

These details have not been verified by PyPI

Project links

Homepage

Project description

HuixiangDou is a domain-specific knowledge assistant based on an LLM. Features:

- Handles complex scenarios such as group chats, answering user questions without flooding the chat with messages.
- Proposes an algorithm pipeline for answering technical questions.
- Low deployment cost: the LLM only needs to meet 4 traits to answer most user questions; see the technical report.

View HuixiangDou inside.

📦 Hardware Requirements

The hardware requirements for each version are listed below. We suggest following this document in order: start with the basic version, then enable the advanced features step by step.

| Version | GPU Memory Requirements | Features |
| --- | --- | --- |
| Basic Version | 20GB | Answers basic domain knowledge questions, zero cost |
| Advanced Version | 40GB | Answers source-code-level questions, zero cost |
| Modified Version | 4GB | Uses the OpenAI API; operation involves cost |

🔥 Run

We take lmdeploy and mmpose as examples to explain how to deploy the knowledge assistant to a Feishu group chat.

STEP1. Establish Topic Feature Repository

Execute all the commands below (including the '#' symbol).

# Download the repo
git clone https://github.com/internlm/huixiangdou --depth=1 && cd huixiangdou

# Download chatting topics
mkdir repodir
git clone https://github.com/open-mmlab/mmpose --depth=1 repodir/mmpose
git clone https://github.com/internlm/lmdeploy --depth=1 repodir/lmdeploy

# Build a feature store
mkdir workdir # create a working directory
python3 -m pip install -r requirements.txt # install dependencies, python3.11 needs `conda install conda-forge::faiss-gpu`
python3 -m huixiangdou.service.feature_store # save the features of repodir to workdir

The first run automatically downloads the configuration of text2vec-large-chinese. You can also download it manually and update the model path in config.ini.

After this step, HuixiangDou can distinguish the user topics it should handle from the chitchat it should reject. Edit good_questions and bad_questions to try it on your own domain knowledge (medical, finance, electricity, etc.).

# Accept technical topics
process query: Does mmdeploy support mmtrack model conversion now?
process query: Are there any Chinese text to speech models?
# Reject chitchat
reject query: What to eat for lunch today?
reject query: How to make HuixiangDou?
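
The accept/reject behaviour above can be pictured as a similarity test against known good questions. The following is a minimal sketch of that idea only, not HuixiangDou's implementation: `embed` is a toy bag-of-words stand-in for a real text embedding model such as text2vec, and the function names and the `0.5` threshold are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (hypothetical stand-in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_process(query, good_questions, threshold=0.5):
    """Accept the query if it is close enough to any known good question."""
    query_vec = embed(query)
    best = max((cosine(query_vec, embed(q)) for q in good_questions), default=0.0)
    return best >= threshold


good_questions = ["Does mmdeploy support mmtrack model conversion now?"]
for query in ["Does mmdeploy support model conversion?", "What to eat for lunch today?"]:
    verdict = "process" if should_process(query, good_questions) else "reject"
    print(f"{verdict} query: {query}")
```

With a real embedding model the threshold would need to be re-tuned on your own good and bad questions.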

STEP2. Run Basic Technical Assistant

Configure free TOKEN

HuixiangDou uses a search engine. Click Serper to obtain a quota-limited TOKEN and fill it in config.ini.

# config.ini
..
[web_search]
x_api_key = "${YOUR-X-API-KEY}"
..
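
A common mistake at this step is leaving the `${YOUR-X-API-KEY}` placeholder in place. The sketch below shows one way to catch that before starting the service. It parses a config fragment with Python's standard `configparser` purely for illustration; how HuixiangDou itself loads config.ini is not shown here, and `read_api_key` is a hypothetical helper.

```python
import configparser


def read_api_key(config_text):
    """Return the web search key, rejecting empty values and unfilled placeholders."""
    # interpolation=None keeps "${...}" and "%" characters literal.
    parser = configparser.ConfigParser(
        interpolation=None, inline_comment_prefixes=("#",)
    )
    parser.read_string(config_text)
    key = parser.get("web_search", "x_api_key").strip().strip('"')
    if not key or key.startswith("${"):
        raise ValueError("x_api_key is still a placeholder; paste your Serper token")
    return key


sample = '[web_search]\nx_api_key = "${YOUR-X-API-KEY}"\n'
try:
    read_api_key(sample)
except ValueError as err:
    print(err)
```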

Test Q&A Effect

Make sure GPU memory exceeds 20GB (for example, an RTX 3090 or better). If you have less memory, adjust the setup as described in the FAQ.

The first run will automatically download the configuration of internlm2-7B.

Non-docker users. If you are not using a docker environment, you can start all services at once.

# standalone
python3 -m huixiangdou.main --standalone
..
ErrorCode.SUCCESS,
Query: Could you please advise if there is any good optimization method for video stream detection flickering caused by frame skipping?
Reply:
1. Frame rate control and frame skipping strategy are key to optimizing video stream detection performance, but you need to pay attention to the impact of frame skipping on detection results.
2. Multithreading processing and caching mechanism can improve detection efficiency, but you need to pay attention to the stability of detection results.
3. The use of sliding window method can reduce the impact of frame skipping and caching on detection results.

Docker users. If you are using docker, HuixiangDou's Hybrid LLM Service needs to be deployed separately.

# Start LLM service
python3 -m huixiangdou.service.llm_server_hybrid

Open a new terminal, set the host IP (not the container IP) in config.ini, then run:

# config.ini
[llm]
..
client_url = "http://10.140.24.142:8888/inference" # example

python3 -m huixiangdou.main

STEP3. Integrate into Feishu [Optional]

Click Create a Feishu Custom Robot to get the WEBHOOK_URL callback, and fill it into config.ini.

# config.ini
..
[frontend]
type = "lark"
webhook_url = "${YOUR-LARK-WEBHOOK-URL}"

Run the assistant. When it finishes, the technical assistant's reply is sent to the Feishu group chat.

python3 -m huixiangdou.main --standalone # for non-docker users
python3 -m huixiangdou.main # for docker users
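
To check the webhook independently of HuixiangDou, you can post a text message to it yourself. The sketch below uses only the Python standard library and the text message format accepted by Feishu custom bots; `build_text_message` and `send_to_feishu` are hypothetical helper names, not part of HuixiangDou.

```python
import json
import urllib.request


def build_text_message(text):
    """Build the JSON body for a plain text message to a Feishu custom bot."""
    return {"msg_type": "text", "content": {"text": text}}


def send_to_feishu(webhook_url, text, timeout=10):
    """POST a text message to the webhook and return the decoded JSON response."""
    body = json.dumps(build_text_message(text)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real network request, so it is left commented out):
# send_to_feishu("${YOUR-LARK-WEBHOOK-URL}", "Hello from the technical assistant")
```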

If you also need to read Feishu group messages, see Feishu Developer Square - Add Application Capabilities - Robots.

STEP4. Advanced Version [Optional]

The basic version may not perform well enough. Enabling the features below improves answer quality; the more of them you turn on, the better.

Use higher accuracy local LLM

Adjust the llm.local model in config.ini to internlm2-20B. This option has a significant effect, but requires more GPU memory.

Hybrid LLM Service

For LLM services that support the OpenAI interface, HuixiangDou can make use of their long-context ability. Taking kimi as an example, config.ini can be set up as follows:
```
# config.ini
[llm]
enable_local = 1
enable_remote = 1
..
[llm.server]
..
# open https://platform.moonshot.cn/
remote_type = "kimi"
remote_api_key = "YOUR-KIMI-API-KEY"
remote_llm_max_text_length = 128000
remote_llm_model = "moonshot-v1-128k"
```
The ChatGPT API is also supported. Note that this feature increases response time and operating costs.
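
One way to think about a hybrid setup is as a routing decision: short prompts stay on the local model, long ones go to the remote long-context model. The sketch below illustrates that idea only. The routing rule, the `HybridConfig` class, and `local_max_text_length` are assumptions made for this example and are not taken from HuixiangDou's source; only `remote_llm_max_text_length = 128000` comes from the configuration above.

```python
from dataclasses import dataclass


@dataclass
class HybridConfig:
    enable_local: bool = True
    enable_remote: bool = True
    local_max_text_length: int = 3000      # hypothetical local limit
    remote_max_text_length: int = 128000   # mirrors remote_llm_max_text_length


def choose_backend(prompt, cfg):
    """Pick "local" or "remote" by prompt length (illustrative rule only)."""
    length = len(prompt)
    if cfg.enable_local and length <= cfg.local_max_text_length:
        return "local"
    if cfg.enable_remote and length <= cfg.remote_max_text_length:
        return "remote"
    raise ValueError(f"no enabled backend can handle a prompt of length {length}")


cfg = HybridConfig()
print(choose_backend("short question", cfg))
print(choose_backend("x" * 10000, cfg))
```

Keeping short prompts local limits the extra latency and cost to the cases that actually need the long context.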

Repo search enhancement

This feature is suitable for handling difficult questions and requires basic development capabilities to adjust the prompt.

Click sourcegraph-account-access to get a token.

# open https://github.com/sourcegraph/src-cli#installation
sudo curl -L https://sourcegraph.com/.api/src-cli/src_linux_amd64 -o /usr/local/bin/src && sudo chmod +x /usr/local/bin/src

# Enable search and fill the token
[worker]
enable_sg_search = 1
..
[sg_search]
..
src_access_token = "${YOUR_ACCESS_TOKEN}"

Edit the name and introduction of the repo. We take opencompass as an example.

# config.ini
# add your repo here, we just take opencompass and lmdeploy as example
[sg_search.opencompass]
github_repo_id = "open-compass/opencompass"
introduction = "Used for evaluating large language models (LLM) .."

Use python3 -m huixiangdou.service.sg_search as a unit test. The returned content should include opencompass source code and documentation.

python3 -m huixiangdou.service.sg_search
..
"filepath": "opencompass/datasets/longbench/longbench_trivia_qa.py",
"content": "from datasets import Dataset..

Run main.py; HuixiangDou will enable search enhancement when appropriate.

Tune Parameters

Parameters usually need to be tuned for your specific business scenario.
- Refer to data.json to add real data, run test_intention_prompt.py to get suitable prompts and thresholds, and update them in worker.py.
- Adjust the number of search results based on the maximum text length supported by the model.
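
Choosing a threshold from labelled data can be sketched as a simple sweep: try each observed score as the cut-off and keep the one that classifies the most examples correctly. This is an illustration of the general idea, not the logic of test_intention_prompt.py; `best_threshold` and the sample scores are hypothetical.

```python
def best_threshold(scored_examples):
    """Return (threshold, accuracy) maximizing accuracy on (score, should_answer) pairs.

    A query is predicted as "answer" when its score is >= threshold.
    """
    if not scored_examples:
        raise ValueError("need at least one labelled example")
    best_t, best_correct = None, -1
    for t in sorted({score for score, _ in scored_examples}):
        correct = sum((score >= t) == label for score, label in scored_examples)
        if correct > best_correct:  # ties keep the lowest threshold
            best_t, best_correct = t, correct
    return best_t, best_correct / len(scored_examples)


# Hypothetical scores produced by the model for labelled questions.
examples = [(0.9, True), (0.8, True), (0.4, False), (0.2, False)]
print(best_threshold(examples))  # → (0.8, 1.0)
```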

🛠️ FAQ

How to access other IMs?
- WeChat. For Enterprise WeChat, see the Enterprise WeChat Application Development Guide. For personal WeChat, we have confirmed with the WeChat team that there is currently no API; you will need to research a solution yourself.
- DingTalk. Refer to DingTalk Open Platform - Custom Robot Access.

What if the robot is too reticent or too chatty?
- Add the questions that should be answered in your real scenario to resource/good_questions.json, and the ones that should be rejected to resource/bad_questions.json.
- Adjust the topic content in repodir so that the markdown documents in the main repository contain no irrelevant content.

Then re-run service/feature_store.py to update the thresholds and the feature store.
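
Adding questions by hand is easy to get wrong (duplicates, broken JSON). The helper below is a small sketch for appending questions safely. It assumes each file is a JSON array of strings, which you should confirm against your copy of resource/good_questions.json; `add_questions` is a hypothetical name.

```python
import json
from pathlib import Path


def add_questions(path, new_questions):
    """Append non-empty, non-duplicate questions to a JSON array file."""
    path = Path(path)
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    seen = set(existing)
    for question in new_questions:
        question = question.strip()
        if question and question not in seen:
            existing.append(question)
            seen.add(question)
    # ensure_ascii=False keeps Chinese questions readable in the file.
    path.write_text(json.dumps(existing, ensure_ascii=False, indent=2), encoding="utf-8")
    return existing


# Usage (assumed file layout):
# add_questions("resource/bad_questions.json", ["What to eat for lunch today?"])
```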

Launch is normal, but the process runs out of memory at runtime?

Long-text inference with transformers-based LLMs requires more memory. In that case, apply kv cache quantization to the model (see the lmdeploy quantization description), then use docker to deploy the Hybrid LLM Service independently.

How to use another local LLM, and what if the results are poor after switching?
- Open the hybrid llm service and add a new LLM inference implementation.
- Refer to test_intention_prompt and the test data, adjust the prompt and threshold for the new model, and update them in worker.py.

What if the response is too slow or requests always fail?
- Refer to the hybrid llm service to add exponential backoff and retries.
- Replace the local LLM backend with an inference framework such as lmdeploy instead of native huggingface/transformers.
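
Exponential backoff with retries can be sketched as follows. This is a generic illustration, not the code in the hybrid llm service; `call_with_backoff` and its default parameters are assumptions, and `func` stands for whatever function performs the remote LLM request.

```python
import random
import time


def call_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call func(), retrying on any exception with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 10))


# Usage with a hypothetical request function:
# reply = call_with_backoff(lambda: ask_remote_llm(prompt))
```

In practice you would usually catch only the errors that are worth retrying, such as timeouts or rate-limit responses, rather than every exception.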

What if the GPU memory is too low?

In this case the local LLM cannot run, and only a remote LLM can be used together with text2vec to execute the pipeline. Make sure config.ini enables only the remote LLM and turns off the local LLM.

📝 Citation

@misc{2023HuixiangDou,
    title={HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance},
    author={HuixiangDou Contributors},
    howpublished = {\url{https://github.com/internlm/huixiangdou}},
    year={2023}
}

Project details

Release history

This version

0.1.0rc1 pre-release

Jan 14, 2024

0.1.0rc0 pre-release

Jan 14, 2024

Download files

Source Distribution

huixiangdou-0.1.0rc1.tar.gz (44.0 kB)

Uploaded Jan 14, 2024 Source

Built Distribution

huixiangdou-0.1.0rc1-py3-none-any.whl (37.8 kB)

Uploaded Jan 14, 2024 Python 3

File details

Details for the file huixiangdou-0.1.0rc1.tar.gz.

File metadata

Download URL: huixiangdou-0.1.0rc1.tar.gz
Upload date: Jan 14, 2024
Size: 44.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for huixiangdou-0.1.0rc1.tar.gz
Algorithm	Hash digest
SHA256	`de93ab3b872d81bbbe2bd9ef0d145117118ba9d524c1b867875bfac16c6e7e69`
MD5	`bb8f18a0646322a1b3fd680f14a3545d`
BLAKE2b-256	`d60a90fd00703584e6cbe0be5244b12f4d31356de9fa6e9bbdb646308b6ccf85`

File details

Details for the file huixiangdou-0.1.0rc1-py3-none-any.whl.

File metadata

Download URL: huixiangdou-0.1.0rc1-py3-none-any.whl
Upload date: Jan 14, 2024
Size: 37.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for huixiangdou-0.1.0rc1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`554f756f4cd89e6a94d67582fd7c1c918bca0391c7d57073e31edafe11b79e74`
MD5	`531b0e8bd270bd94deeb2cde666740fa`
BLAKE2b-256	`61a324483aca29c1bc05559be93f2ae6c8aff5b71a98aa45ca1e0bc1bfde27c8`