An open source chat bot for voice (and multimodal) assistants

achatbot

achatbot is a factory for creating chat bots with llm, asr, tts, vad, etc.

Project Structure

(figure: project structure diagram)

Features

cmd chat bots:
- local-terminal-chat(be/fe)
- remote-queue-chat(be/fe)
- grpc-terminal-chat(be/fe)
- grpc-speaker
- http fastapi_daily_bot_serve (with chat bots pipeline)
supported transport connectors: pipe (UNIX socket), grpc, queue (redis) (!TODO: websocket, TCP/IP socket)
chat bot processors:
- aggregators (llm user and assistant messages),
- ai_frameworks(langchain rag), (!TODO: llamaindex rag)
- realtime voice inference(RTVI),
- transport:
  - webRTC/webSocket: daily, (!TODO: livekit, cloudflare-calls, etc..)
- ai processor: llm, tts, asr etc..
core module:
- local llm: llama-cpp (!TODO: baby-llm llama2, ~~gundam~~), mlx_lm, transformers, etc.
- api llm: personal-ai (openai-like api, other ai providers)
AI modules:
- functions:
  - search: search, search1, serper
  - weather: openweathermap
- speech:
  - asr: sense_voice_asr, whisper_asr, whisper_timestamped_asr, whisper_faster_asr, whisper_transformers_asr, whisper_mlx_asr, lightning_whisper_mlx_asr(!TODO), whisper_groq_asr
  - audio_stream: daily_room_audio_stream(in/out), pyaudio_stream(in/out)
  - detector: porcupine_wakeword, pyannote_vad, webrtc_vad, silero_vad, webrtc_silero_vad
  - player: stream_player
  - recorder: rms_recorder, wakeword_rms_recorder, vad_recorder, wakeword_vad_recorder
  - tts: tts_chat, tts_coqui, tts_cosy_voice, tts_edge, tts_g
  - vad_analyzer: daily_webrtc_vad_analyzer, silero_vad_analyzer
- vision(!TODO)
generate module configs (*.yaml for local/test/prod) from env with a .env file; you can also use HfArgumentParser to parse a module's args from the local command line
deploy to cloud ☁️ serverless:
- vercel (frontend ui pages)
- Cloudflare(frontend ui pages), personal ai workers
- fastapi-daily-chat-bot on cerebrium (provider aws)
- aws lambda + api Gateway
- etc...
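
To illustrate the env-driven module config mentioned in the feature list above, here is a minimal Python sketch that fills a dataclass from prefixed environment variables. It is not achatbot's implementation (the project mentions HfArgumentParser for this); `ASRArgs`, its default values, and `args_from_env` are hypothetical names, with env names modeled on the ASR_TAG, ASR_LANG and ASR_MODEL_NAME_OR_PATH variables used later in this README.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class ASRArgs:
    # Hypothetical module args; the default values are illustrative only.
    tag: str = "whisper_asr"
    lang: str = "zh"
    model_name_or_path: str = "base"


def args_from_env(cls, prefix, env=None):
    """Build `cls` from env vars named <prefix><FIELD_NAME in upper case>."""
    env = os.environ if env is None else env
    kwargs = {}
    for field in fields(cls):
        key = prefix + field.name.upper()
        if key in env:
            # Cast the raw string to the type of the field's default value.
            # This works for str/int/float defaults; bool needs explicit parsing.
            kwargs[field.name] = type(field.default)(env[key])
    return cls(**kwargs)
```

For example, `args_from_env(ASRArgs, "ASR_")` reads ASR_TAG, ASR_LANG and ASR_MODEL_NAME_OR_PATH from the current environment and falls back to the dataclass defaults for anything unset.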

Service Deployment Architecture

UI

vite-react-rtvi-web-voice: rtvi web voice chat bots with different cctv roles, etc.; you can create your own role by changing the system prompt with DailyRTVIGeneralBot
chat-bot-rtvi-web-sandbox: use this web sandbox to test config and actions with DailyRTVIGeneralBot
ui/educator-client: deploy it to cloudflare pages, then access https://educator-client.pages.dev/
ui/web-client-ui: deploy it to cloudflare pages, then access https://chat-client-weedge.pages.dev/

Server

fastapi-daily-chat-bot: deploy fastapi-daily-chat-bot to cerebrium

Install

[!NOTE] python --version >= 3.10

[!TIP] use uv + pip to install the required dependencies quickly, e.g.:
uv pip install achatbot
uv pip install "achatbot[fastapi_bot_server]"

pypi

python3 -m venv .venv_achatbot
source .venv_achatbot/bin/activate
pip install achatbot
# optional-dependencies e.g.
pip install "achatbot[fastapi_bot_server]"

local

git clone https://github.com/ai-bot-pro/chat-bot.git
cd chat-bot
python3 -m venv .venv_achatbot
source .venv_achatbot/bin/activate
bash scripts/pypi_achatbot.sh dev
# optional-dependencies e.g. (set $version to the version of the wheel built in dist/)
pip install "dist/achatbot-${version}-py3-none-any.whl[fastapi_bot_server]"

Run chat bots

Run local chat bots

[!NOTE]
to run from source, replace achatbot with src in the module path; there is no need to set ACHATBOT_PKG=1, e.g.:
TQDM_DISABLE=True \
     python -m src.cmd.local-terminal-chat.generate_audio2audio > log/std_out.log
PyAudio needs system packages installed first, e.g. on ubuntu: apt-get install python3-pyaudio, on macos: brew install portaudio; see: https://pypi.org/project/PyAudio/

llama-cpp-python is installed from the pre-built cpu wheel by default; to use another backend (e.g. cuda), see: https://github.com/abetlen/llama-cpp-python#installation-configuration

pydub requires ffmpeg to be installed; see: https://www.ffmpeg.org/download.html

run pip install "achatbot[local_terminal_chat_bot]" to install the dependencies for the local terminal chat bot;
create the achatbot data dir under $HOME: mkdir -p ~/.achatbot/{log,config,models,records,videos};
cp .env.example .env, then check .env and add key/value env params;

select a model ckpt to download:

vad model ckpt (the default vad model is silero vad)

# vad pyannote segmentation ckpt
huggingface-cli download pyannote/segmentation-3.0  --local-dir ~/.achatbot/models/pyannote/segmentation-3.0 --local-dir-use-symlinks False

asr model ckpt (the default whisper model is the base size)

# asr openai whisper ckpt
wget https://openaipublic.azureedge.net/main/whisper/models/ed3a0b6b1c0edf879ad9b11b1af5a0e6ab5db9205f891f668f8b0e6c6326e34e/base.pt -O ~/.achatbot/models/base.pt

# asr hf openai whisper ckpt for transformers pipeline to load
huggingface-cli download openai/whisper-base  --local-dir ~/.achatbot/models/openai/whisper-base --local-dir-use-symlinks False

# asr hf faster whisper (CTranslate2)
huggingface-cli download Systran/faster-whisper-base  --local-dir ~/.achatbot/models/Systran/faster-whisper-base --local-dir-use-symlinks False

# asr SenseVoice ckpt
huggingface-cli download FunAudioLLM/SenseVoiceSmall  --local-dir ~/.achatbot/models/FunAudioLLM/SenseVoiceSmall --local-dir-use-symlinks False

llm model ckpt (the default llama.cpp (gguf) model is qwen-2 instruct, 1.5B size)

# llm llamacpp Qwen2-Instruct
huggingface-cli download Qwen/Qwen2-1.5B-Instruct-GGUF qwen2-1_5b-instruct-q8_0.gguf  --local-dir ~/.achatbot/models --local-dir-use-symlinks False

# llm llamacpp Qwen1.5-chat
huggingface-cli download Qwen/Qwen1.5-7B-Chat-GGUF qwen1_5-7b-chat-q8_0.gguf  --local-dir ~/.achatbot/models --local-dir-use-symlinks False

# llm llamacpp phi-3-mini-4k-instruct
huggingface-cli download microsoft/Phi-3-mini-4k-instruct-gguf Phi-3-mini-4k-instruct-q4.gguf --local-dir ~/.achatbot/models --local-dir-use-symlinks False

tts model ckpt

# tts chatTTS
huggingface-cli download 2Noise/ChatTTS  --local-dir ~/.achatbot/models/2Noise/ChatTTS --local-dir-use-symlinks False

# tts coquiTTS
huggingface-cli download coqui/XTTS-v2  --local-dir ~/.achatbot/models/coqui/XTTS-v2 --local-dir-use-symlinks False

# tts cosy voice
git lfs install
git clone https://www.modelscope.cn/iic/CosyVoice-300M.git ~/.achatbot/models/CosyVoice-300M
git clone https://www.modelscope.cn/iic/CosyVoice-300M-SFT.git ~/.achatbot/models/CosyVoice-300M-SFT
git clone https://www.modelscope.cn/iic/CosyVoice-300M-Instruct.git ~/.achatbot/models/CosyVoice-300M-Instruct
#git clone https://www.modelscope.cn/iic/CosyVoice-ttsfrd.git ~/.achatbot/models/CosyVoice-ttsfrd

run the local terminal chat bot with env params, e.g.

use default env params to run the local chat bot

ACHATBOT_PKG=1 TQDM_DISABLE=True \
    python -m achatbot.cmd.local-terminal-chat.generate_audio2audio > ~/.achatbot/log/std_out.log

Run remote http fastapi daily chat bots

run pip install "achatbot[fastapi_daily_bot_server]" to install dependencies to run http fastapi daily chat bot;

run the command below to start the http server; see the api docs at http://0.0.0.0:4321/docs

ACHATBOT_PKG=1 python -m achatbot.cmd.http.server.fastapi_daily_bot_serve

run chat bot processor, e.g.

run a daily langchain rag bot api, with ui/educator-client

[!NOTE] this bot needs its data prepared first: save youtube audio to local files with pytube (run pip install "achatbot[pytube,deep_translator]" to install the dependencies), transcribe/translate it to text, then chunk the text into the vector store before running the langchain rag bot api; run the data processing:
ACHATBOT_PKG=1 python -m achatbot.cmd.bots.rag.data_process.youtube_audio_transcribe_to_tidb
or download the processed data from the hf dataset weege007/youtube_videos, then chunk it into the vector store.

curl -XPOST "http://0.0.0.0:4321/bot_join/chat-bot/DailyLangchainRAGBot" \
 -H "Content-Type: application/json" \
 -d $'{"config":{"llm":{"model":"llama-3.1-70b-versatile","messages":[{"role":"system","content":""}],"language":"zh"},"tts":{"tag":"cartesia_tts_processor","args":{"voice_id":"eda5bbff-1ff1-4886-8ef1-4e69a77640a0","language":"zh"}},"asr":{"tag":"deepgram_asr_processor","args":{"language":"zh","model":"nova-2"}}}}' | jq .

run a simple daily chat bot api, with ui/web-client-ui (default language: zh)

curl -XPOST "http://0.0.0.0:4321/bot_join/DailyBot" \
 -H "Content-Type: application/json" \
 -d '{}' | jq .

Run remote rpc chat bot worker

run pip install "achatbot[remote_rpc_chat_bot_be_worker]" to install the dependencies for the rpc chat bot BE worker, e.g.:
- use default env params to run the rpc chat bot BE worker

ACHATBOT_PKG=1 RUN_OP=be TQDM_DISABLE=True \
    TTS_TAG=tts_edge \
    python -m achatbot.cmd.grpc.terminal-chat.generate_audio2audio > ~/.achatbot/log/be_std_out.log

run pip install "achatbot[remote_rpc_chat_bot_fe]" to install dependencies to run rpc chat bot FE;

ACHATBOT_PKG=1 RUN_OP=fe \
    TTS_TAG=tts_edge \
    python -m achatbot.cmd.grpc.terminal-chat.generate_audio2audio > ~/.achatbot/log/fe_std_out.log

Run remote queue chat bot worker

run pip install "achatbot[remote_queue_chat_bot_be_worker]" to install dependencies to run queue chat bot worker; e.g.:

use default env params to run

ACHATBOT_PKG=1 REDIS_PASSWORD=$redis_pwd RUN_OP=be TQDM_DISABLE=True \
    python -m achatbot.cmd.remote-queue-chat.generate_audio2audio > ~/.achatbot/log/be_std_out.log

sense_voice (asr) -> qwen (llm) -> cosy_voice (tts): you can log in to redislabs and create a free 30M database; set REDIS_HOST, REDIS_PORT and REDIS_PASSWORD to run, e.g.:

 ACHATBOT_PKG=1 RUN_OP=be \
   TQDM_DISABLE=True \
   REDIS_PASSWORD=$redis_pwd \
   REDIS_HOST=redis-14241.c256.us-east-1-2.ec2.redns.redis-cloud.com \
   REDIS_PORT=14241 \
   ASR_TAG=sense_voice_asr \
   ASR_LANG=zn \
   ASR_MODEL_NAME_OR_PATH=~/.achatbot/models/FunAudioLLM/SenseVoiceSmall \
   N_GPU_LAYERS=33 FLASH_ATTN=1 \
   LLM_MODEL_NAME=qwen \
   LLM_MODEL_PATH=~/.achatbot/models/qwen1_5-7b-chat-q8_0.gguf \
   TTS_TAG=tts_cosy_voice \
   python -m achatbot.cmd.remote-queue-chat.generate_audio2audio > ~/.achatbot/log/be_std_out.log

run pip install "achatbot[remote_queue_chat_bot_fe]" to install the required packages for the queue chat bot frontend, e.g.:

use default env params to run (default vad_recorder)

ACHATBOT_PKG=1 RUN_OP=fe \
    REDIS_PASSWORD=$redis_pwd \
    REDIS_HOST=redis-14241.c256.us-east-1-2.ec2.redns.redis-cloud.com \
    REDIS_PORT=14241 \
    python -m achatbot.cmd.remote-queue-chat.generate_audio2audio > ~/.achatbot/log/fe_std_out.log

with wake word

ACHATBOT_PKG=1 RUN_OP=fe \
    REDIS_PASSWORD=$redis_pwd \
    REDIS_HOST=redis-14241.c256.us-east-1-2.ec2.redns.redis-cloud.com \
    REDIS_PORT=14241 \
    RECORDER_TAG=wakeword_rms_recorder \
    python -m achatbot.cmd.remote-queue-chat.generate_audio2audio > ~/.achatbot/log/fe_std_out.log

by default the pyaudio player stream takes its output sample info (rate, channels, ...) from the tts tag, e.g. (the BE uses the tts_cosy_voice output stream info):

ACHATBOT_PKG=1 RUN_OP=fe \
    REDIS_PASSWORD=$redis_pwd \
    REDIS_HOST=redis-14241.c256.us-east-1-2.ec2.redns.redis-cloud.com \
    REDIS_PORT=14241 \
    TTS_TAG=tts_cosy_voice \
    python -m achatbot.cmd.remote-queue-chat.generate_audio2audio > ~/.achatbot/log/fe_std_out.log
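
When the BE and FE point at a hosted Redis instance, as in the examples above, both sides need the same REDIS_HOST, REDIS_PORT and REDIS_PASSWORD. The following small pre-flight check is a sketch with hypothetical helper names; it is not part of achatbot and only inspects the environment before a worker is started.

```python
import os

REQUIRED_REDIS_VARS = ("REDIS_HOST", "REDIS_PORT", "REDIS_PASSWORD")


def missing_redis_env(env=None) -> list[str]:
    """Return the names of required Redis variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_REDIS_VARS if not env.get(name)]


def redis_port_is_valid(env=None) -> bool:
    """Check that REDIS_PORT is a number in the valid TCP port range."""
    env = os.environ if env is None else env
    port = env.get("REDIS_PORT", "")
    return port.isdigit() and 0 < int(port) < 65536
```

Running `missing_redis_env()` before launching the FE or BE gives an early, readable list of what is unset instead of a connection failure later.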

remote_queue_chat_bot_be_worker in colab examples :

sense_voice(asr) -> qwen (llm) -> cosy_voice (tts):

Run remote grpc tts speaker bot

run pip install "achatbot[remote_grpc_tts_server]" to install dependencies to run grpc tts speaker bot server;

ACHATBOT_PKG=1 python -m achatbot.cmd.grpc.speaker.server.serve

run pip install "achatbot[remote_grpc_tts_client]" to install dependencies to run grpc tts speaker bot client;

ACHATBOT_PKG=1 TTS_TAG=tts_edge IS_RELOAD=1 python -m achatbot.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_g IS_RELOAD=1 python -m achatbot.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_coqui IS_RELOAD=1 python -m achatbot.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_chat IS_RELOAD=1 python -m achatbot.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_cosy_voice IS_RELOAD=1 python -m achatbot.cmd.grpc.speaker.client

Multimodal Interaction

audio (voice)

stream-stt (realtime-recorder)
audio-llm (multimode-chat)
stream-tts (realtime-(clone)-speaker)

vision (CV)

stream-ocr (realtime-object-detection)

Embodied Intelligence: Robots that touch the world, perceive and move

Project details

Release history

- 0.0.7.2: Oct 13, 2024
- 0.0.7.1: Sep 27, 2024
- 0.0.7.0: Sep 19, 2024
- 0.0.6.12: Sep 14, 2024
- 0.0.6.11: Sep 10, 2024
- 0.0.6.10: Sep 10, 2024
- 0.0.6.9: Sep 10, 2024
- 0.0.6.8: Sep 9, 2024
- 0.0.6.7: Sep 8, 2024 (this version)
- 0.0.6.6: Sep 7, 2024
- 0.0.6.5: Sep 6, 2024
- 0.0.6.4: Sep 6, 2024
- 0.0.6.3: Sep 6, 2024
- 0.0.6.2: Sep 4, 2024
- 0.0.6.1: Sep 2, 2024
- 0.0.6: Sep 1, 2024
- 0.0.5.28: Aug 26, 2024
- 0.0.5.27: Aug 25, 2024
- 0.0.5.26: Aug 25, 2024
- 0.0.5.25: Aug 25, 2024
- 0.0.5.24: Aug 25, 2024
- 0.0.5.23: Aug 25, 2024
- 0.0.5.22: Aug 25, 2024
- 0.0.5.21: Aug 25, 2024
- 0.0.5.20: Aug 21, 2024
- 0.0.5.19: Aug 21, 2024
- 0.0.5.18: Aug 21, 2024
- 0.0.5.17: Aug 21, 2024
- 0.0.5.16: Aug 21, 2024
- 0.0.5.15: Aug 20, 2024
- 0.0.5.14: Aug 20, 2024
- 0.0.5.13: Aug 20, 2024
- 0.0.5.12: Aug 20, 2024
- 0.0.5.11: Aug 20, 2024
- 0.0.5.10: Aug 20, 2024
- 0.0.5.9: Aug 19, 2024
- 0.0.5.8: Aug 19, 2024
- 0.0.5.7: Aug 19, 2024
- 0.0.5.6: Aug 17, 2024
- 0.0.5.5: Aug 17, 2024
- 0.0.5.4: Aug 16, 2024
- 0.0.5.3: Aug 16, 2024
- 0.0.5.2: Aug 15, 2024
- 0.0.5.1: Aug 15, 2024
- 0.0.4: Aug 14, 2024
- 0.0.0: Aug 14, 2024

Download files

Source Distribution

achatbot-0.0.6.7.tar.gz (351.9 kB)

Uploaded Sep 8, 2024 Source

Built Distribution

achatbot-0.0.6.7-py3-none-any.whl (493.6 kB)

Uploaded Sep 8, 2024 Python 3

Hashes for achatbot-0.0.6.7.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `8e0a48a48bc10dc60edc9db8c8ea91a2deec1e51e454fa675a09cdac1ba3903c` |
| MD5 | `52bbcc77f29b1530dcc0bf76f4ce18e6` |
| BLAKE2b-256 | `daaf812663b96a73ffe135abd2c2b365df21712fad19989efd4610f5df3fa493` |

Hashes for achatbot-0.0.6.7-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `80c7aae298d6ab69575be37ef86a9d1a272e4ce0cdad67edf3d130dbbfc01736` |
| MD5 | `baa7c1a7182adf43d15aff8a02c9c1e2` |
| BLAKE2b-256 | `a429d997eb4fd350e0890a48e473aa29553ec9ac8d71dcd30345b531e38f86cf` |
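
A downloaded file can be checked against the SHA256 digests listed above. This is a minimal sketch using Python's hashlib; the function names are hypothetical, and the file path in the usage note assumes the wheel was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Path, expected: str) -> bool:
    """Compare a file's digest with an expected hex string, ignoring case and whitespace."""
    return sha256_of(path) == expected.strip().lower()
```

For example, `matches_sha256(Path("achatbot-0.0.6.7-py3-none-any.whl"), "80c7aae298d6ab69575be37ef86a9d1a272e4ce0cdad67edf3d130dbbfc01736")` should return True for an intact copy of the wheel.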