Core library for home smart speakers based on large language models (LLM).

These details have not been verified by PyPI

Project links

Project description

Family AI Voice Assistant Core

Project Overview

Family AI Voice Assistant Core is a core library for home smart speakers based on large language models (LLM). This project provides a modular solution to support the development and deployment of home voice assistants. It offers a runtime framework, module interface definitions, and features like Tools Manager, logging, and Telemetry.

Design

The project adopts a modular design aimed at achieving flexible functionality expansion and maintenance. By defining clear interfaces, each module can be developed independently and dynamically loaded and bound through configuration files. The core library coordinates the interaction of each module to ensure stable system operation.

Main Architecture

Module Overview

The clients directory defines abstract interfaces for various functional modules of the smart speaker:

AssistantClient: Core control logic of the smart speaker, coordinating and invoking various modules.
WakerClient: Wake-up detection interface for activating the voice assistant.
GreetingClient: Responsible for generating greetings.
ListeningClient: Listens to user voice input.
RecognitionClient: Speech recognition interface that converts speech to text.
PlaySoundClient: Plays audio files.
LLMClient: Handles natural language generation tasks.
ChatSessionClient: Manages conversation sessions with users, maintaining dialogue state.
SpeechClient: Text-to-speech interface for voice output.
ClientManager: Client manager responsible for registering and retrieving module instances.
HistoryStoreClient: Manages the storage of conversation history.
FileStoreClient: File storage interface for storing audio files to specified locations.

alt text

Other Features

Logging

Provides a unified logging function for easy debugging and system monitoring.

Telemetry

Responsible for monitoring system performance and usage, collecting and analyzing key metrics.

Tools Engine

Manages and registers Tools Calling for LLM, supporting dynamic expansion and invocation.

Configs

Configuration management module that loads and parses configuration information from files, supporting dynamic configuration of clients and functional modules.

Usage Guide

Environment Preparation

Python: Ensure Python 3.9 or above is installed, or create a Python environment using conda.
Install package:

pip install family-ai-voice-assistant-core

Example Code

You can directly use family-ai-voice-assistant-impl, which provides common implementations for each module, or customize your own implementation by referring to this project.

Steps:

Implement the following interfaces:
- VoiceWaker (optional, built-in keyboard wake-up and interactive enter wake-up)
- PlaySoundClient
- RecognitionClient
- LLMClient
- SpeechClient

Example:

from dataclasses import dataclass

from family_ai_voice_assistant.core.clients import VoiceWaker
from family_ai_voice_assistant.core.configs import ConfigManager
from family_ai_voice_assistant.core.configs import Config

@dataclass
class MyWakerConfig(Config):
    api_key: str = None

class MyWaker(VoiceWaker):

    def __init__(self):
        config = ConfigManager().get_instance(MyWakerConfig)
        if config is None:
            raise ValueError("MyWakerConfig is not set.")
        waker = WakerAPI(config.api_key)

    def check(self) -> bool:
        return waker.wake()

[optional] For each newly implemented client, define the corresponding config type. Refer to existing config types /configs. Each config type corresponds to a section of the same name in config.yaml during parsing.
Provide a config.yaml file containing the necessary information for running each module and set the path for config.yaml.

config.yaml

# Other sections

mywaker:
  api_key: xxxxxx

# Other sections

main.py

from family_ai_voice_assistant.core import set_yaml_config_path

set_yaml_config_path("config.yaml")

Use ClientSelector to bind client types and config types. If no config is needed, map to None.

from family_ai_voice_assistant.core.client_register import (
    ClientSelector
)

ClientSelector().map_play_sound_config(None, MyPlaySound)
ClientSelector().map_voice_waker_config(MyVoiceWakerConfig, MyVoiceWaker)
ClientSelector().map_recognition_config(MyRecognitionConfig, MyRecognition) 
ClientSelector().map_llm_config(MyLLMConfig, MyLLM)
ClientSelector().map_speech_config(MySpeechConfig, MySpeech)

Use ClientRegistor to identify which clients need to be instantiated and registered to ClientManager by reading config.yaml.

from family_ai_voice_assistant.core.client_register import (
    ClientRegistor,
    ClientSelector
)

ClientRegistor().register_clients_from_selector()

Run the assistant. The assistant acts as an orchestrator, retrieving each module's instance from ClientManager at runtime and invoking the corresponding interfaces.

assistant = ClientRegistor().get_assistant()
assistant.run()

Below is a simple startup example:

main.py

import argparse
from family_ai_voice_assistant.core import set_yaml_config_path
from family_ai_voice_assistant.core.client_register import (
    ClientRegistor,
    ClientSelector
)

parser = argparse.ArgumentParser(description="Start the Family AI Assistant.")
parser.add_argument('config', type=str, help='the config file path')
args = parser.parse_args()

set_yaml_config_path(args.config)

def map_configs_to_clients():
    ClientSelector().map_play_sound_config(None, MyPlaySound)
    ClientSelector().map_voice_waker_config(MyVoiceWakerConfig, MyVoiceWaker)
    ClientSelector().map_recognition_config(MyRecognitionConfig, MyRecognition) 
    ClientSelector().map_llm_config(MyLLMConfig, MyLLM)
    ClientSelector().map_speech_config(MySpeechConfig, MySpeech)

def main():
    map_configs_to_clients()
    ClientRegistor().register_clients_from_selector()
    assistant = ClientRegistor().get_assistant()
    assistant.run()

if __name__ == "__main__":
    main()

Execute:

python main.py <path to config.yaml>

For a real example, refer to the entry code of family-ai-voice-assistant-impl basic_entry.py.

Other Features

File Server

The command family_ai_voice_assistant_file_server can start a service to receive uploaded files, suitable for deployment on an internal network storage server such as NAS, to receive and store audio files. It can be used with the built-in RestFileStore.

Example:

NAS side, IP: 192.168.1.200

pip install family-ai-voice-assistant-core

family_ai_voice_assistant_file_server --root /home/username/data/assistant --port 5100 &

Assistant side

config.yaml

filestore:
  destination: http://192.168.1.200:5100/files/upload

Assistant API

Two implementations of Assistant are built-in. The default is BasicAssistant. AssistantWithApi extends BasicAssistant by providing an API that allows chatting with the Assistant, making it easy to embed the Assistant into other applications.

Example:

Assistant side, IP: 192.168.1.240

config.yaml

assistantapi:
  port: 10000 # any available port

On the client side, call this API according to the protocol: chat_request.py,

curl -X POST -H "Content-Type: application/json" \
     -d '{"question": "Why is the sky blue?", "speak_answer": false}'  \
     http://192.168.1.240:10000/chat

Or use the built-in interactive command-line client:

pip install family-ai-voice-assistant-core

family_ai_voice_assistant_console --host 192.168.1.240 --port 10000 [--speak]

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.0

Jan 8, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

family_ai_voice_assistant_core-0.1.0.tar.gz (77.0 kB view details)

Uploaded Jan 8, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

family_ai_voice_assistant_core-0.1.0-py3-none-any.whl (92.5 kB view details)

Uploaded Jan 8, 2025 Python 3

File details

Details for the file family_ai_voice_assistant_core-0.1.0.tar.gz.

File metadata

Download URL: family_ai_voice_assistant_core-0.1.0.tar.gz
Upload date: Jan 8, 2025
Size: 77.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.5 CPython/3.11.11 Linux/6.8.0-51-generic

File hashes

Hashes for family_ai_voice_assistant_core-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`b0e55ab84f729cc6fe7e8d6b62daead4bdf2c1e6d5779b6f264c2af11c1616be`
MD5	`2c5e89bd502b4111bb9f32d94156a4a9`
BLAKE2b-256	`f0f509d11dd6a1887583202fabbd0d0b4fd0b65d8dc5f0f10718c48294812f7a`

See more details on using hashes here.

File details

Details for the file family_ai_voice_assistant_core-0.1.0-py3-none-any.whl.

File metadata

Download URL: family_ai_voice_assistant_core-0.1.0-py3-none-any.whl
Upload date: Jan 8, 2025
Size: 92.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.5 CPython/3.11.11 Linux/6.8.0-51-generic

File hashes

Hashes for family_ai_voice_assistant_core-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`141be8f6861ca372e4c1b1d620269a3440c4e76fe6b8ff29ad3ac94335cba0c2`
MD5	`740ad3fbecd6fbf1dd2efb4886f36e17`
BLAKE2b-256	`29c3d0faa75c9b29ebd6233d81e899f3ff2e6160aa26b14554b2f7c2d3ae6bed`

See more details on using hashes here.

family-ai-voice-assistant-core 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Family AI Voice Assistant Core

Project Overview

Design

Main Architecture

Module Overview

Other Features

Logging

Telemetry

Tools Engine

Configs

Usage Guide

Environment Preparation

Example Code

Other Features

File Server

Assistant API

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes