Package to speak with OpenAI's GPT-3 model
ChatGPT Voice Assistant
A simple interface to the OpenAI ChatGPT model that takes input through speech-to-text and replies through text-to-speech. chatgpt-voice-assistant uses Google Translate's free text-to-speech API for audio input and output (not OpenAI Whisper).
Setup
Mac Prerequisites
Install dependencies:
brew install portaudio
brew link portaudio
Update your pydistutils config file for portaudio usage by running the following:
echo "[build_ext]" >> $HOME/.pydistutils.cfg
echo "include_dirs="`brew --prefix portaudio`"/include/" >> $HOME/.pydistutils.cfg
echo "library_dirs="`brew --prefix portaudio`"/lib/" >> $HOME/.pydistutils.cfg
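For readers who prefer to script this step, the three echo commands above can be sketched in Python. This is a minimal sketch, not part of the package: the function names render_build_ext and append_portaudio_config are hypothetical, and the code assumes brew is on your PATH.

```python
import subprocess
from pathlib import Path


def render_build_ext(prefix: str) -> str:
    """Return the [build_ext] section text for a given portaudio prefix."""
    return (
        "[build_ext]\n"
        f"include_dirs={prefix}/include/\n"
        f"library_dirs={prefix}/lib/\n"
    )


def append_portaudio_config(cfg_path: Path = Path.home() / ".pydistutils.cfg") -> None:
    """Append the section to the config file, as the echo commands do."""
    # Ask Homebrew where portaudio is installed (assumes brew is available).
    prefix = subprocess.run(
        ["brew", "--prefix", "portaudio"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout.strip()
    # Open in append mode so existing settings in the file are preserved.
    with cfg_path.open("a") as cfg:
        cfg.write(render_build_ext(prefix))
```

Calling append_portaudio_config() once has the same effect as running the three echo commands once. Running it repeatedly appends duplicate sections, just as the shell version would.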
General Setup
Optionally create a new Python environment and activate it:
# create a new environment in the current directory called env
python3 -m venv env
# activate the environment
source env/bin/activate
Finally, run the following to install all required Python packages and the chatgpt_voice_assistant package in editable mode:
pip install -e .
To install the bash command chatgpt-assist, run:
pip install .
Running the Script
Either set the OPENAI_API_KEY environment variable before running the script, or pass your secret key to the script as in the example below:
# explicitly
python chatgpt_voice_assistant/main.py --log-level=INFO --open-ai-key=<OPENAI API SECRET KEY HERE>
Or run with the installed bash command:
chatgpt-assist --log-level=INFO --open-ai-key=<OPENAI API SECRET KEY HERE>
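The two ways of supplying the key can be illustrated with a short argparse sketch. This is not the package's actual code: the helper name resolve_api_key is hypothetical, and giving the command-line flag precedence over the environment variable is an assumption.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def resolve_api_key(argv: Sequence[str],
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from --open-ai-key or OPENAI_API_KEY."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="chatgpt-assist")
    parser.add_argument("--open-ai-key", default=None)
    parser.add_argument("--log-level", default="INFO")
    args = parser.parse_args(argv)
    # Assumption: an explicit flag wins over the environment variable.
    key = args.open_ai_key or env.get("OPENAI_API_KEY")
    if not key:
        raise ValueError("Pass --open-ai-key or set OPENAI_API_KEY")
    return key
```

Passing env explicitly keeps the function easy to test. In normal use you would call resolve_api_key(sys.argv[1:]) and let it read os.environ.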
Start speaking and turn up your volume to hear the AI assistant respond.
Say the word "exit" or hit Ctrl+C in your terminal to stop the application.
Options
Below is the help menu from the chatgpt-assist CLI detailing all available options:
-h, --help
show this help message and exit
--log-level LOG_LEVEL
Whether to print at the debug level or not.
--input-device-name INPUT_DEVICE_NAME
The input device name.
--lang LANG
The language to listen for when running speech to text (ex. en or fr).
--max-tokens MAX_TOKENS
Max OpenAI completion tokens to use for text generation.
--tld TLD
Top level domain (ex. com or com.au).
--safe-word SAFE_WORD
Word to speak to exit the application.
--wake-word WAKE_WORD
(Optional) Word to trigger a response.
--open-ai-key OPEN_AI_KEY
Required. Open AI Secret Key (or set OPENAI_API_KEY environment variable)
--tts {apple,google}
Choose a text-to-speech engine ('apple' (say) or 'google' (gtts), defaults to 'google')
--speech-rate SPEECH_RATE
The rate at which to play speech. 1.0=normal
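To make the --safe-word and --wake-word options concrete, here is a minimal sketch of how a transcribed phrase might be routed. The function classify_transcript and its return values are hypothetical and may not match how the package implements this. The sketch assumes whole-word, case-insensitive matching.

```python
import re
from typing import Optional


def classify_transcript(text: str,
                        safe_word: str = "exit",
                        wake_word: Optional[str] = None) -> str:
    """Return 'exit', 'ignore', or 'respond' for a transcribed phrase."""
    # Split into lowercase words, dropping punctuation such as commas.
    words = re.findall(r"[\w']+", text.lower())
    if safe_word.lower() in words:
        return "exit"
    # With a wake word configured, phrases that lack it are skipped.
    if wake_word is not None and wake_word.lower() not in words:
        return "ignore"
    return "respond"
```

With no wake word set, every phrase other than the safe word gets a response, which matches the description of --wake-word as optional.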
Specifying an Output Language Accent
Specify both the LANGUAGE and TOP_LEVEL_DOMAIN values (through the --lang and --tld options) to override the default, English (United States):
python chatgpt_voice_assistant/main.py --open-ai-key=<OPENAI_KEY> --lang=en --tld=com
Language Examples
- English (United States) DEFAULT
LANGUAGE=en TOP_LEVEL_DOMAIN=com
- English (Australia)
LANGUAGE=en TOP_LEVEL_DOMAIN=com.au
- English (India)
LANGUAGE=en TOP_LEVEL_DOMAIN=co.in
- French (France)
LANGUAGE=fr TOP_LEVEL_DOMAIN=fr
See the Localized 'accents' section of the gTTS docs for more information.
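The language examples above map directly onto the lang and tld parameters of gTTS. The sketch below is illustrative only: the ACCENTS table and the helpers accent_kwargs and save_speech are hypothetical names, and save_speech assumes the third-party gTTS package is installed and that you have network access.

```python
# Accent table built from the language examples in this README.
ACCENTS = {
    "English (United States)": {"lang": "en", "tld": "com"},
    "English (Australia)": {"lang": "en", "tld": "com.au"},
    "English (India)": {"lang": "en", "tld": "co.in"},
    "French (France)": {"lang": "fr", "tld": "fr"},
}


def accent_kwargs(name: str = "English (United States)") -> dict:
    """Return a copy of the lang/tld pair for a named accent."""
    return dict(ACCENTS[name])


def save_speech(text: str, accent: str, path: str) -> None:
    """Synthesize text with the chosen accent and save it as an MP3."""
    # Imported lazily so the table can be used without gTTS installed.
    from gtts import gTTS

    gTTS(text=text, **accent_kwargs(accent)).save(path)
```

For example, save_speech("G'day", "English (Australia)", "hello.mp3") would request Australian-accented English.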
Hashes for chatgpt_voice_assistant-1.3.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | b10c0bd3823d3a1b3b19da1052af6b53e95c25e4a83ba7df92c10549de0c69a2
MD5 | 7f72c1d837a3a49f9fbd1287cdfa02ec
BLAKE2b-256 | 565c28a79d0a113270026157e25ce3c7390e1dd89085edf253fe37c5960e41ec