Package to speak with OpenAI's GPT-3 model
ChatGPT Voice Assistant
A simple interface to the OpenAI ChatGPT model with speech-to-text for input and text-to-speech for output. chatgpt-voice-assistant relies on free Google speech services for audio input and output (Google Translate's text-to-speech API for the spoken output), not OpenAI Whisper.
Setup
Mac Prerequisites
Install dependencies:
brew install portaudio
brew link portaudio
Update your pydistutils config file for portaudio usage by running the following:
echo "[build_ext]" >> $HOME/.pydistutils.cfg
echo "include_dirs="`brew --prefix portaudio`"/include/" >> $HOME/.pydistutils.cfg
echo "library_dirs="`brew --prefix portaudio`"/lib/" >> $HOME/.pydistutils.cfg
General Setup
Optionally create a new Python environment and activate it:
# create a new environment in the current directory called env
python3 -m venv env
# activate the environment
source env/bin/activate
Finally, run the following to install all required Python packages and the chatgpt_voice_assistant package in editable mode:
pip install -e .
To install the bash command chatgpt-assist, run pip install .
Running the Script
Either set the OPENAI_API_KEY environment variable before running the script, or pass your secret key to the script as in the example below:
# pass the key explicitly
python chatgpt_voice_assistant/main.py --log-level=INFO --open-ai-key=<OPENAI API SECRET KEY HERE>
or run with the installed bash command:
chatgpt-assist --log-level=INFO --open-ai-key=<OPENAI API SECRET KEY HERE>
Start speaking and turn up your volume to hear the AI assistant respond.
Say the word "exit" or hit Ctrl+C in your terminal to stop the application.
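The safe word, together with the optional wake word described under Options, implies a small decision for every transcribed phrase: stop, answer, or stay silent. The sketch below illustrates that idea only and is not the package's actual code. The function name `next_action`, the three return values, and the case-insensitive whole-word matching are all assumptions.

```python
import re
from typing import Optional


def next_action(transcript: str, safe_word: str = "exit",
                wake_word: Optional[str] = None) -> str:
    """Return "exit", "respond", or "ignore" for one transcribed phrase.

    Hypothetical helper: the real application may match words differently.
    """
    # Compare lowercase whole words so "Exit." still matches "exit".
    words = re.findall(r"[\w']+", transcript.lower())
    if not words:
        return "ignore"
    if safe_word.lower() in words:
        return "exit"
    # With no wake word configured, every phrase gets a response.
    if wake_word is None or wake_word.lower() in words:
        return "respond"
    return "ignore"


if __name__ == "__main__":
    print(next_action("Exit."))
    print(next_action("what is the weather", wake_word="computer"))
    print(next_action("computer, what is the weather", wake_word="computer"))
```

Checking the safe word before the wake word means the application can always be stopped by voice, even when a wake word is configured.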
Options
Below is the help menu from the chatgpt-assist CLI detailing all available options:
-h, --help
    show this help message and exit
--log-level LOG_LEVEL
    Whether to print at the debug level or not.
--input-device-name INPUT_DEVICE_NAME
    The input device name.
--lang LANG
    The language to listen for when running speech to text (ex. en or fr).
--max-tokens MAX_TOKENS
    Max OpenAI completion tokens to use for text generation.
--tld TLD
    Top level domain (ex. com or com.au).
--safe-word SAFE_WORD
    Word to speak to exit the application.
--wake-word WAKE_WORD
    (Optional) Word to trigger a response.
--open-ai-key OPEN_AI_KEY
    Required. Open AI Secret Key (or set OPENAI_API_KEY environment variable)
--tts {apple,google}
    Choose a text-to-speech engine ('apple' (say) or 'google' (gtts), defaults to 'google')
--speech-rate SPEECH_RATE
    The rate at which to play speech. 1.0=normal
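For readers who want to see how such a command line might be declared, here is a minimal argparse sketch that mirrors the options listed above, including the fallback from --open-ai-key to the OPENAI_API_KEY environment variable. It is not the package's source. The function names `build_parser` and `parse_args` are hypothetical, and the defaults for --log-level, --max-tokens, and --speech-rate are assumptions; only the 'google' default, the "exit" safe word, and the en/com language pair are stated in this README.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Declare the options shown in the help menu (defaults partly assumed)."""
    parser = argparse.ArgumentParser(prog="chatgpt-assist")
    parser.add_argument("--log-level", default="INFO")         # assumed default
    parser.add_argument("--input-device-name", default=None)
    parser.add_argument("--lang", default="en")
    parser.add_argument("--max-tokens", type=int, default=None)  # default unknown
    parser.add_argument("--tld", default="com")
    parser.add_argument("--safe-word", default="exit")
    parser.add_argument("--wake-word", default=None)
    parser.add_argument("--open-ai-key", default=None)
    parser.add_argument("--tts", choices=["apple", "google"], default="google")
    parser.add_argument("--speech-rate", type=float, default=1.0)  # assumed default
    return parser


def parse_args(argv: Optional[Sequence[str]] = None,
               environ: Optional[Mapping[str, str]] = None) -> argparse.Namespace:
    """Parse argv, falling back to OPENAI_API_KEY when no key flag is given."""
    environ = os.environ if environ is None else environ
    args = build_parser().parse_args(argv)
    if not args.open_ai_key:
        args.open_ai_key = environ.get("OPENAI_API_KEY")
    if not args.open_ai_key:
        raise SystemExit("Set OPENAI_API_KEY or pass --open-ai-key")
    return args


if __name__ == "__main__":
    # Demo with a placeholder key supplied through a fake environment.
    demo = parse_args(["--lang", "fr", "--tld", "fr"],
                      environ={"OPENAI_API_KEY": "placeholder"})
    print(demo.lang, demo.tld, demo.tts)
```

Passing `environ` as a parameter keeps the key lookup testable without touching the real process environment.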
Specifying an Output Language Accent
Specify both the LANGUAGE and TOP_LEVEL_DOMAIN values (the --lang and --tld options) to override the default, English (United States):
python chatgpt_voice_assistant/main.py --open-ai-key=<OPENAI_KEY> --lang=en --tld=com
Language Examples
- English (United States), the default: LANGUAGE=en TOP_LEVEL_DOMAIN=com
- English (Australia): LANGUAGE=en TOP_LEVEL_DOMAIN=com.au
- English (India): LANGUAGE=en TOP_LEVEL_DOMAIN=co.in
- French (France): LANGUAGE=fr TOP_LEVEL_DOMAIN=fr
See Localized 'accents' section on gTTS docs for more information
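As a rough illustration of how a language and top-level-domain pair selects an accent, the sketch below collects the four documented pairs in a lookup table and passes one of them to gTTS. The `lang` and `tld` keyword arguments and the `save` method belong to gTTS; the names `Accent`, `ACCENTS`, `cli_flags`, and `save_speech` are hypothetical and do not come from this package. The gTTS import is deferred so the table can be used without the library installed.

```python
from typing import Dict, NamedTuple


class Accent(NamedTuple):
    lang: str  # value for --lang (LANGUAGE)
    tld: str   # value for --tld (TOP_LEVEL_DOMAIN)


# The four pairs documented in the language examples.
ACCENTS: Dict[str, Accent] = {
    "English (United States)": Accent("en", "com"),
    "English (Australia)": Accent("en", "com.au"),
    "English (India)": Accent("en", "co.in"),
    "French (France)": Accent("fr", "fr"),
}


def cli_flags(name: str) -> str:
    """Return the command-line flags that select the named accent."""
    accent = ACCENTS[name]
    return f"--lang={accent.lang} --tld={accent.tld}"


def save_speech(text: str, name: str, path: str) -> None:
    """Write `text` to an MP3 file at `path` with gTTS (needs network access)."""
    from gtts import gTTS  # third-party (pip install gTTS); imported lazily

    accent = ACCENTS[name]
    gTTS(text=text, lang=accent.lang, tld=accent.tld).save(path)


if __name__ == "__main__":
    for name in ACCENTS:
        print(f"{name}: {cli_flags(name)}")
```

For example, `cli_flags("English (Australia)")` yields the flags to append to the run command for an Australian accent.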
File details
Details for the file chatgpt_voice_assistant-1.3.0-py3-none-any.whl.
File metadata
- Download URL: chatgpt_voice_assistant-1.3.0-py3-none-any.whl
- Upload date:
- Size: 22.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | b10c0bd3823d3a1b3b19da1052af6b53e95c25e4a83ba7df92c10549de0c69a2
MD5 | 7f72c1d837a3a49f9fbd1287cdfa02ec
BLAKE2b-256 | 565c28a79d0a113270026157e25ce3c7390e1dd89085edf253fe37c5960e41ec