Package to speak with OpenAI's GPT-3 model

ChatGPT Voice Assistant

A simple interface to the OpenAI ChatGPT model that takes speech-to-text as input and plays the reply through text-to-speech. chatgpt-voice-assistant relies on Google's free speech services in both directions (speech recognition for input, Google Translate's text-to-speech for output) rather than OpenAI Whisper.

Setup

Mac Prerequisites

Install dependencies:

brew install portaudio
brew link portaudio

Update your pydistutils config file for portaudio usage by running the following:

echo "[build_ext]" >> "$HOME/.pydistutils.cfg"
echo "include_dirs=$(brew --prefix portaudio)/include/" >> "$HOME/.pydistutils.cfg"
echo "library_dirs=$(brew --prefix portaudio)/lib/" >> "$HOME/.pydistutils.cfg"
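
Because the commands above append to the file, running them twice leaves a duplicate [build_ext] section. The following is a minimal sketch of an idempotent alternative using Python's standard configparser. The function name build_ext_config and the example prefix are assumptions made for illustration and are not part of this package.

```python
import configparser
import io


def build_ext_config(portaudio_prefix: str, existing: str = "") -> str:
    """Return pydistutils.cfg text whose [build_ext] section points at portaudio.

    `existing` is the current file content (if any), so other settings survive.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(existing)
    if not parser.has_section("build_ext"):
        parser.add_section("build_ext")
    # Same two keys that the echo commands write.
    parser.set("build_ext", "include_dirs", f"{portaudio_prefix}/include/")
    parser.set("build_ext", "library_dirs", f"{portaudio_prefix}/lib/")
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical prefix; use the output of `brew --prefix portaudio` instead.
    print(build_ext_config("/opt/homebrew/opt/portaudio"))
```

Writing the returned text to $HOME/.pydistutils.cfg gives the same two settings as the shell commands, and passing the file's current content as `existing` keeps repeated runs from duplicating the section.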

General Setup

Optionally create a new Python environment and activate it:

# create a new environment in the current directory called env
python3 -m venv env

# activate the environment
source env/bin/activate

Finally, run the following to install all required Python packages and the chatgpt_voice_assistant package in editable mode:

pip install -e .

To install the chatgpt-assist command-line tool as a regular (non-editable) install, run pip install . (the trailing dot refers to the current directory).

Running the Script

Either set the OPENAI_API_KEY environment variable before running the script, or pass your secret key on the command line as in the example below:

# pass the key explicitly
python chatgpt_voice_assistant/main.py --log-level=INFO --open-ai-key=<OPENAI API SECRET KEY HERE>

Or run the installed chatgpt-assist command:

chatgpt-assist --log-level=INFO --open-ai-key=<OPENAI API SECRET KEY HERE>
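
The two ways of supplying the key can be pictured with the sketch below. It is not the package's actual code: the function name resolve_api_key is hypothetical, and the assumption that the command-line flag takes precedence over the environment variable is mine.

```python
import argparse
from typing import Mapping, Sequence


def resolve_api_key(argv: Sequence[str], env: Mapping[str, str]) -> str:
    """Pick the OpenAI key from --open-ai-key, falling back to OPENAI_API_KEY."""
    parser = argparse.ArgumentParser(prog="chatgpt-assist")
    parser.add_argument("--open-ai-key", default=None)
    parser.add_argument("--log-level", default="INFO")
    args = parser.parse_args(argv)
    # Assumed precedence: an explicit flag wins over the environment.
    key = args.open_ai_key or env.get("OPENAI_API_KEY")
    if not key:
        # Prints a usage message and exits with status 2.
        parser.error("pass --open-ai-key or set OPENAI_API_KEY")
    return key


if __name__ == "__main__":
    import os
    import sys

    resolve_api_key(sys.argv[1:], os.environ)
    print("OpenAI key found")
```

Setting the environment variable keeps the key out of your shell history, which is why it is listed first above.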

Start speaking and turn up your volume to hear the AI assistant respond.

Say the word "exit" or hit Ctrl+C in your terminal to stop the application.
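
The listen, ask, speak cycle and the two ways of stopping it can be sketched roughly as follows. This is an illustration only: run_assistant and its parameters are hypothetical names, the speech and model calls are passed in as plain functions, and matching the whole utterance against the safe word is an assumption about how the check works.

```python
from typing import Callable, Iterable


def run_assistant(
    heard: Iterable[str],
    ask_model: Callable[[str], str],
    speak: Callable[[str], None],
    safe_word: str = "exit",
) -> None:
    """Answer each recognized utterance until the safe word or Ctrl+C."""
    try:
        for utterance in heard:
            # Assumed rule: the utterance must equal the safe word, ignoring case.
            if utterance.strip().lower() == safe_word.lower():
                break
            speak(ask_model(utterance))
    except KeyboardInterrupt:
        # Ctrl+C in the terminal also ends the session quietly.
        pass


if __name__ == "__main__":
    # Stand-ins for the microphone, the model, and the speaker.
    run_assistant(["hello there", "exit"], lambda text: f"You said: {text}", print)
```

In the real application the utterances would come from the microphone and the replies would be played as audio; here they are an in-memory list and print.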

Options

Below is the help menu from the chatgpt-assist CLI detailing all available options:

-h, --help
    show this help message and exit

--log-level LOG_LEVEL
    Whether to print at the debug level or not.

--input-device-name INPUT_DEVICE_NAME
    The input device name.

--lang LANG
    The language to listen for when running speech to text (ex. en or fr).

--max-tokens MAX_TOKENS
    Max OpenAI completion tokens to use for text generation.

--tld TLD
    Top level domain (ex. com or com.au).

--safe-word SAFE_WORD
    Word to speak to exit the application.

--wake-word WAKE_WORD
    (Optional) Word to trigger a response.

--open-ai-key OPEN_AI_KEY
    Required. Open AI Secret Key (or set OPENAI_API_KEY environment variable)

--tts {apple,google}
    Choose a text-to-speech engine ('apple' (say) or 'google' (gtts), defaults to 'google')

--speech-rate SPEECH_RATE
    The rate at which to play speech. 1.0=normal
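
As an illustration of the optional --wake-word setting, the sketch below shows one way such a filter might behave. The function name should_respond is hypothetical, and the rule that the wake word must appear as a separate word anywhere in the utterance is an assumption rather than the documented behavior.

```python
from typing import Optional


def should_respond(utterance: str, wake_word: Optional[str]) -> bool:
    """Return True when the assistant should answer this utterance."""
    if not wake_word:
        # No wake word configured: every utterance gets a response.
        return True
    # Assumed rule: case-insensitive whole-word match.
    return wake_word.lower() in utterance.lower().split()


if __name__ == "__main__":
    print(should_respond("hey robot what time is it", "robot"))
    print(should_respond("what time is it", "robot"))
```

A whole-word match avoids triggering on longer words that merely contain the wake word, at the cost of missing it when the transcript attaches punctuation to it.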

Specifying an Output Language Accent

Pass both --lang (LANGUAGE) and --tld (TOP_LEVEL_DOMAIN) to override the default, English (United States):

python chatgpt_voice_assistant/main.py --open-ai-key=<OPENAI_KEY> --lang=en --tld=com

Language Examples

  • English (United States) DEFAULT
    • LANGUAGE=en TOP_LEVEL_DOMAIN=com
  • English (Australia)
    • LANGUAGE=en TOP_LEVEL_DOMAIN=com.au
  • English (India)
    • LANGUAGE=en TOP_LEVEL_DOMAIN=co.in
  • French (France)
    • LANGUAGE=fr TOP_LEVEL_DOMAIN=fr

See the Localized 'accents' section of the gTTS docs for more information.
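
The language examples above can be kept as a small lookup table. The sketch below turns an accent name into the matching --lang and --tld flags, and shows how the same pair could be handed to gTTS directly. ACCENTS, accent_flags, and synthesize are hypothetical names for illustration; the gTTS import is done lazily so the table works without the library, and synthesize needs network access.

```python
from typing import Dict, List, Tuple

# (LANGUAGE, TOP_LEVEL_DOMAIN) pairs from the list above.
ACCENTS: Dict[str, Tuple[str, str]] = {
    "English (United States)": ("en", "com"),
    "English (Australia)": ("en", "com.au"),
    "English (India)": ("en", "co.in"),
    "French (France)": ("fr", "fr"),
}


def accent_flags(name: str) -> List[str]:
    """Build the chatgpt-assist flags for a named accent."""
    lang, tld = ACCENTS[name]
    return [f"--lang={lang}", f"--tld={tld}"]


def synthesize(text: str, name: str, path: str) -> None:
    """Save `text` as an MP3 spoken in the named accent (requires gTTS)."""
    from gtts import gTTS  # third-party; imported only when actually used

    lang, tld = ACCENTS[name]
    gTTS(text=text, lang=lang, tld=tld).save(path)


if __name__ == "__main__":
    print(" ".join(accent_flags("English (Australia)")))
```

Only the google text-to-speech engine uses the top-level domain; whether the apple engine honors these settings is not covered here.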

References

SpeechRecognition library docs

Google Translate Text-to-Speech API (gTTS)


Built Distribution

chatgpt_voice_assistant-1.3.0-py3-none-any.whl (22.2 kB)
