CORA
Description:
Python project for the development of a Conversation Optimized Robot Assistant (CORA). CORA is a voice assistant powered by OpenAI's ChatGPT for both user intent detection and general LLM responses.
This project also uses Amazon AWS's Polly service for voice synthesis, and the SpeechRecognition library, which uses Google's speech-to-text service, for user speech recognition. We also use pydub and simpleaudio to play the audio coming back from the AWS Polly service without having to write any audio files to disk.
Project Dependencies:
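As a rough illustration of the in-memory playback described above, the sketch below requests raw PCM from Polly and hands the bytes straight to simpleaudio via a pydub `AudioSegment`. It is not taken from the CORA source: the function names, the `Joanna` voice, the 16000 Hz sample rate, and the choice of the `pcm` output format are all assumptions made for the example.

```python
def synthesize_pcm(polly_client, text, voice_id="Joanna", sample_rate=16000):
    """Ask Polly for raw PCM audio and return it as bytes (nothing touches the disk).

    `polly_client` is expected to be a boto3 Polly client. The voice and sample
    rate defaults are assumptions for this sketch, not CORA's actual settings.
    """
    response = polly_client.synthesize_speech(
        Text=text,
        OutputFormat="pcm",           # 16-bit signed little-endian mono samples
        VoiceId=voice_id,
        SampleRate=str(sample_rate),  # Polly expects the rate as a string
    )
    return response["AudioStream"].read()


def play_pcm(pcm_bytes, sample_rate=16000):
    """Play raw 16-bit mono PCM bytes from memory."""
    # Imported lazily so the module still imports on machines without audio support.
    import simpleaudio
    from pydub import AudioSegment

    segment = AudioSegment(
        data=pcm_bytes, sample_width=2, frame_rate=sample_rate, channels=1
    )
    playback = simpleaudio.play_buffer(
        segment.raw_data,
        num_channels=segment.channels,
        bytes_per_sample=segment.sample_width,
        sample_rate=segment.frame_rate,
    )
    playback.wait_done()  # block until the audio has finished playing


def speak(text, access_key, secret_key, region):
    """Hypothetical helper tying the two steps together."""
    import boto3

    polly = boto3.client(
        "polly",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        region_name=region,
    )
    play_pcm(synthesize_pcm(polly, text))
```

Requesting PCM rather than MP3 means pydub does not have to decode anything, which is one plausible way to avoid needing ffmpeg for playback.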
- Python 3.11.6
- OpenAI API Key
- AWS Polly Key
- Microsoft Visual C++ 14.0 or greater
- SpeechRecognition
- simpleaudio
- pydub
- boto3
- python-dotenv
- openai
- pyaudio
Road Map:
- Initial text and speech recognition
- Synthesize voice from AWS Polly
- Integration with OpenAI ChatGPT
- Upgrade the OpenAI service to use function calling
- Simple utility functions for logging to the screen
- Simple activation on wake-up words
- Simple speech visualiser using pygame
  - Change visualisation depending on sleeping or not sleeping
  - Display logging output in the visualiser
- Make it easier to set up the project from scratch (use poetry)
- Report daily schedule skill function
- Allow cora to monitor things and report back/notify as events occur (third thread)
- Make unit tests
- Store message history between sessions
- Build and implement ML model for wake-up word detection
- Support for local LLM instead of using chatgpt service
Getting Started:
- Install Python 3.11.6 from: https://www.python.org/downloads/release/python-3116/
  - 3.11.6 is required at the moment because this is the latest version supported by pyaudio
- Clone this repo:
git clone https://github.com/Nixxs/cora.git
- Set up your local .env file in the project root:
AWS_ACCESS_KEY = "[YOUR OWN AWS ACCESS KEY]"
AWS_SECRET_KEY = "[THE CORRESPONDING SECRET KEY]"
AWS_REGION = "[AWS REGION YOU WANT TO USE]"
OPENAI_KEY = "[OPENAI API KEY]"
CHATGPT_MODEL = "gpt-3.5-turbo-0613"
CORA uses the Amazon AWS Polly service for its voice synthesis. To access this service, you will need to generate a key and secret on your AWS account that has access to the Polly service. Define your AWS region here too, along with your OpenAI key and the ChatGPT model you want to use. Make sure the model supports function calling, otherwise CORA's skill functions won't work (at the time of writing, either gpt-3.5-turbo-0613 or gpt-4-0613).
- Install dependencies. Using poetry is easiest:
poetry install
OPTIONAL: pydub generally also needs ffmpeg installed if you want to work with audio file formats or edit the audio at all. This project doesn't require any of that (at least not yet), as we just use simpleaudio to play the stream. However, you will get a warning from pydub on import if you don't have ffmpeg installed.
To cover all bases, you can download ffmpeg and add it to your PATH.
- Then just run the entry script using
poetry run cora
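If startup fails, the `.env` file from the setup steps above is the first thing to check. The following is a minimal sketch of how those values could be loaded and validated with python-dotenv. The key names come from the `.env` example above, but `load_settings` and `REQUIRED_KEYS` are hypothetical names for this illustration, and CORA's own configuration code may look quite different.

```python
import os

# Key names taken from the .env example in this README.
REQUIRED_KEYS = (
    "AWS_ACCESS_KEY",
    "AWS_SECRET_KEY",
    "AWS_REGION",
    "OPENAI_KEY",
    "CHATGPT_MODEL",
)


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing.

    When `env` is None, the .env file is read with python-dotenv and the
    values are taken from the process environment.
    """
    if env is None:
        from dotenv import load_dotenv  # provided by the python-dotenv package

        load_dotenv()  # looks for a .env file and loads it into os.environ
        env = os.environ

    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Failing early with the list of missing keys tends to be easier to debug than an authentication error from AWS or OpenAI later on.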
- How to use CORA:
- The wake word for CORA is "cora". At start up, CORA won't do anything except listen for the wake word.
- If the wake word is detected, CORA will respond.
- You can say "cora" and your query in a single sentence and CORA will both wake up and respond.
- After CORA has awoken, you can continue your conversation until you specifically ask CORA to either go to 'sleep' or 'shut down'.
- In 'sleep' mode, CORA will stop responding until you say the wake word.
- If you ask CORA to 'shut down' at any point, CORA's loops will end gracefully and the program will exit.
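The wake, sleep, and shut down behaviour described above can be pictured as a small state machine. The sketch below is only an illustration: `Mode` and `step` are hypothetical names, and plain keyword matching stands in for whatever intent detection CORA actually performs (the project description suggests ChatGPT handles intent).

```python
import re
from enum import Enum

WAKE_WORD = "cora"


class Mode(Enum):
    SLEEPING = "sleeping"
    AWAKE = "awake"
    SHUT_DOWN = "shut down"


def step(mode, utterance):
    """Return (new_mode, query), where query is the text to answer or None."""
    if mode is Mode.SHUT_DOWN:
        return Mode.SHUT_DOWN, None

    words = re.findall(r"[a-z']+", utterance.lower())

    if mode is Mode.SLEEPING:
        if WAKE_WORD not in words:
            return Mode.SLEEPING, None  # ignore everything but the wake word
        mode = Mode.AWAKE  # wake word heard, so the same sentence is processed

    padded = " " + " ".join(words) + " "
    if " shut down " in padded:
        return Mode.SHUT_DOWN, None
    if "sleep" in words:
        return Mode.SLEEPING, None
    return Mode.AWAKE, utterance
```

For example, `step(Mode.SLEEPING, "Cora, what time is it?")` wakes the assistant and passes the whole sentence on as the query, which matches the single-sentence behaviour listed above.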
Additional Notes:
- Conversations are logged in the cora/logs folder and organised by date
- CORA relies on lots of external services such as Google's speech recognition. Even when sleeping, CORA is sending microphone audio to Google to check whether the wake word was spoken. At some stage we will have a local model to detect this instead, but for now it's all going to Google, so be wary of that.
- Take a look at CORA's skills in the cora_skills.py file and make your own skills that might be relevant to you. A skill is activated when ChatGPT decides the user wants to use it, which gives CORA access to anything you'd want it to do (you just have to write the skill).
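To give a feel for what writing a skill might involve, here is a minimal sketch of a skill, its function calling schema, and a dispatcher. None of it is taken from cora_skills.py: the `celsius_to_fahrenheit` skill, `SKILLS`, `SKILL_SCHEMAS`, and `run_function_call` are hypothetical names. The sketch assumes the 0613 function calling format, where the model replies with a `function_call` containing a `name` and a JSON string of `arguments`.

```python
import json


def celsius_to_fahrenheit(celsius):
    """Hypothetical skill: convert a temperature and return a spoken answer."""
    fahrenheit = celsius * 9 / 5 + 32
    return f"{celsius:g} degrees Celsius is {fahrenheit:g} degrees Fahrenheit"


# Maps the names ChatGPT may ask for to the Python functions that implement them.
SKILLS = {"celsius_to_fahrenheit": celsius_to_fahrenheit}

# Schema describing the skill to the model; a list like this would be sent
# along with the chat request as its function definitions.
SKILL_SCHEMAS = [
    {
        "name": "celsius_to_fahrenheit",
        "description": "Convert a temperature from Celsius to Fahrenheit.",
        "parameters": {
            "type": "object",
            "properties": {
                "celsius": {
                    "type": "number",
                    "description": "Temperature in degrees Celsius",
                }
            },
            "required": ["celsius"],
        },
    }
]


def run_function_call(function_call):
    """Run the skill named in a function_call dict and return its result."""
    name = function_call["name"]
    skill = SKILLS.get(name)
    if skill is None:
        return f"Unknown skill: {name}"
    # The model supplies the arguments as a JSON-encoded string.
    arguments = json.loads(function_call.get("arguments") or "{}")
    return skill(**arguments)
```

Keeping the schema next to the function it describes makes it harder for the two to drift apart as skills are added.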
Local Voices:
An earlier version of the project used local voices. This might still be useful at some stage if we don't want to pay for AWS Polly anymore.