corava

CORA Virtual Assistant

Description:

Python project for development of a Conversation Optimized Robot Assistant (CORA). CORA is a voice assistant powered by OpenAI's ChatGPT for both user intent detection and general LLM responses.

This project also uses Amazon AWS's Polly service for voice synthesis, and the SpeechRecognition library (backed by Google's speech recognition) to recognise user speech. pydub and simpleaudio are used to play the audio coming back from the Polly service without writing any audio files to disk, as sketched below.
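
The playback path described above can be sketched roughly as follows. This is not corava's internal code; the region, voice id and sample rate are assumptions for illustration.

import boto3
import simpleaudio
from pydub import AudioSegment

# Ask Polly for raw PCM so pydub can wrap the bytes without needing ffmpeg.
polly = boto3.client("polly", region_name="ap-southeast-2")
response = polly.synthesize_speech(
    Text="Hello, I am CORA.",
    OutputFormat="pcm",
    SampleRate="16000",
    VoiceId="Joanna",
)

# Wrap the streamed bytes in an AudioSegment entirely in memory.
segment = AudioSegment(
    data=response["AudioStream"].read(),
    sample_width=2,   # Polly PCM is signed 16-bit little-endian mono
    frame_rate=16000,
    channels=1,
)

# Hand the buffer straight to simpleaudio; nothing is written to disk.
playback = simpleaudio.play_buffer(
    segment.raw_data,
    num_channels=segment.channels,
    bytes_per_sample=segment.sample_width,
    sample_rate=segment.frame_rate,
)
playback.wait_done()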

Getting Started:

  1. Install the corava library from pip:
pip install corava
  2. Get your API keys and set up a .env file, or just feed them into the config if you want. Here is an example using .env:
from corava import cora
from dotenv import load_dotenv
import os

load_dotenv() # take environment variables from .env.

def main():
    config = {
        "AWS_ACCESS_KEY" : os.getenv('AWS_ACCESS_KEY'),
        "AWS_SECRET_KEY" : os.getenv('AWS_SECRET_KEY'),
        "AWS_REGION" : os.getenv('AWS_REGION'),
        "OPENAI_KEY" : os.getenv('OPENAI_KEY'),
        "CHATGPT_MODEL" : os.getenv('CHATGPT_MODEL')
    }
    conversation_history = cora.start(config)
    print(conversation_history)

if __name__ == "__main__":
    main()

Project Dependencies:

  • Python 3.11.6
  • OpenAI API Key
  • AWS Polly Key
  • Microsoft Visual C++ 14.0 or greater
  • SpeechRecognition
  • simpleaudio
  • pydub
  • boto3
  • python-dotenv
  • openai
  • pyaudio

Road Map (Core):

  • Initial text and speech recognition
  • Synthesize voice from AWS Polly
  • Integration with OpenAI ChatGPT
  • Upgrade the OpenAI service to use function calling
  • Simple utility functions for logging to the screen
  • Simple activation on wake-up words
  • Update skills to support parallel function calling
  • Simple speech visualiser using pygame
  • Change visualisation depending on whether CORA is sleeping or awake
  • Display logging output in the visualiser
  • Make it easier to set up the project from scratch (use poetry)
  • Set up the project so it can be used from PyPI
  • Manage the conversation history better to work more efficiently with the token limit
  • Allow CORA to monitor things and report back/notify as events occur (third thread)
  • Remember message history between sessions
  • Build and implement an ML model for wake-up word detection
  • Support for a local LLM instead of using the ChatGPT service

Road Map (Active Skills):

  • Report daily outlook calendar schedule
  • Make the weather function call actually work
  • Report latest most relevant news for a given location
  • Play YouTube music (have a look at what's available in the YouTube APIs)
  • Open YouTube videos (have a look at what's available in the YouTube APIs)
  • Look up information using Google Maps (directions, distance to)
  • Generate an image and open it (OpenAI DALL-E image API)

Road Map (Monitoring Skills):

  • Monitor calendar and notify of next meeting

Setting up your dev environment:

  1. Install Python 3.11.6 from: https://www.python.org/downloads/release/python-3116/

    • 3.11.6 is required at the moment because this is the latest version supported by pyaudio
  2. Clone this repo:

git clone https://github.com/Nixxs/cora.git
  3. Set up your local .env file in the project root:
AWS_ACCESS_KEY = "[YOUR OWN AWS ACCESS KEY]"
AWS_SECRET_KEY = "[THE CORRESPONDING SECRET KEY]"
AWS_REGION = "[AWS REGION YOU WANT TO USE]"
OPENAI_KEY = "[OPENAI API KEY]"
CHATGPT_MODEL = "gpt-3.5-turbo-0613"

CORA uses the Amazon AWS Polly service for its voice synthesis. To access this service, you will need to generate a key and secret on your Amazon AWS account with access to Polly. You also define your AWS region here, as well as your OpenAI key and the ChatGPT model you want to use. Make sure the model supports function calling, otherwise CORA's skill functions won't work (at the time of writing, either gpt-3.5-turbo-0613 or gpt-4-0613).

  4. Install dependencies (using Poetry is easiest):
poetry install

OPTIONAL: pydub generally also needs ffmpeg installed if you want to do anything with audio file formats or edit the audio. This project doesn't require any of that (at least not yet), as we just use simpleaudio to play the stream; however, you will get a warning from pydub on import if you don't have ffmpeg installed.

You can download ffmpeg to cover all bases; you will also need to add it to your PATH.

  5. Then just run the entry script using:
poetry run cora

How to use CORA:

  • The wake word for CORA is "cora". At start-up, CORA won't do anything except listen for the wake word.
  • If the wake word is detected, CORA will respond.
    • You can say 'cora' and your query in a single sentence and CORA will both wake up and respond.
  • After CORA has woken, you can continue your conversation until you specifically ask CORA to either go to 'sleep' or 'shut down' (see the sketch after this list).
    • In 'sleep' mode, CORA will stop responding until you say the wake word.
    • If you ask CORA to 'shut down' at any point, CORA's loops will end gracefully and the program will exit.
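
A simplified sketch of this listen/wake/sleep/shut-down loop, assuming the SpeechRecognition library's Google recogniser and a hypothetical handle_query() helper (corava's real loop also drives Polly playback, ChatGPT calls and the visualiser):

import speech_recognition as sr

recognizer = sr.Recognizer()
awake = False

while True:
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        heard = recognizer.recognize_google(audio).lower()
    except sr.UnknownValueError:
        continue  # nothing intelligible was heard, keep listening

    if "shut down" in heard:
        break            # end the loop gracefully and exit
    if "sleep" in heard:
        awake = False    # stop responding until the wake word is heard again
        continue
    if "cora" in heard:
        awake = True     # the wake word can arrive together with the query

    if awake:
        handle_query(heard)  # hypothetical: send the query to ChatGPT and speak the reply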

Additional Notes:

  • Conversations are logged in the cora/logs folder and organised by date
  • CORA relies on lots of external services, such as Google's speech recognition. Even when sleeping, CORA is sending microphone audio to Google to check whether the wake word was detected. At some stage there will be a local model to detect this instead, but for now it all goes to Google, so be wary of that.
  • Take a look at CORA's skills in the cora_skills.py file and make your own skills that are relevant to you. Skills are activated when ChatGPT thinks the user wants to use one of them, and they give CORA access to anything you'd want to do (you just have to write the skill). A generic sketch of a skill exposed through function calling is shown below.
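
A generic sketch (not corava's actual cora_skills.py interface) of how a skill can be exposed to ChatGPT through function calling; the get_weather function and its parameters are made up for illustration:

import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_KEY"))

# Describe the skill so the model knows when (and how) to call it.
skill_definitions = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"},
            },
            "required": ["location"],
        },
    }
]

response = client.chat.completions.create(
    model="gpt-3.5-turbo-0613",
    messages=[{"role": "user", "content": "What's the weather like in Perth?"}],
    functions=skill_definitions,  # the model decides when a skill applies
)

message = response.choices[0].message
if message.function_call:
    # ChatGPT decided the user wants this skill; run the matching Python function here.
    print(message.function_call.name, message.function_call.arguments)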

Local Voices:

In an earlier version of the project we were using local voices. At some stage this might still be useful if we don't want to pay for AWS Polly anymore.
