CORA

Description:

Python project for the development of a Conversation Optimized Robot Assistant (CORA). CORA is a voice assistant powered by OpenAI's ChatGPT for both user intent detection and general LLM responses.

This project also uses the Amazon Polly service (AWS) for voice synthesis, and the SpeechRecognition library, backed by Google's speech-to-text service, for user speech recognition. pydub and simpleaudio are used to play the audio coming back from Polly without having to write any audio files to disk.
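As a rough illustration of the in-memory playback described above, here is a minimal sketch rather than the project's actual code. It assumes Polly is asked for raw PCM output (16-bit mono), which pydub can wrap without ffmpeg. The helper names, the `Joanna` voice and the 16 kHz sample rate are assumptions chosen for the example; the environment variable names follow the .env file described under Getting Started.

```python
import os

SAMPLE_RATE = 16000  # assumption: 16 kHz PCM output from Polly


def build_polly_request(text, voice_id="Joanna"):
    """Keyword arguments for Polly's synthesize_speech call (hypothetical helper)."""
    return {
        "Text": text,
        "VoiceId": voice_id,             # assumption: any voice available to your account
        "OutputFormat": "pcm",           # raw 16-bit signed little-endian mono samples
        "SampleRate": str(SAMPLE_RATE),  # Polly expects the rate as a string
    }


def speak(text):
    """Synthesize text with Polly and play it straight from memory."""
    # Third-party imports are kept local so build_polly_request stays usable
    # on machines without the audio libraries installed.
    import boto3
    import simpleaudio
    from pydub import AudioSegment

    polly = boto3.client(
        "polly",
        aws_access_key_id=os.environ["AWS_ACCESS_KEY"],
        aws_secret_access_key=os.environ["AWS_SECRET_KEY"],
        region_name=os.environ["AWS_REGION"],
    )
    response = polly.synthesize_speech(**build_polly_request(text))
    pcm_bytes = response["AudioStream"].read()  # the audio never touches the disk

    # Wrap the raw samples so pydub knows their width, rate and channel count.
    segment = AudioSegment(
        data=pcm_bytes, sample_width=2, frame_rate=SAMPLE_RATE, channels=1
    )
    playback = simpleaudio.play_buffer(
        segment.raw_data,
        num_channels=segment.channels,
        bytes_per_sample=segment.sample_width,
        sample_rate=segment.frame_rate,
    )
    playback.wait_done()  # block until the audio has finished playing
```

Calling `speak("Hello, I am cora")` would need valid AWS credentials in the environment and a working audio output device.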

Project Dependencies:

  • Python 3.11.6
  • OpenAI API Key
  • AWS Polly Key
  • Microsoft Visual C++ 14.0 or greater
  • SpeechRecognition
  • simpleaudio
  • pydub
  • boto3
  • python-dotenv
  • openai
  • pyaudio

Road Map:

  • Initial text and speech recognition
  • Synthesize voice from AWS Polly
  • Integration with OpenAI ChatGPT
  • Upgrade the OpenAI service to use function calling
  • Simple utility functions for logging to the screen
  • Simple activation on wake-up words
  • Simple speech visualiser using pygame
  • Change visualisation depending on sleeping or not sleeping
  • Display logging output in the visualiser
  • Make it easier to setup the project from scratch (use poetry)
  • Report daily schedule skill function
  • Allow cora to monitor things and report back/notify as events occur (third thread)
  • Make unit tests
  • Store message history between sessions
  • Build and implement ML model for wake-up word detection
  • Support for local LLM instead of using chatgpt service

Getting Started:

  1. Install Python 3.11.6 from: https://www.python.org/downloads/release/python-3116/

    • 3.11.6 is required at the moment because this is the latest version supported by pyaudio
  2. Clone this repo:

git clone https://github.com/Nixxs/cora.git
  3. Set up your local .env file in the project root:
AWS_ACCESS_KEY = "[YOUR OWN AWS ACCESS KEY]"
AWS_SECRET_KEY = "[THE CORRESPONDING SECRET KEY]"
AWS_REGION = "[AWS REGION YOU WANT TO USE]"
OPENAI_KEY = "[OPENAI API KEY]"
CHATGPT_MODEL = "gpt-3.5-turbo-0613"

cora uses the Amazon Polly service for its voice synthesis. To access this service, you will need to generate an access key and secret on your AWS account that has access to Polly. You'll also want to define your AWS region here, as well as your OpenAI key and the ChatGPT model you want to use. Make sure the model supports function calling, otherwise cora's skill functions won't work (at time of writing, either gpt-3.5-turbo-0613 or gpt-4-0613).
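To catch a missing or misspelled setting early, the values can be checked right after loading. The following is a minimal sketch, assuming the five variable names shown in the .env example above; `read_settings` and `load_settings_from_dotenv` are hypothetical helpers, not functions from the cora code base.

```python
import os

# The variable names used in the .env example.
REQUIRED_KEYS = (
    "AWS_ACCESS_KEY",
    "AWS_SECRET_KEY",
    "AWS_REGION",
    "OPENAI_KEY",
    "CHATGPT_MODEL",
)


def read_settings(env=None):
    """Return the required settings, or raise if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


def load_settings_from_dotenv():
    """Load a .env file into os.environ, then validate the settings."""
    # Imported here so read_settings works without python-dotenv installed.
    from dotenv import load_dotenv

    load_dotenv()  # by default, variables that are already set are not overridden
    return read_settings()
```

With this in place, a forgotten key fails at start-up with a clear message instead of surfacing later as an authentication error from AWS or OpenAI.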

  4. Install dependencies. Using poetry is easiest:
poetry install

OPTIONAL: pydub generally needs ffmpeg installed if you want to work with audio file formats or edit the audio at all. This project doesn't require any of that (at least not yet), as we just use simpleaudio to play the stream. However, you will get a warning from pydub on import if you don't have ffmpeg installed.

To cover all bases, you can download ffmpeg and add it to your PATH.
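If you are unsure whether ffmpeg is visible on your PATH, a quick check from Python might look like the sketch below. `ffmpeg_on_path` is a hypothetical helper name; it only uses the standard library.

```python
import shutil


def ffmpeg_on_path():
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


print("ffmpeg found" if ffmpeg_on_path() else "ffmpeg not found on PATH")
```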

  5. Then run the entry script using:
poetry run cora
  6. How to use CORA:
  • The wake word for cora is "cora". At start up, cora won't do anything except listen for the wake word.
  • If the wake word is detected, cora will respond.
    • You can say 'cora' and your query in a single sentence and cora will both wake up and respond.
  • After cora has awoken, you can continue your conversation until you specifically ask cora to either go to 'sleep' or 'shut down'.
    • In 'sleep' mode, cora will stop responding until you say the wake word.
    • If you ask cora to 'shut down' at any point, cora's loops will end gracefully and the program will exit.
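The wake, sleep and shut-down behaviour described above can be pictured as a small state machine. The sketch below is an illustration only: it uses plain keyword matching on the recognised text, whereas cora itself may route these requests through ChatGPT. The function name `handle_transcript` and the state names are assumptions made for the example.

```python
import re

WAKE_WORD = "cora"


def handle_transcript(state, transcript):
    """Return (new_state, query) for one recognised utterance.

    state is "sleeping" or "awake"; new_state may also be "shutdown".
    query is the text to answer, "" for a bare wake word, or None
    when nothing should be answered.
    """
    # Lowercase and drop punctuation so "Cora," still matches the wake word.
    tokens = re.findall(r"[a-z']+", transcript.lower())

    if state == "sleeping":
        if WAKE_WORD not in tokens:
            return "sleeping", None  # ignore everything except the wake word
        # Anything said after the wake word is treated as the query.
        tokens = tokens[tokens.index(WAKE_WORD) + 1:]

    text = " ".join(tokens)
    if "shut down" in text:
        return "shutdown", None  # the caller should end its loops and exit
    if "sleep" in tokens:
        return "sleeping", None
    return "awake", text
```

For example, `handle_transcript("sleeping", "Cora, what's the time?")` wakes the assistant and passes on the query in one step, while the same sentence without the wake word is ignored.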

Additional Notes:

  • Conversations are logged in the cora/logs folder and organised by date
  • CORA relies on lots of external services like Google speech-to-text. Even when sleeping, cora is sending microphone audio to Google to check whether the wake word was detected. At some stage we will have a local model to detect this instead, but for now it's all going to Google, so be wary of that.
  • Take a look at cora's skills in the cora_skills.py file and make your own skills that might be relevant to you. Skills are activated when ChatGPT thinks the user wants to use one of them, which gives cora access to everything you'd want to do (you just have to write the skill).
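To give a feel for what a skill involves, here is a minimal sketch of the function-calling pattern. It is not the layout of cora_skills.py, which may differ. With function calling, each skill is described to the model as a JSON schema, and when the model wants to use one it replies with a function name plus a JSON-encoded argument string. The `report_weather` skill, the `SKILLS` registry and `run_skill` are hypothetical names invented for the example.

```python
import json


def report_weather(city):
    """Hypothetical skill; a real one would call a weather service."""
    return f"I don't have live data, but you asked about the weather in {city}."


# Descriptions sent to the model so it knows which skills exist.
SKILL_SCHEMAS = [
    {
        "name": "report_weather",
        "description": "Report the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    }
]

# Maps the names the model may return to the Python functions that do the work.
SKILLS = {"report_weather": report_weather}


def run_skill(function_call):
    """Run the skill named in a function call of the form {'name', 'arguments'}."""
    skill = SKILLS.get(function_call["name"])
    if skill is None:
        return f"Unknown skill: {function_call['name']}"
    # The arguments arrive as a JSON string, not as a dict.
    arguments = json.loads(function_call["arguments"] or "{}")
    return skill(**arguments)
```

Adding a skill under this scheme means writing the function, describing it in `SKILL_SCHEMAS`, and registering it in `SKILLS`; the text returned by `run_skill` would then be handed back to the model or spoken directly.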

Local Voices:

In an earlier version of the project we were using local voices. At some stage this might still be useful if we don't want to pay for AWS Polly anymore.
