Core library powering a GUI providing an audio interface to GPT-3.
Jabberwocky
Video demo: https://user-images.githubusercontent.com/40480855/132139847-0d0014b9-022e-4684-80bf-d46031ca4763.mp4
This was not really designed to be used as a standalone library - it was mostly used as a convenient way to structure and import code in other parts of the project. Some components may be reusable for other projects combining GPT-3 with audio, however.
Project Description
This is the core library powering a GUI that provides an audio interface to GPT-3. My main goal was to provide a convenient way to interact with various experts or public figures: imagine discussing physics with Einstein or hip hop with Kanye (or hip hop with Einstein? 🤔). I often find writing and speaking to be wildly different experiences and I imagined the same would be true when interacting with GPT-3. This turned out to be partially true - the default Mac text-to-speech functionality I'm using here is certainly not as lifelike as I'd like. Perhaps more powerful audio generation methods will pop up in a future release...
We also provide Task Mode containing built-in prompts for a number of sample tasks:
- Summarization
- Explain like I'm 5
- Translation
- How To (step by step instructions for performing everyday tasks)
- Writing Style Analysis
- Explain machine learning concepts in simple language
- Generate ML paper abstracts
- MMA Fight Analysis and Prediction
Getting Started
- Clone the repo.
git clone https://github.com/hdmamin/jabberwocky.git
- Install the necessary packages. I recommend using a virtual environment of some kind (virtualenv, conda, etc.). If you're not using macOS, you could try installing portaudio with your system's package manager, but app behavior on other systems is unknown.
brew install portaudio
pip install -r requirements.txt
python -m nltk.downloader punkt
If you have make installed, you can simply use the command:
make install
- Add your OpenAI API key somewhere the program can access it. There are two ways to do this:
echo your_openai_api_key > ~/.openai
or
export OPENAI_API_KEY=your_openai_api_key
(Make sure to use your actual key, not the literal text your_openai_api_key.)
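For anyone reusing this setup in another script, the two options above can be sketched as a small lookup helper. This is an illustration only, not the library's actual code: the function name load_openai_key, its parameters, and the choice to check the environment variable before the file are all assumptions.

```python
import os
from pathlib import Path


def load_openai_key(key_file="~/.openai", env=None):
    """Hypothetical helper: find an OpenAI API key using the two options above.

    Assumption: the OPENAI_API_KEY environment variable is checked first,
    then the key file. The real app may use a different order.
    """
    env = os.environ if env is None else env

    # Option 2 from the instructions: exported environment variable.
    key = env.get("OPENAI_API_KEY", "").strip()
    if key:
        return key

    # Option 1 from the instructions: key stored in a file such as ~/.openai.
    path = Path(key_file).expanduser()
    if path.is_file():
        # `echo key > file` leaves a trailing newline, so strip whitespace.
        key = path.read_text().strip()
        if key:
            return key

    raise RuntimeError(
        "No API key found. Set OPENAI_API_KEY or write the key to ~/.openai."
    )
```

Stripping whitespace matters because echo appends a newline to the file.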
- Run the app.
python gui/main.py
Or with make:
make run
Usage
Conversation Mode
In conversation mode, you can chat with a number of pre-defined personas or add new ones. New personas can be autogenerated or defined manually.
See data/conversation_personas for examples of autogenerated prompts. You can likely achieve better results using custom prompts though.
Conversation mode only supports spoken input, though you can edit flawed transcriptions manually. Querying GPT-3 with nonsensical or ungrammatical text will negatively affect response quality.
Task Mode
In task mode, you can ask GPT-3 to perform a number of pre-defined tasks. Written and spoken input are both supported. By default, GPT-3's response is both typed out and read aloud.
Transcripts of responses from a small subset of non-conversation tasks can be found in the data/completions directory. You can also save your own completions while using the app.
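Responses are read aloud using the Mac's built-in speech command. The snippet below is a rough sketch of how a script might call the macOS say command, not the code in speech.py. The function names build_say_command and speak are hypothetical. The -v (voice) and -r (words per minute) flags belong to say itself.

```python
import platform
import shutil
import subprocess


def build_say_command(text, voice=None, rate=None):
    """Build the argument list for the macOS `say` command (hypothetical helper)."""
    cmd = ["say"]
    if voice:
        cmd += ["-v", voice]      # e.g. a voice installed on the system
    if rate:
        cmd += ["-r", str(rate)]  # speaking rate in words per minute
    cmd.append(text)
    return cmd


def speak(text, voice=None, rate=None):
    """Read text aloud. Returns False when `say` is unavailable (non-Mac systems)."""
    if platform.system() != "Darwin" or shutil.which("say") is None:
        return False
    # Passing a list (not a shell string) avoids quoting problems in the text.
    subprocess.run(build_say_command(text, voice, rate), check=True)
    return True
```

On systems other than macOS, speak returns False rather than raising, which matches the note above that behavior on other platforms is unknown.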
Usage Notes
The first time you speak, the speech transcription back end will take a few seconds to calibrate to the level of ambient noise in your environment. You will know it's ready for transcription when you see a "Listening..." message appear below the Record button. Calibration only occurs once to save time.
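The "calibrate once" behavior described above can be sketched generically as follows. This is not the app's transcription code: OneTimeCalibrator and measure_noise are hypothetical names, and measure_noise stands in for whatever the speech back end does to sample ambient noise.

```python
import threading


class OneTimeCalibrator:
    """Run an expensive calibration step on first use and cache the result."""

    def __init__(self, measure_noise):
        # `measure_noise` is a hypothetical callable returning a noise threshold.
        self._measure_noise = measure_noise
        self._lock = threading.Lock()
        self._calibrated = False
        self._threshold = None

    def threshold(self):
        # The lock keeps two simultaneous first calls (e.g. button press plus
        # hotkey) from triggering calibration twice.
        with self._lock:
            if not self._calibrated:
                self._threshold = self._measure_noise()
                self._calibrated = True
            return self._threshold
```

The first call to threshold() pays the calibration cost. Later calls return the cached value immediately, which is why only the first recording has a noticeable delay.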
Hotkeys
CTRL + SHIFT: Start recording audio (same as pressing the "Record" button).
CTRL + a: Get GPT-3's response to whatever input you've recorded (same as pressing the "Get Response" button).
Project Members
- Harrison Mamin
Repo Structure
jabberwocky/
├── data # Raw and processed data. Some files are excluded from GitHub but the ones needed to run the app are there.
├── notes # Miscellaneous notes from the development process stored as raw text files.
├── notebooks # Jupyter notebooks for experimentation and exploratory analysis.
├── reports # Markdown reports (performance reports, blog posts, etc.)
├── gui # GUI scripts. The main script should be run from the project root directory.
└── lib # Python package. Code can be imported in analysis notebooks, py scripts, etc.
Start of auto-generated file data.
Last updated: 2021-09-12 13:55:26
File | Summary | Line Count | Last Modified | Size |
---|---|---|---|---|
__init__.py | _ | 1 | 2021-09-12 13:54:19 | 22.00 b |
config.py | Define constants used throughout the project. | 21 | 2021-07-22 20:29:41 | 564.00 b |
core.py | Core functionality that ties together multiple APIs. | 609 | 2021-09-06 13:39:53 | 24.21 kb |
external_data.py | Functionality to fetch and work with YouTube transcripts. | 281 | 2021-08-06 20:25:01 | 9.97 kb |
openai_utils.py | Utility functions for interacting with the gpt3 api. | 1320 | 2021-09-01 20:12:51 | 55.00 kb |
speech.py | Module to help us interact with mac's speech command. This lets the GUI read responses out loud. | 117 | 2021-08-20 21:05:11 | 4.16 kb |
utils.py | General purpose utilities. | 337 | 2021-08-04 20:02:18 | 10.93 kb |
End of auto-generated file data. Do not add anything below this.
Source Distribution
File details
Details for the file jabberwocky-1.0.1.tar.gz.
File metadata
- Download URL: jabberwocky-1.0.1.tar.gz
- Upload date:
- Size: 39.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.26.0 setuptools/53.0.0 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.3
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 94b53c253971565807c21ab26de730a59be6e1fb9363791a1e07ebc8c6d898c7 |
MD5 | 36b6c6fb7a68e8abe27aecb80e972983 |
BLAKE2b-256 | 42d177d57c27d862ff298c62859fb2075e5bc01f491f19fb90e52a22748270cb |