Whisper for your microphone

These details have not been verified by PyPI

Project description

Whisper Mic

This repo is based on OpenAI's Whisper. It lets you use a microphone as a demo, and it copies some of the README from the original project.

Video Tutorial

See the video tutorial for this repo here

The video is a bit out of date: the code has improved a lot since then and is now pip installable.

Professional Assistance

If you are in need of paid professional help, it is available through this email.

Setup

Now a pip package!

Create a venv of your choice.
Run pip install whisper-mic

Available models and languages

There are five model sizes, four with English-only versions, offering speed and accuracy tradeoffs. Below are the names of the available models and their approximate memory requirements and relative speed.

Size	Parameters	English-only model	Multilingual model	Required VRAM	Relative speed
tiny	39 M	`tiny.en`	`tiny`	~1 GB	~32x
base	74 M	`base.en`	`base`	~1 GB	~16x
small	244 M	`small.en`	`small`	~2 GB	~6x
medium	769 M	`medium.en`	`medium`	~5 GB	~2x
large	1550 M	N/A	`large`	~10 GB	1x

For English-only applications, the .en models tend to perform better, especially for the tiny.en and base.en models. We observed that the difference becomes less significant for the small.en and medium.en models.
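
As a rough illustration of how the table above can guide a choice, the sketch below picks the largest model whose approximate VRAM requirement fits a given budget. The `ModelInfo` type and the `pick_model` helper are hypothetical names made up for this example and are not part of whisper-mic; the numbers are copied from the table and are approximate.

```python
from typing import NamedTuple, Optional


class ModelInfo(NamedTuple):
    size: str
    english_only: Optional[str]  # None when no English-only variant exists
    multilingual: str
    vram_gb: int                 # approximate, taken from the table above
    relative_speed: int          # relative to `large`, taken from the table above


# Ordered from smallest to largest, mirroring the table.
MODELS = [
    ModelInfo("tiny", "tiny.en", "tiny", 1, 32),
    ModelInfo("base", "base.en", "base", 1, 16),
    ModelInfo("small", "small.en", "small", 2, 6),
    ModelInfo("medium", "medium.en", "medium", 5, 2),
    ModelInfo("large", None, "large", 10, 1),
]


def pick_model(available_vram_gb: float, english_only: bool = False) -> str:
    """Return the name of the largest model that fits the VRAM budget.

    Hypothetical helper for illustration only.
    """
    candidates = [m for m in MODELS if m.vram_gb <= available_vram_gb]
    if not candidates:
        raise ValueError("No model fits in the available VRAM")
    best = candidates[-1]  # MODELS is sorted by size, so the last fit is the largest
    if english_only and best.english_only is not None:
        return best.english_only
    # `large` has no English-only variant, so fall back to the multilingual name.
    return best.multilingual


print(pick_model(4))                      # small
print(pick_model(4, english_only=True))   # small.en
print(pick_model(12, english_only=True))  # large
```

Larger is not always better here: if latency matters more than accuracy, a smaller model from the same table may be the more sensible pick.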

Microphone Demo

You can use the model with a microphone using the whisper_mic program. Use -h to see flag options.

Some of the more important flags are --model and --english.
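
If you want to launch the demo from a script, a minimal sketch of building the command line could look like the following. It assumes that --model takes a model name as its value and that --english is a plain on/off switch; check the -h output to confirm both. The `build_command` helper is a hypothetical name used only for this example.

```python
import shlex
from typing import List


def build_command(model: str, english: bool = False) -> List[str]:
    """Build an argument list for the whisper_mic program.

    Assumption: --model takes a value and --english is a boolean switch.
    """
    cmd = ["whisper_mic", "--model", model]
    if english:
        cmd.append("--english")
    return cmd


cmd = build_command("small", english=True)
print(shlex.join(cmd))  # whisper_mic --model small --english

# To actually start the demo (needs whisper-mic installed and a microphone):
# import subprocess
# subprocess.run(cmd, check=True)
```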

Usage In Other Projects

You can also use this code in other projects, not just as a demo. You can do this with the listen method.

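from whisper_mic.whisper_mic import WhisperMic

# Create the microphone wrapper with its default settings.
mic = WhisperMic()
# Record from the microphone and transcribe what was said.
result = mic.listen()
print(result)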

To see what the possible arguments are, look at the cli.py file.
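
As one idea for building on the listen method, the sketch below keeps listening until a stop word is heard. It assumes listen() returns the transcribed text as a string, as the print call above suggests. The `collect_until` helper, the `Listener` protocol, and `FakeMic` are hypothetical names for this example; `FakeMic` stands in for a real microphone so the snippet runs without audio hardware.

```python
from typing import List, Protocol


class Listener(Protocol):
    """Anything with a listen() method, such as WhisperMic."""

    def listen(self) -> str: ...


def collect_until(mic: Listener, stop_word: str = "stop",
                  max_utterances: int = 10) -> List[str]:
    """Call mic.listen() repeatedly and collect text until stop_word is heard."""
    heard: List[str] = []
    for _ in range(max_utterances):
        text = (mic.listen() or "").strip()
        if not text:
            continue  # skip empty transcriptions
        if stop_word.lower() in text.lower():
            break
        heard.append(text)
    return heard


class FakeMic:
    """Test double that replays canned phrases instead of recording audio."""

    def __init__(self, phrases):
        self._phrases = iter(phrases)

    def listen(self) -> str:
        return next(self._phrases)


fake = FakeMic(["hello", " ", "turn on the lights", "Stop.", "ignored"])
print(collect_until(fake))  # ['hello', 'turn on the lights']
```

With the real package installed, passing `WhisperMic()` instead of `FakeMic(...)` should work the same way, as long as the assumption about the return value of listen() holds.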

Troubleshooting

If you are having issues with mic.py not running, try installing the PortAudio and PyAudio system packages (on Debian or Ubuntu):

sudo apt install portaudio19-dev python3-pyaudio
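
To check whether the install took effect, a quick sanity test is to see if Python can find the PyAudio module. This is only a sketch: `has_module` is a hypothetical helper, and it assumes PyAudio is imported under the module name `pyaudio`. It does not verify that a microphone is actually present.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


if has_module("pyaudio"):
    print("pyaudio is available")
else:
    print("pyaudio is missing; try the apt command above")
```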

Contributing

Some ideas that you can add are:

Supporting different implementations of Whisper
Adding additional optional functionality.
Using PyAudio to get the audio for the listen method to speed things up

License

The model weights of Whisper are released under the MIT License. See their repo for more information.

The code in this repo is released under the MIT License. See LICENSE for further details.

Thanks

Until recently, access to high-performing speech-to-text models was only available through paid services. With this release, I am excited for the many applications that will come.

Project details

Release history

Version	Release date
1.4.4	Jul 4, 2024
1.4.3	Jul 4, 2024
1.4.2	Mar 27, 2024
1.4.1	Mar 27, 2024
1.4.0	Jan 23, 2024
1.3.1	Nov 21, 2023
1.3.0	Oct 23, 2023
1.2.1	Aug 9, 2023
1.2.0	Aug 9, 2023
1.1.1 (this version)	Jul 5, 2023
1.1.0	Jul 5, 2023
1.0.1	Jun 29, 2023
1.0.0	Jun 29, 2023
0.0.1	Apr 21, 2023

Download files

Source Distribution

whisper_mic-1.1.1.tar.gz (6.0 kB)

Uploaded Jul 5, 2023 Source

Built Distribution

whisper_mic-1.1.1-py3-none-any.whl (6.7 kB)

Uploaded Jul 5, 2023 Python 3

File details

Details for the file whisper_mic-1.1.1.tar.gz.

File metadata

Download URL: whisper_mic-1.1.1.tar.gz
Upload date: Jul 5, 2023
Size: 6.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for whisper_mic-1.1.1.tar.gz
Algorithm	Hash digest
SHA256	`aae3878ad6c80f0e6ab094b8112f05147dae027b56e84cd13d5de96d35e7a5e1`
MD5	`84eefe061e8e1092ac303e7007dc7953`
BLAKE2b-256	`6e049a745da5c07aa45fa258ca48963bd412fde81cdbac3ec95f8fd4944b9474`

File details

Details for the file whisper_mic-1.1.1-py3-none-any.whl.

File metadata

Download URL: whisper_mic-1.1.1-py3-none-any.whl
Upload date: Jul 5, 2023
Size: 6.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for whisper_mic-1.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cd2018747ec8bf5051915ee503f76331841ad9968703cc15679c47591aaafc73`
MD5	`7d9a4204dbd5058ea0dc0b46028932f8`
BLAKE2b-256	`841ad0a50b98ec723ee6e806655a3e99e5a45ad40ff9bae6e682f18b1b3fbd07`