Whisper for your microphone

Whisper Mic

This repo is based on the work done here by OpenAI. It lets you use a microphone as a demo, and it copies some of the README from the original project.

Video Tutorial

The latest video tutorial for this repo can be seen here

An older video tutorial for this repo can be seen here

Professional Assistance

If you need paid professional help, it is available through this email.

Setup

Now a pip package!

  1. Create a venv of your choice.
  2. Run pip install whisper-mic

Available models and languages

There are five model sizes, four with English-only versions, offering speed and accuracy tradeoffs. Below are the names of the available models and their approximate memory requirements and relative speed.

Size    Parameters  English-only model  Multilingual model  Required VRAM  Relative speed
tiny    39 M        tiny.en             tiny                ~1 GB          ~32x
base    74 M        base.en             base                ~1 GB          ~16x
small   244 M       small.en            small               ~2 GB          ~6x
medium  769 M       medium.en           medium              ~5 GB          ~2x
large   1550 M      N/A                 large               ~10 GB         1x

For English-only applications, the .en models tend to perform better, especially for the tiny.en and base.en models. We observed that the difference becomes less significant for the small.en and medium.en models.
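
The table above can be turned into a small lookup when a script needs to choose a model name. This is a minimal sketch and not part of whisper_mic: the MODELS list copies the approximate VRAM figures from the table, and the helper name pick_model and its parameters are hypothetical.

```python
# Approximate VRAM requirements (GB) copied from the table above,
# ordered from smallest to largest model.
# Each entry: (name, required VRAM in GB, has an English-only variant)
MODELS = [
    ("tiny", 1, True),
    ("base", 1, True),
    ("small", 2, True),
    ("medium", 5, True),
    ("large", 10, False),  # no English-only variant
]


def pick_model(vram_gb, english_only=False):
    """Return the largest model whose approximate VRAM need fits vram_gb.

    Hypothetical helper for illustration; returns None if nothing fits.
    """
    best = None
    for name, required_gb, has_en in MODELS:
        if required_gb <= vram_gb:
            # Use the .en variant when one exists and English-only is requested.
            best = f"{name}.en" if english_only and has_en else name
    return best


print(pick_model(4, english_only=True))
print(pick_model(12, english_only=True))
```

Because the VRAM figures are approximate, treat the result as a starting point rather than a guarantee that the model will fit.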

Microphone Demo

You can use the model with a microphone using the whisper_mic program. Use -h to see flag options.

Some of the more important flags are the --model and --english flags.

Transcribing To A File

Running whisper_mic --loop --dictate types the words you say at your active cursor.
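
If you want to launch the program from a script, the flags named in this README can be assembled into an argument list. The sketch below is only an illustration: the function build_command is hypothetical, and it assumes that --model takes a model name while --english, --loop and --dictate are on/off switches. Run whisper_mic -h to confirm before relying on it.

```python
def build_command(model=None, english=False, loop=False, dictate=False):
    """Assemble a whisper_mic argument list from the flags named in this README.

    Assumption: --model takes a model name, and --english, --loop and
    --dictate are on/off switches. Confirm with `whisper_mic -h`.
    """
    cmd = ["whisper_mic"]
    if model is not None:
        cmd += ["--model", model]
    if english:
        cmd.append("--english")
    if loop:
        cmd.append("--loop")
    if dictate:
        cmd.append("--dictate")
    return cmd


# Print the command for continuous dictation instead of running it.
print(" ".join(build_command(loop=True, dictate=True)))
```

The returned list can be passed to subprocess.run once whisper_mic is installed and on your PATH.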

Usage In Other Projects

You can use this code in other projects rather than just use it for a demo. You can do this with the listen method.

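from whisper_mic.whisper_mic import WhisperMic

# Create the microphone wrapper with its default settings.
mic = WhisperMic()
# Listen on the microphone and get the transcription back.
result = mic.listen()
print(result)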

To see the possible arguments, look at the cli.py file.
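
As one way to build on the listen method, the sketch below collects several utterances until a stop phrase is heard. It assumes only that listen() returns the transcribed text as a string. The function listen_until, its parameters, and the FakeMic stand-in are hypothetical and exist so the example runs without a microphone.

```python
def listen_until(mic, stop_phrase="stop listening", max_utterances=10):
    """Collect transcripts from mic.listen() until stop_phrase is heard.

    `mic` is any object with a listen() method that returns text, such as
    a WhisperMic instance. The function name, stop phrase and limit are
    hypothetical choices for this sketch.
    """
    transcripts = []
    for _ in range(max_utterances):
        text = mic.listen()
        if not text:
            continue  # skip empty results
        if stop_phrase in text.lower():
            break
        transcripts.append(text.strip())
    return transcripts


class FakeMic:
    """Stand-in for WhisperMic so the sketch runs without a microphone."""

    def __init__(self, phrases):
        self._phrases = iter(phrases)

    def listen(self):
        # Return the next canned phrase, or an empty string when exhausted.
        return next(self._phrases, "")


print(listen_until(FakeMic(["Hello there.", " Take a note. ", "Stop listening.", "ignored"])))
```

To use it with real audio, pass a WhisperMic instance in place of FakeMic.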

Troubleshooting

If you are having issues, try the following:

sudo apt install portaudio19-dev python3-pyaudio
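
Before or after installing the system packages, it can help to check whether Python can locate the audio module at all. The following is a rough diagnostic sketch: the helper missing_modules is hypothetical, and "pyaudio" is used here as the import name of the PyAudio package. It only checks that a module can be found, not that its native libraries load correctly.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "pyaudio" is the import name of the PyAudio package; adjust the list
# to the modules you want to check.
missing = missing_modules(["pyaudio"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All checked modules can be located.")
```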

Contributing

Some ideas that you can add are:

  1. Support different implementations of Whisper.
  2. Add additional optional functionality.
  3. Use PyAudio to get the audio for the listen method to speed things up.

License

The model weights of Whisper are released under the MIT License. See their repo for more information.

The code in this repo is under the MIT license. See LICENSE for further details.

Thanks

Until recently, high-performing speech-to-text models were available only through paid services. With this release, I am excited for the many applications that will come.
