Speech Recognition Library

MetaidigitSTT

MetaidigitSTT is a lightweight Speech-to-Text (STT) module for Python. It records audio from a microphone, saves the recording to a file, and transcribes that file with Groq's Whisper model.

Features

  • Records audio via microphone and saves it as an MP3 file.
  • Uses Groq's Whisper-Large-V3 model for transcription.
  • Simple API centered on two functions, record_audio and transcribe_with_groq.

Installation

Ensure you have Python 3.10 or higher installed.

Using Pip

From the project root (the folder that contains requirements.txt), install the dependencies:

pip install -r requirements.txt

Required Dependencies

  • ffmpeg
  • SpeechRecognition
  • pyaudio
  • pydub
  • groq
  • python-dotenv

ffmpeg is a system tool rather than a Python package; pydub relies on it to export MP3 files, so you may need to install it manually:

# Ubuntu/Debian
sudo apt install ffmpeg

# macOS (Homebrew)
brew install ffmpeg

# Windows (Chocolatey)
choco install ffmpeg

Usage

1. Recording Audio

The record_audio function records audio from the microphone and saves it as an MP3 file.

from metaidigitstt import record_audio

record_audio("audio/test.mp3", timeout=20, phrase_time_limit=10)
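Whether record_audio creates a missing parent folder such as audio/ is not stated here, so a cautious caller can create it first. The sketch below assumes it does not. prepare_output_path and record_to are hypothetical helper names, not part of metaidigitstt. The timeout and phrase_time_limit names match the arguments of SpeechRecognition's Recognizer.listen (seconds to wait for speech to start, and the maximum length of a phrase), but that the library forwards them there is also an assumption.

```python
from pathlib import Path


def prepare_output_path(filepath):
    """Create the parent folder of `filepath` if needed and return it as a Path."""
    path = Path(filepath)
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


def record_to(filepath, timeout=20, phrase_time_limit=10):
    """Hypothetical wrapper: make sure the folder exists, then record."""
    # Imported lazily so prepare_output_path stays usable on machines
    # where metaidigitstt (or a microphone) is not available.
    from metaidigitstt import record_audio

    path = prepare_output_path(filepath)
    record_audio(str(path), timeout=timeout, phrase_time_limit=phrase_time_limit)
    return path
```

Calling record_to("audio/test.mp3") would then behave like the call above, with the audio/ folder created on demand.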

2. Transcribing Audio

The transcribe_with_groq function transcribes recorded audio using Groq's Whisper model.

from metaidigitstt import transcribe_with_groq

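# Placeholder only; avoid committing a real key (see Environment Variables).
GROQ_API_KEY = "your_api_key_here"
stt_model = "whisper-large-v3"  # Groq model identifier
audio_filepath = "audio/test.mp3"  # path of the recording to transcribe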

transcription = transcribe_with_groq(stt_model, audio_filepath, GROQ_API_KEY)
print(transcription)
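A missing or empty recording is cheaper to catch locally than after a network call. The wrapper below is a sketch of such a pre-flight check. transcribe_file is a hypothetical name, and the argument order for transcribe_with_groq is copied from the call above.

```python
import os


def transcribe_file(audio_filepath, api_key, stt_model="whisper-large-v3"):
    """Hypothetical wrapper: validate the recording, then transcribe it."""
    if not os.path.isfile(audio_filepath):
        raise FileNotFoundError(f"No recording found at {audio_filepath!r}")
    if os.path.getsize(audio_filepath) == 0:
        raise ValueError(f"Recording at {audio_filepath!r} is empty")

    # Imported lazily so the checks above run before the package is needed.
    from metaidigitstt import transcribe_with_groq

    # Same positional order as in the README: model, file path, API key.
    return transcribe_with_groq(stt_model, audio_filepath, api_key)
```

Raising early keeps the error message specific to the local file instead of surfacing whatever the API client reports.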

Environment Variables

Set up your API key in a .env file:

GROQ_API_KEY=your_api_key_here
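Whether metaidigitstt reads the .env file on its own is not stated, so a script may need to load the key itself. A minimal sketch, assuming python-dotenv from the dependency list is installed (it falls back to variables already exported in the shell if not). load_groq_api_key is a hypothetical helper name.

```python
import os


def load_groq_api_key(env=None):
    """Return GROQ_API_KEY from `env` (default: os.environ) or raise."""
    if env is None:
        try:
            from dotenv import load_dotenv  # provided by python-dotenv

            # Loads variables from a .env file if one is found; variables
            # that are already set in the environment are not overridden.
            load_dotenv()
        except ImportError:
            pass  # rely on variables already exported in the shell
        env = os.environ

    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; add it to .env or export it")
    return key
```

The returned value can be passed as the API key argument of transcribe_with_groq, which keeps the key out of source files.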

Setup for Development

Install Locally

pip install .

Running Tests

Ensure dependencies are installed and run:

pytest tests/

Packaging & Deployment

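To build the source and wheel distributions:

python setup.py sdist bdist_wheel

Invoking setup.py directly is deprecated in recent setuptools releases; running python -m build (from the build package) produces the same two artifacts.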

To install the package locally:

pip install .

License

This project is licensed under the MIT License.

Author

Suhal Samad
