# ivrit

ivrit.ai helper package: a Python package providing wrappers around ivrit.ai's capabilities.
## Installation

```bash
pip install ivrit
```
## Usage

### Audio Transcription

The `ivrit` package provides audio transcription functionality using multiple engines.

#### Basic Usage
```python
import ivrit

# Transcribe a local audio file
model = ivrit.load_model(engine="faster-whisper", model="ivrit-ai/whisper-large-v3-turbo-ct2")
result = model.transcribe(path="audio.mp3")

# With a custom device
model = ivrit.load_model(engine="faster-whisper", model="ivrit-ai/whisper-large-v3-turbo-ct2", device="cpu")
result = model.transcribe(path="audio.mp3")

print(result["text"])
```
#### Transcribe from a URL

```python
# Transcribe audio from a URL
model = ivrit.load_model(engine="faster-whisper", model="ivrit-ai/whisper-large-v3-turbo-ct2")
result = model.transcribe(url="https://example.com/audio.mp3")
print(result["text"])
```
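When you pass `url=` instead of `path=`, the library presumably fetches the remote file to local storage before transcribing. A minimal sketch of that step, assuming a hypothetical `download_to_temp` helper (this is not the library's actual implementation):

```python
import shutil
import tempfile
import urllib.request

def download_to_temp(url: str) -> str:
    """Download a remote audio file to a temporary local path.

    Hypothetical helper illustrating the url= flow; preserves the
    file extension so downstream decoders can detect the format.
    Note: delete=False means the caller is responsible for cleanup.
    """
    filename = url.rsplit("/", 1)[-1]
    suffix = "." + filename.rsplit(".", 1)[-1] if "." in filename else ""
    with urllib.request.urlopen(url) as response:
        with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
            shutil.copyfileobj(response, tmp)
            return tmp.name
```

The downloaded path could then be handed to `model.transcribe(path=...)` exactly as in the local-file example above.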
#### Streaming Results

```python
# Get results as a stream (generator)
model = ivrit.load_model(engine="faster-whisper", model="base")
for segment in model.transcribe(path="audio.mp3", stream=True, verbose=True):
    print(f"{segment.start:.2f}s - {segment.end:.2f}s: {segment.text}")

# Or use the model class directly
model = ivrit.FasterWhisperModel(model="base")
for segment in model.transcribe(path="audio.mp3", stream=True):
    print(f"{segment.start:.2f}s - {segment.end:.2f}s: {segment.text}")

# Access word-level timing
for segment in model.transcribe(path="audio.mp3", stream=True):
    print(f"Segment: {segment.text}")
    for word in segment.extra_data.get("words", []):
        print(f"  {word['start']:.2f}s - {word['end']:.2f}s: '{word['word']}'")
```
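Since each streamed segment carries `start`, `end`, and `text`, a common downstream use is rendering subtitles. A self-contained sketch, using a stand-in `Segment` dataclass in place of the objects the model actually yields:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    # Stand-in for the segments yielded by model.transcribe(..., stream=True);
    # the real objects also carry extra_data with word-level timing.
    start: float
    end: float
    text: str

def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    """Render a stream of segments as an SRT subtitle file."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n{seg.text.strip()}\n"
        )
    return "\n".join(blocks)
```

Because this consumes a generator, the SRT blocks can be written out as segments arrive rather than after the full file is transcribed.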
#### Async Transcription (RunPod Only)

For RunPod models, you can use async transcription for better performance:
```python
import asyncio

from ivrit.audio import load_model

async def transcribe_async():
    # Load a RunPod model
    model = load_model(
        engine="runpod",
        model="large-v3-turbo",
        api_key="your-api-key",
        endpoint_id="your-endpoint-id",
    )

    # Stream results asynchronously
    async for segment in model.transcribe_async(path="audio.mp3", language="he"):
        print(f"{segment.start:.2f}s - {segment.end:.2f}s: {segment.text}")

# Run the async function
asyncio.run(transcribe_async())
```
**Note:** Async transcription is only available for RunPod models. The sync `transcribe()` method uses the original sync implementation.
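The `async for` pattern above works with any async generator, so it can be exercised without a RunPod endpoint. A runnable sketch with a mock in place of `model.transcribe_async` (the `Segment` dataclass and `fake_transcribe_async` are stand-ins, not part of the library):

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Segment:
    start: float
    end: float
    text: str

async def fake_transcribe_async(path: str):
    # Stand-in for model.transcribe_async(...): yields segments as they
    # arrive from the endpoint instead of blocking for the full result.
    for seg in [Segment(0.0, 1.0, "שלום"), Segment(1.0, 2.0, "עולם")]:
        await asyncio.sleep(0)  # simulate waiting on the network
        yield seg

async def collect_text(path: str) -> str:
    """Accumulate streamed segment texts into one transcript string."""
    parts = []
    async for segment in fake_transcribe_async(path):
        parts.append(segment.text)
    return " ".join(parts)

print(asyncio.run(collect_text("audio.mp3")))  # שלום עולם
```

Swapping `fake_transcribe_async` for a real `model.transcribe_async` call gives the RunPod flow shown above.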
## API Reference

### `load_model()`

Load a transcription model for the specified engine and model.

**Parameters**

- `engine` (`str`): Transcription engine to use. Options: `"faster-whisper"`, `"stable-ts"`, `"runpod"`
- `model` (`str`): Model name for the selected engine
- `device` (`str`, optional): Device to use for inference. Default: `"auto"`. Options: `"auto"`, `"cpu"`, `"cuda"`, `"cuda:0"`, etc.
- `model_path` (`str`, optional): Custom path to the model (for faster-whisper)

**Returns**

A `TranscriptionModel` object that can be used for transcription.

**Raises**

- `ValueError`: If the engine is not supported
- `ImportError`: If required dependencies are not installed
### `transcribe()` and `transcribe_async()`

Transcribe audio using the loaded model.

**Parameters**

- `path` (`str`, optional): Path to the audio file to transcribe
- `url` (`str`, optional): URL to download and transcribe
- `blob` (`str`, optional): Base64-encoded blob data to transcribe
- `language` (`str`, optional): Language code for transcription (e.g., `'he'` for Hebrew, `'en'` for English)
- `stream` (`bool`, optional): Whether to return results as a generator (`True`) or a full result (`False`); only for `transcribe()`
- `diarize` (`bool`, optional): Whether to enable speaker diarization
- `verbose` (`bool`, optional): Whether to enable verbose output
- `**kwargs`: Additional keyword arguments for the transcription model

**Returns**

- `transcribe()`: If `stream=True`, a generator yielding transcription segments; if `stream=False`, the complete transcription result as a dictionary
- `transcribe_async()`: An async generator yielding transcription segments

**Raises**

- `ValueError`: If multiple input sources are provided, or none is provided
- `FileNotFoundError`: If the specified path doesn't exist
- `Exception`: For other transcription errors

**Note:** `transcribe_async()` is only available for RunPod models and always returns an async generator.
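The input contract above (exactly one of `path`, `url`, or `blob`) can be sketched as a small validation step. This is an illustration of the documented behavior, not the library's actual code; `resolve_input` is a hypothetical name:

```python
import base64

def resolve_input(path=None, url=None, blob=None):
    """Enforce the documented contract: exactly one input source.

    Raises ValueError when zero or multiple sources are given,
    mirroring the Raises section above. A base64 blob is decoded
    back to raw audio bytes.
    """
    provided = {name: value
                for name, value in [("path", path), ("url", url), ("blob", blob)]
                if value is not None}
    if len(provided) == 0:
        raise ValueError("one of path, url, or blob must be provided")
    if len(provided) > 1:
        raise ValueError(f"only one input source allowed, got: {sorted(provided)}")
    name, value = next(iter(provided.items()))
    if name == "blob":
        # blob is base64-encoded audio data
        return name, base64.b64decode(value)
    return name, value
```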
## Architecture

The `ivrit` package uses an object-oriented design with a base `TranscriptionModel` class and specific implementations for each transcription engine.

### Model Classes

- `TranscriptionModel`: Abstract base class for all transcription models
- `FasterWhisperModel`: Implementation for the Faster Whisper engine
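The class hierarchy can be sketched with Python's `abc` module. This is a simplified illustration of the design, assuming details not shown in this README (the real classes also handle `url`/`blob` inputs, streaming, and devices):

```python
from abc import ABC, abstractmethod

class TranscriptionModel(ABC):
    """Abstract base: every engine must provide transcribe()."""

    @abstractmethod
    def transcribe(self, path: str, **kwargs) -> dict:
        ...

class FasterWhisperModel(TranscriptionModel):
    """Sketch of the Faster Whisper implementation."""

    def __init__(self, model: str, device: str = "auto"):
        self.model = model
        self.device = device

    def transcribe(self, path: str, **kwargs) -> dict:
        # A real implementation would invoke faster-whisper here;
        # this stub only demonstrates the shape of the result.
        return {"text": f"<transcription of {path} with {self.model}>"}
```

Because `transcribe()` is abstract, `TranscriptionModel` itself cannot be instantiated; each engine subclass supplies its own implementation behind the shared interface that `load_model()` returns.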
### Usage Patterns

#### Pattern 1: Using `load_model()` (Recommended)

```python
# Step 1: Load the model
model = ivrit.load_model(engine="faster-whisper", model="base")

# Step 2: Transcribe audio
result = model.transcribe(path="audio.mp3")
```

#### Pattern 2: Direct Model Creation

```python
# Create the model directly
model = ivrit.FasterWhisperModel(model="base")

# Use the model
result = model.transcribe(path="audio.mp3")
```
#### Multiple Transcriptions

For multiple transcriptions, load the model once and reuse it:

```python
# Load the model once
model = ivrit.load_model(engine="faster-whisper", model="base")

# Use it for multiple transcriptions
result1 = model.transcribe(path="audio1.mp3")
result2 = model.transcribe(path="audio2.mp3")
result3 = model.transcribe(path="audio3.mp3")
```
## Installation Options

### Basic Installation

```bash
pip install ivrit
```

### With Faster Whisper Support

```bash
pip install ivrit[faster-whisper]
```
## Supported Engines

### faster-whisper

Fast and accurate speech recognition using the Faster Whisper model.

**Model Class:** `FasterWhisperModel`

**Available Models:** `base`, `small`, `medium`, `large`, `large-v2`, `large-v3`

**Features:**

- Word-level timing information
- Language detection with confidence scores
- Support for custom devices (CPU, CUDA, etc.)
- Support for custom model paths
- Streaming transcription

**Dependencies:** `faster-whisper>=1.1.1`

### stable-ts

Stable and reliable transcription using Stable-TS models.

**Status:** Not yet implemented
## Development

### Installation for Development

```bash
git clone <repository-url>
cd ivrit
pip install -e ".[dev]"
```

### Running Tests

```bash
pytest
```

### Code Formatting

```bash
black .
isort .
```
## License

MIT License - see the LICENSE file for details.
## File details

### ivrit-0.1.8.tar.gz

- **Download URL:** ivrit-0.1.8.tar.gz
- **Upload date:**
- **Size:** 1.2 MB
- **Tags:** Source
- **Uploaded using Trusted Publishing?** No
- **Uploaded via:** twine/6.1.0 CPython/3.11.2

**File hashes**

| Algorithm | Hash digest |
|---|---|
| SHA256 | `8ba23d4e85d368adee5db02849b84ebc1bf08389e91c5063561e8884c79cebf4` |
| MD5 | `c8d2915395e9b35df817871004e76352` |
| BLAKE2b-256 | `35c1e840c4862d56be1d60c9454c304110abaf6d8fa4762eab4b8d5d0572767a` |
### ivrit-0.1.8-py3-none-any.whl

- **Download URL:** ivrit-0.1.8-py3-none-any.whl
- **Upload date:**
- **Size:** 23.5 kB
- **Tags:** Python 3
- **Uploaded using Trusted Publishing?** No
- **Uploaded via:** twine/6.1.0 CPython/3.11.2

**File hashes**

| Algorithm | Hash digest |
|---|---|
| SHA256 | `da483653bab338f66469f5af92537db66d251375714c6234ead2b2162c41ea53` |
| MD5 | `a3065ee7ce103b037c09f33518b16928` |
| BLAKE2b-256 | `dcf644747dfcc4fa899e8c8fe23d68c1b5e5e429911c0daa8c4613f4062419a0` |