Official Python SDK for Cleanvoice AI audio processing
Cleanvoice Python SDK
Python SDK for the Cleanvoice API. Use it to submit media, poll edit jobs, and download processed results from Python applications and backend services.
Features
- Audio and video processing requests
- Transcription, summarization, and social-content options
- Sync and async clients with a matching high-level API
- Typed request and response models
- Automatic retries for transient network and service failures
- Built-in support for local files, in-memory audio, NumPy helpers, and video utilities
Installation
Install the SDK:
pip install cleanvoice-sdk
Quick Start
from cleanvoice import Cleanvoice
client = Cleanvoice.from_env()
result = client.process(
"https://example.com/podcast.mp3",
fillers=True,
normalize=True,
studio_sound=True,
summarize=True,
output_path="podcast_clean.wav",
)
print(f"Processed audio: {result.audio.url}")
print(f"Saved locally to: {result.audio.local_path}")
print(f"Summary: {result.transcript.summary}")
Common Usage Patterns
Most integrations fit one of these three patterns:
- Process and save in one call:
result = client.process(
"local_or_remote_media",
normalize=True,
studio_sound=True,
output_path="cleaned_output.wav",
)
- Process first, download later:
result = client.process("local_or_remote_media", normalize=True)
saved_path = result.audio.download("cleaned_output.wav")
- Use async in web backends or workers:
result = await async_client.process(
"local_or_remote_media",
normalize=True,
output_path="cleaned_output.wav",
)
In practice, output_path=... is the lowest-friction option for backend jobs because the SDK uploads, waits, downloads, and returns a ready-to-use local path in result.audio.local_path.
Authentication
Get your API key from the Cleanvoice Dashboard.
from cleanvoice import Cleanvoice
client = Cleanvoice(
api_key="your-api-key-here",
base_url="https://api.cleanvoice.ai/v2", # optional
timeout=60, # optional
)
Or set your API key in the environment (the manual scenario runner in this document uses CLEANVOICE_API_KEY) and use:
from cleanvoice import Cleanvoice
client = Cleanvoice.from_env()
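A backend job may prefer to fail fast with a clear message when the key is missing, before it constructs a client. The sketch below is a hypothetical helper, not part of the SDK. It assumes the key lives in CLEANVOICE_API_KEY, the variable name used by the manual scenario runner command in this document; whether from_env() reads any other variables is not stated here.

```python
import os


def require_api_key(env=None, var_name="CLEANVOICE_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper: `var_name` is an assumption based on the
    scenario-runner command shown elsewhere in this README.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before calling Cleanvoice.from_env()"
        )
    return key
```

Calling require_api_key() at process start-up surfaces a configuration problem before any media is uploaded.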
Network Resilience
The client automatically retries brief transient failures such as connection resets, connect/read timeouts on safe requests, and temporary HTTP responses like 429, 502, 503, and 504.
This is designed to absorb short backend restart windows without immediately failing common flows such as:
- check_auth()
- create_edit(...)
- get_edit(...)
- process(...) while polling for completion
Retries are intentionally conservative for edit creation so short backend restarts do not immediately fail a request or duplicate work.
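If your own code wraps other network calls around the SDK (for example, fetching a source URL before submitting it), the same idea can be applied on the caller side. The following is a minimal sketch of retry with exponential backoff, not the SDK's internal implementation. HTTPStatusError and call_with_retries are hypothetical names introduced for illustration; the status codes are the ones listed above.

```python
import random
import time

# Status codes the section above lists as temporary.
TRANSIENT_STATUSES = {429, 502, 503, 504}


class HTTPStatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status_code):
        super().__init__(f"HTTP status {status_code}")
        self.status_code = status_code


def call_with_retries(func, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc
        except HTTPStatusError as exc:
            if exc.status_code not in TRANSIENT_STATUSES:
                raise  # Not transient: surface it immediately.
            last_error = exc
        if attempt == max_attempts:
            raise last_error
        # Double the wait after each failed attempt and add a little jitter.
        delay = base_delay * (2 ** (attempt - 1))
        sleep(delay + random.uniform(0, delay / 10))
```

The sleep parameter is injectable so the helper can be exercised in tests without real waiting.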
API Reference
process(file_input, config=None, progress_callback=None, *, output_path=None, download=False, template_id=None, upload_type=None, **options)
Process an audio or video file with AI enhancement.
Parameters:
- file_input (str or (audio_array, sample_rate)): URL, local media path, or an in-memory audio array paired with its sample rate
- config (ProcessingConfig or dict, optional): Processing options
- progress_callback (callable, optional): Callback function for progress updates
- output_path (str, optional): Save the finished audio locally as part of the task
- download (bool, optional): Download the finished audio even when output_path is omitted
- template_id (int, optional): Apply a saved Cleanvoice template
- upload_type (str, optional): Forward a backend-specific upload type hint with the edit request
- **options: Direct config kwargs such as normalize=True or studio_sound=True
Returns: ProcessResult
from cleanvoice import Cleanvoice
client = Cleanvoice.from_env()
def progress_callback(data):
    print(f"Status: {data['status']}, Progress: {data.get('result', {}).get('done', 0)}%")
result = client.process(
"https://example.com/audio.mp3",
fillers=True,
stutters=True,
long_silences=True,
mouth_sounds=True,
breath=True,
remove_noise=True,
normalize=True,
studio_sound=True,
mute_lufs=-80,
target_lufs=-16,
export_format="wav",
summarize=True,
social_content=True,
progress_callback=progress_callback,
output_path="enhanced_audio.wav",
)
print(result.audio.url) # Download URL
print(result.audio.local_path) # Local saved file
print(result.audio.statistics) # Processing stats
print(result.transcript.text) # Full transcript
print(result.transcript.summary) # AI summary
Processing In-Memory Audio
If you already loaded audio with librosa, you can pass the returned (audio_array, sample_rate) tuple directly. The SDK writes a temporary WAV, uploads it, and continues normally.
import librosa
from cleanvoice import Cleanvoice
client = Cleanvoice.from_env()
audio, sample_rate = librosa.load("local_audio.wav", sr=None, mono=True)
result = client.process(
(audio, sample_rate),
studio_sound=True,
remove_noise=True,
output_path="processed_from_array.mp3",
)
print(result.media.local_path)
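To make the "writes a temporary WAV" step concrete, here is a standard-library sketch of how a mono float array could be written to a temporary 16-bit PCM WAV file. It is an illustration under stated assumptions (mono input, samples in the range -1.0 to 1.0, 16-bit output), not the SDK's actual implementation, and write_temp_wav is a hypothetical name.

```python
import os
import struct
import tempfile
import wave


def write_temp_wav(samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] to a temporary 16-bit PCM WAV.

    Hypothetical helper; returns the path of the file it created.
    """
    frames = bytearray()
    for sample in samples:
        # Clip out-of-range values, then scale to signed 16-bit integers.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))

    handle, path = tempfile.mkstemp(suffix=".wav")
    os.close(handle)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(int(sample_rate))
        wav_file.writeframes(bytes(frames))
    return path
```

The caller is responsible for deleting the returned file once it is no longer needed.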
create_edit(file_input, config=None, *, template_id=None, upload_type=None, **options)
Create an edit job without waiting for completion.
from cleanvoice import Cleanvoice
client = Cleanvoice.from_env()
edit_id = client.create_edit(
"https://example.com/audio.mp3",
fillers=True,
normalize=True,
studio_sound=True,
upload_type="podcast",
)
print(f'Edit ID: {edit_id}')
File Upload and Download
Upload Local Files
Upload local audio/video files for processing:
import librosa
from cleanvoice import Cleanvoice
client = Cleanvoice.from_env()
# Upload a file and get its URL
uploaded_url = client.upload_file("local_audio.mp3")
print(f"Uploaded to: {uploaded_url}")
# Upload with custom filename
uploaded_url = client.upload_file("local_audio.mp3", "my_custom_name.mp3")
# Process local file directly. The SDK uploads it automatically.
result = client.process("local_audio.mp3", fillers=True)
# Upload an in-memory array loaded with librosa.
audio, sample_rate = librosa.load("local_audio.wav", sr=None, mono=True)
uploaded_url = client.upload_file((audio, sample_rate), "from_array.wav")
Download Processed Files
Download the enhanced audio files:
# Download later from the result object
downloaded_path = result.audio.download("enhanced_audio.mp3")
print(f"Downloaded later to: {downloaded_path}")
# Download and get back (audio_array, sample_rate)
audio_array, sample_rate = result.download_audio(as_numpy=True)
print(audio_array.shape, sample_rate)
# Or let process handle the download inside the task
result = client.process(
"audio.mp3",
fillers=True,
normalize=True,
output_path="output.mp3",
)
print(f"Processed and saved to: {result.audio.local_path}")
# Process and download in one step
result, downloaded_path = client.process_and_download(
"audio.mp3",
"output.mp3",
fillers=True,
normalize=True,
)
print(f"Processed and saved to: {downloaded_path}")
output_path always saves the exact bytes returned by the API. The SDK does not transcode locally after download.
result.download_audio(as_numpy=True) and await result.download_audio_async(as_numpy=True) are available for audio results when you want the downloaded file loaded back into a NumPy array at the file's original sample rate.
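Because the saved bytes are never transcoded, a file name whose extension disagrees with the requested export_format produces a misleadingly named file. The sketch below shows one way to check this before submitting a job. extension_matches_format is a hypothetical helper, not an SDK function, and it cannot predict the format when export_format is "auto".

```python
from pathlib import Path


def extension_matches_format(output_path, export_format):
    """Compare the output file extension with the requested export format.

    Returns True or False, or None when export_format is "auto" and the
    final format is therefore unknown until the API responds.
    """
    if export_format == "auto":
        return None
    suffix = Path(output_path).suffix.lower().lstrip(".")
    return suffix == export_format.lower()
```

For example, pairing output_path="episode.mp3" with export_format="wav" would return False, which is a cue to rename the output file or change the format.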
Manual Scenario Runner
For an end-to-end local verification that uploads, waits, downloads, and writes JSON summaries into results_test/, run:
CLEANVOICE_API_KEY=your-api-key python examples/manual_test_showcase.py
You can target specific recipes with repeated --scenario flags such as audio_studio_sound_only, audio_all_inclusive, video_defaults, or video_all_inclusive.
Complete Workflow
from cleanvoice import Cleanvoice
client = Cleanvoice.from_env()
# Upload, process, and download in one call
result, output_file = client.process_and_download(
"input_audio.mp3", # Local file (automatically uploaded)
"enhanced_output.mp3", # Output filename
fillers=True,
normalize=True,
summarize=True,
)
get_edit(edit_id)
Get the status and results of an edit job.
from cleanvoice import Cleanvoice
client = Cleanvoice.from_env()
edit = client.get_edit(edit_id)
if edit.status == 'SUCCESS':
    print(f'Download URL: {edit.result.download_url}')
else:
    print(f'Status: {edit.status}')  # PENDING, STARTED, RETRY, FAILURE
check_auth()
Verify API authentication and get account information.
from cleanvoice import Cleanvoice
client = Cleanvoice.from_env()
account = client.check_auth()
print('Account info:', account)
Returns a typed mapping with common fields such as user, account_type, and credits_remaining, while preserving any extra account data returned by the API.
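One use of this mapping is a pre-flight check before queuing a large batch. The sketch below is a hypothetical helper that reads credits_remaining defensively. It assumes the field is numeric when present, which this document does not confirm.

```python
def has_enough_credits(account, minimum):
    """Return True/False, or None when the mapping has no credit count.

    Hypothetical helper; assumes `credits_remaining` is a number when present.
    """
    credits = account.get("credits_remaining")
    if credits is None:
        return None
    return credits >= minimum
```

It could be called as has_enough_credits(client.check_auth(), minimum=10), treating a None result as "unknown" rather than as a failure.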
Async Support
Use AsyncCleanvoice for async applications:
import asyncio
from cleanvoice import AsyncCleanvoice
async def main():
    async with AsyncCleanvoice.from_env() as client:
        result = await client.process(
            "https://example.com/audio.mp3",
            normalize=True,
            studio_sound=True,
            output_path="async_output.wav",
        )
        print(result.audio.local_path)
asyncio.run(main())
Local Media Utilities
The SDK includes local audio/video helper utilities. These helpers do not require FFmpeg.
Audio File Information
from cleanvoice import get_audio_info
info = get_audio_info('path/to/audio.mp3')
print(f"Duration: {info.duration}s")
print(f"Sample Rate: {info.sample_rate}Hz")
print(f"Channels: {info.channels}")
Video File Information
from cleanvoice import get_video_info
info = get_video_info('path/to/video.mp4')
print(f"Duration: {info.duration}s")
print(f"Resolution: {info.width}x{info.height}")
print(f"FPS: {info.fps}")
print(f"Has Audio: {info.has_audio}")
Extract Audio from Video
from cleanvoice import extract_audio_from_video
audio_path = extract_audio_from_video(
'path/to/video.mp4',
'extracted_audio.wav' # Optional output path
)
print(f"Extracted audio: {audio_path}")
Configuration Options
Audio Processing
| Option | Type | Default | Description |
|---|---|---|---|
| fillers | bool | False | Remove filler sounds (um, uh, etc.) |
| stutters | bool | False | Remove stutters |
| long_silences | bool | False | Remove long silences |
| mouth_sounds | bool | False | Remove mouth sounds |
| hesitations | bool | False | Remove hesitations |
| breath | bool or str | False | Reduce breath sounds |
| remove_noise | bool | True | Remove background noise |
| keep_music | bool | False | Preserve music sections |
| normalize | bool | False | Normalize audio levels |
| studio_sound | bool or str | False | AI-powered enhancement |
Output Options
| Option | Type | Default | Description |
|---|---|---|---|
| export_format | str | 'auto' | Output format: auto, mp3, wav, flac, m4a |
| mute_lufs | float | -80 | Mute threshold in LUFS (negative) |
| target_lufs | float | -16 | Target loudness in LUFS (negative) |
| export_timestamps | bool | False | Export edit timestamps |
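The constraints in this table (a fixed set of formats, negative LUFS values) can be checked locally before a job is submitted. The sketch below is a hypothetical validator built only from the values listed in the table; the SDK or API may apply additional rules that are not documented here.

```python
# Values taken from the Output Options table above.
ALLOWED_EXPORT_FORMATS = {"auto", "mp3", "wav", "flac", "m4a"}


def validate_output_options(export_format="auto", mute_lufs=-80, target_lufs=-16):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if export_format not in ALLOWED_EXPORT_FORMATS:
        problems.append(f"unsupported export_format: {export_format!r}")
    if mute_lufs >= 0:
        problems.append("mute_lufs must be negative")
    if target_lufs >= 0:
        problems.append("target_lufs must be negative")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.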
AI Features
| Option | Type | Default | Description |
|---|---|---|---|
| transcription | bool | False | Generate speech-to-text |
| summarize | bool | False | Generate AI summary. The SDK auto-enables transcription. |
| social_content | bool | False | Optimize for social media. The SDK auto-enables summarize. |
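The auto-enable rules in this table form a small dependency chain: social_content implies summarize, and summarize implies transcription. The sketch below shows that chain as a plain function so you can predict which options a request will effectively carry. resolve_ai_options is a hypothetical illustration, not the SDK's own code.

```python
def resolve_ai_options(options):
    """Return a copy of options with implied AI features switched on.

    Mirrors the rules in the table above: social_content -> summarize,
    and summarize -> transcription.
    """
    resolved = dict(options)
    if resolved.get("social_content"):
        resolved["summarize"] = True
    if resolved.get("summarize"):
        resolved["transcription"] = True
    return resolved
```

The order of the two checks matters: social_content must be resolved first so that the summarize it enables can in turn enable transcription.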
Other Options
| Option | Type | Default | Description |
|---|---|---|---|
| video | bool | auto-detected | Process video file |
| merge | bool | False | Merge multi-track audio |
| send_email | bool | False | Email results to account |
Examples
Basic Audio Cleaning
from cleanvoice import Cleanvoice
cv = Cleanvoice.from_env()
result = cv.process(
"https://example.com/podcast.mp3",
fillers=True,
long_silences=True,
normalize=True,
remove_noise=True,
)
print(f"Cleaned audio: {result.audio.url}")
print(f"Removed {result.audio.statistics.FILLER_SOUND} filler sounds")
Transcription and Summary
from cleanvoice import Cleanvoice
cv = Cleanvoice.from_env()
result = cv.process(
"https://example.com/interview.wav",
summarize=True,
normalize=True,
)
print('Title:', result.transcript.title)
print('Summary:', result.transcript.summary)
print('Chapters:', result.transcript.chapters)
Video Processing
from cleanvoice import Cleanvoice
cv = Cleanvoice.from_env()
result = cv.process(
"https://example.com/video.mp4",
studio_sound=True,
remove_noise=True,
transcription=True,
output_path='processed_video.mp4',
)
print('Returned media type:', 'video' if result.is_video else 'audio')
print('Processed file:', result.media.url)
print('Saved locally:', result.media.local_path)
When the SDK sees a video extension such as .mp4, it auto-forces video=True and emits a warning so callers know the returned asset will stay a video file.
Batch Processing
from cleanvoice import Cleanvoice
import time
cv = Cleanvoice.from_env()
files = [
"https://example.com/episode1.mp3",
"https://example.com/episode2.mp3",
"https://example.com/episode3.mp3"
]
edit_ids = []
for file in files:
    edit_id = cv.create_edit(file, fillers=True, normalize=True)
    edit_ids.append(edit_id)
# Poll for completion
results = []
for edit_id in edit_ids:
    while True:
        edit = cv.get_edit(edit_id)
        if edit.status == 'SUCCESS':
            results.append(edit)
            break
        elif edit.status == 'FAILURE':
            print(f"Failed: {edit_id}")
            break
        else:
            time.sleep(5)  # Wait 5 seconds before polling again
print(f'All processing completed: {len(results)} files')
Error Handling
from cleanvoice import Cleanvoice, ApiError, FileValidationError
cv = Cleanvoice.from_env()
try:
    result = cv.process(
        "https://example.com/audio.mp3",
        fillers=True,
        normalize=True,
    )
    print('Success:', result.audio.url)
except ApiError as e:
    print(f'API Error: {e.message}')
    if e.status_code:
        print(f'HTTP Status: {e.status_code}')
    print(f'Error Code: {e.error_code}')
except FileValidationError as e:
    print(f'File Error: {e}')
except Exception as e:
    print(f'Unexpected Error: {e}')
Supported File Formats
Audio Formats
- WAV (.wav)
- MP3 (.mp3)
- OGG (.ogg)
- FLAC (.flac)
- M4A (.m4a)
- AIFF (.aiff)
- AAC (.aac)
Video Formats
- MP4 (.mp4)
- MOV (.mov)
- WebM (.webm)
- AVI (.avi)
- MKV (.mkv)
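The two lists above can be turned into a quick local check that classifies an input by its extension before it is uploaded. The sketch below is a hypothetical helper using only the extensions listed in this section; it looks at the file name alone and does not inspect file contents, and the SDK's own detection logic may differ.

```python
from pathlib import Path
from urllib.parse import urlparse

# Extensions copied from the Supported File Formats lists above.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".ogg", ".flac", ".m4a", ".aiff", ".aac"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm", ".avi", ".mkv"}


def media_kind(path_or_url):
    """Return "audio", "video", or None based on the file extension."""
    # urlparse drops any query string or fragment from a URL; for a plain
    # local path it leaves the path component unchanged.
    suffix = Path(urlparse(str(path_or_url)).path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```

A None result is a reasonable point to reject the input early instead of waiting for a FileValidationError or an API error.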
Requirements
- Python 3.8+
- FFmpeg is not required for the SDK's local media helper utilities
Development
Installing for Development
git clone https://github.com/cleanvoice/cleanvoice-python-sdk
cd cleanvoice-python-sdk
pip install -e .
Running Tests
pytest
Code Formatting
black src/
isort src/
Type Checking
mypy src/
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
License
MIT License - see LICENSE file for details.
Support