Web-ready standardized file processing and serialization. Read, load and convert to standard file types with a common interface.
MediaToolkit
Ultra-Fast Python Media Processing • FFmpeg • OpenCV • PyAV
⚡ Lightning-fast • 🛠️ Simple API • 🔄 Any Format • 🌐 Web-ready • 🖥️ Cross-platform
MediaToolkit is a high-performance Python library for processing images, audio, and video with a unified, developer-friendly API. Built on FFmpeg (PyAV) and OpenCV for production-grade speed and reliability.
Perfect for: AI/ML pipelines, web services, batch processing, media automation, computer vision, and audio analysis.
📦 Installation
pip install media-toolkit
Note: Audio/video processing requires FFmpeg. PyAV usually installs it automatically, but if needed, install manually from ffmpeg.org.
⚡ Quick Start
One API for all media types - load from files, URLs, bytes, base64, or numpy arrays:
from media_toolkit import ImageFile, AudioFile, VideoFile, media_from_any
# Load any file; the content type is detected automatically and the matching media class is returned
audio = media_from_any("media/my_favorite_song.mp3") # -> AudioFile
# Load from any source
image = ImageFile().from_any("https://example.com/image.jpg")
audio = AudioFile().from_file("audio.wav")
video = VideoFile().from_file("video.mp4")
image_from_b64 = ImageFile().from_base64("data:image/png;base64,...")
# Convert to any format
image_array = image.to_np_array() # → numpy array (H, W, C)
audio_array = audio.to_np_array() # → numpy array (samples, channels)
image_base64 = image.to_base64() # → base64 string
video_bytes = video.to_bytes_io() # → BytesIO object
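To illustrate how a single entry point can accept files, URLs, bytes, and base64 data URIs, here is a minimal standard-library sketch of source classification. The function name `classify_source` and its labels are hypothetical and are not part of MediaToolkit; the library's actual detection logic may differ.

```python
import io
from urllib.parse import urlparse


def classify_source(source):
    """Return a label describing what kind of media source was passed in.

    Hypothetical helper for illustration only; not MediaToolkit's API.
    """
    if isinstance(source, (bytes, bytearray)):
        return "bytes"
    if isinstance(source, io.IOBase):
        return "stream"
    if isinstance(source, str):
        # Data URIs look like "data:<mime>;base64,<payload>"
        if source.startswith("data:") and ";base64," in source:
            return "data_uri"
        if urlparse(source).scheme in ("http", "https"):
            return "url"
        # Anything else is treated as a local file path
        return "path"
    raise TypeError(f"Unsupported source type: {type(source).__name__}")
```

A dispatcher like this would then hand the source to the matching loader, which is roughly the role `from_any` plays in the examples above.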
Batch Processing
from media_toolkit import MediaList, AudioFile
# Process multiple files efficiently
audio_files = MediaList([
    "song1.wav",
    "https://example.com/song2.mp3",
    b"raw_audio_bytes..."
])
for audio in audio_files:
    audio.save(f"converted_{audio.file_name}.mp3")  # Auto-convert on save
🖼️ Image Processing
OpenCV-powered image operations:
from media_toolkit import ImageFile
import cv2
# Load and process
img = ImageFile().from_any("image.png")
image_array = img.to_np_array() # → (H, W, C) uint8 array
# Apply transformations
flipped = cv2.flip(image_array, 0)
# Save processed image
ImageFile().from_np_array(flipped).save("flipped.jpg")
🎵 Audio Processing
FFmpeg/PyAV-powered audio operations:
from media_toolkit import AudioFile
# Load audio
audio = AudioFile().from_file("input.wav")
# Get numpy array for ML/analysis
audio_array = audio.to_np_array() # → (samples, channels) float32 in [-1, 1] range
# Inspect metadata
print(f"Sample rate: {audio.sample_rate} Hz; Channels: {audio.channels}; Duration: {audio.duration}")
# Format conversion (automatic re-encoding)
audio.save("output.mp3") # MP3
audio.save("output.flac") # FLAC (lossless)
audio.save("output.m4a") # AAC
# Create audio from numpy
new_audio = AudioFile().from_np_array(
    audio_array,
    sample_rate=audio.sample_rate,
    audio_format="wav"
)
Supported formats: WAV, MP3, FLAC, AAC, M4A, OGG, Opus, WMA, AIFF
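The audio example above notes that arrays are returned as float32 values in the [-1, 1] range. As a rough illustration of what that normalization means, the sketch below converts 16-bit PCM bytes to floats using only the standard library. The helper `pcm16_to_float` is hypothetical, and this is not necessarily how MediaToolkit performs the conversion internally.

```python
import struct


def pcm16_to_float(pcm_bytes):
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0).

    Hypothetical helper for illustration only.
    """
    count = len(pcm_bytes) // 2  # two bytes per sample
    samples = struct.unpack(f"<{count}h", pcm_bytes[: count * 2])
    # Dividing by 32768 maps the int16 range [-32768, 32767] into [-1.0, 1.0)
    return [sample / 32768.0 for sample in samples]


pcm = struct.pack("<3h", 0, 16384, -32768)
print(pcm16_to_float(pcm))  # [0.0, 0.5, -1.0]
```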
🎬 Video Processing
High-performance video operations:
from media_toolkit import VideoFile
import cv2
video = VideoFile().from_file("input.mp4")
# Extract audio track
audio = video.extract_audio("audio.mp3")
# Process frames
for i, frame in enumerate(video.to_stream()):
    if i >= 300:  # First 300 frames
        break
    # frame is a numpy array (H, W, C)
    # my_processing_function is a placeholder for your own frame transform
    processed = my_processing_function(frame)
    cv2.imwrite(f"frame_{i:04d}.png", processed)
# Create video from images
images = [f"frame_{i:04d}.png" for i in range(300)]
modified_video = VideoFile().from_files(images, frame_rate=30, audio_file="audio.mp3")
🌐 Web & API Integration
Native FastTaskAPI Support
Built-in integration with FastTaskAPI for simplified file handling:
from fast_task_api import FastTaskAPI, ImageFile, VideoFile
app = FastTaskAPI()
@app.task_endpoint("/process")
def process_media(image: ImageFile, video: VideoFile) -> VideoFile:
    # Automatic type conversion and validation
    # my_ai_inference is a placeholder for your own model call
    modified_video = my_ai_inference(image, video)
    # Any media object can be returned directly
    return modified_video
FastAPI Integration
from fastapi import FastAPI, UploadFile, File
from media_toolkit import ImageFile
app = FastAPI()
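@app.post("/process-image")
async def process_image(file: UploadFile = File(...)):
    image = ImageFile().from_any(file)  # Accepts the upload object directly
    return {"file_name": image.file_name}  # Illustrative response; return whatever your service needs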
HTTP Client Usage
import httpx
from media_toolkit import ImageFile
image = ImageFile().from_file("photo.jpg")
# Send to API
files = {"file": image.to_httpx_send_able_tuple()}
response = httpx.post("https://api.example.com/upload", files=files)
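For context, httpx accepts multipart uploads as `(filename, file object or bytes, content type)` tuples. The sketch below builds such a tuple by hand with the standard library to show the shape of the data. The helper `to_upload_tuple` is hypothetical; what `to_httpx_send_able_tuple()` returns exactly is not documented here, so treat this as an assumption about the general pattern.

```python
import io
import mimetypes


def to_upload_tuple(file_name, data):
    """Build a (filename, file object, content type) tuple for a multipart upload.

    Hypothetical helper for illustration only.
    """
    content_type, _ = mimetypes.guess_type(file_name)
    # Fall back to a generic binary type when the extension is unknown
    return (file_name, io.BytesIO(data), content_type or "application/octet-stream")


name, buffer, content_type = to_upload_tuple("photo.jpg", b"\xff\xd8\xff")
# The tuple could then be passed as files={"file": (name, buffer, content_type)}
```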
📋 Advanced Features
Container Classes
MediaList - Type-safe batch processing:
from media_toolkit import MediaList, ImageFile
images = MediaList[ImageFile]()
images.extend(["img1.jpg", "img2.png", "https://example.com/img3.jpg"])
# Lazy loading - files loaded on access
for img in images:
    img.save(f"processed_{img.file_name}")
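The lazy loading mentioned in the MediaList example can be pictured with a small container that stores sources and only calls a loader when an item is first accessed. This is a simplified sketch under that assumption; `LazyList` and its `loader` argument are hypothetical and do not reflect MediaList's real internals.

```python
class LazyList:
    """Hold raw sources and load each one only when it is first accessed.

    Hypothetical illustration of lazy loading; not MediaToolkit's implementation.
    """

    def __init__(self, sources, loader):
        self._sources = list(sources)
        self._loader = loader
        self._cache = {}  # index -> loaded item

    def __len__(self):
        return len(self._sources)

    def __getitem__(self, index):
        if index not in self._cache:
            # The loader runs once per item, on first access
            self._cache[index] = self._loader(self._sources[index])
        return self._cache[index]

    def __iter__(self):
        for index in range(len(self._sources)):
            yield self[index]
```

Constructing the list is cheap because no loader call happens until an item is read, which matters when the sources are URLs or large files.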
MediaDict - Key-value media storage:
from media_toolkit import MediaDict, ImageFile
media_db = MediaDict()
media_db["profile"] = "profile.jpg"
media_db["banner"] = "https://example.com/banner.png"
# Export to JSON
json_data = media_db.to_json()
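JSON cannot hold raw bytes, so exporting media to JSON generally means encoding the binary content as text. The sketch below shows one common approach, base64 encoding each value, using only the standard library. The function names are hypothetical, and the actual structure produced by `MediaDict.to_json()` is not specified here and may differ.

```python
import base64
import json


def media_dict_to_json(media):
    """Serialize a {key: bytes} mapping to JSON with base64-encoded values.

    Hypothetical helper for illustration only.
    """
    payload = {key: base64.b64encode(value).decode("ascii") for key, value in media.items()}
    return json.dumps(payload, sort_keys=True)


def media_dict_from_json(text):
    """Reverse media_dict_to_json: decode base64 values back to bytes."""
    return {key: base64.b64decode(value) for key, value in json.loads(text).items()}
```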
Streaming for Large Files
from media_toolkit import AudioFile, VideoFile

# Memory-efficient processing
# process_chunk and process_frame are placeholders for your own functions
audio = AudioFile().from_file("large_audio.wav")
for chunk in audio.to_stream():
    process_chunk(chunk)  # Process in chunks
video = VideoFile().from_file("large_video.mp4")
stream = video.to_stream()
for frame in stream:
    process_frame(frame)  # Frame-by-frame processing
# Iterate over the audio frames of the video stream
for av_frame in stream.audio_frames():
    pass
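The idea behind streaming is that only one chunk is held in memory at a time. A generic version of that pattern, independent of MediaToolkit, can be sketched with a generator over any file-like object. The name `iter_chunks` and the default chunk size are assumptions made for this example.

```python
import io


def iter_chunks(stream, chunk_size=4096):
    """Yield successive chunks from a binary file-like object until it is exhausted.

    Hypothetical helper for illustration only.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # An empty read means end of stream
            break
        yield chunk


for chunk in iter_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4):
    print(len(chunk))  # prints 4, 4, 2
```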
🚀 Performance
MediaToolkit leverages industry-standard libraries for maximum performance:
- FFmpeg (PyAV): Professional-grade audio/video codec support
- OpenCV: Optimized computer vision operations
- Streaming: Memory-efficient processing of large files
- Hardware acceleration: GPU support where available
Benchmarks:
- Audio conversion: ~100x faster than librosa/pydub
- Image processing: Near-native OpenCV speed
- Video processing: Hardware-accelerated encoding/decoding. Video decoding exceeds 500 FPS on consumer-grade hardware.
🔧 Key Features
✅ Universal input: Files, URLs, bytes, base64, numpy arrays, BytesIO, Starlette upload files, soundfile objects
✅ Automatic format detection: Smart content-type inference
✅ Seamless conversion: Change formats on save
✅ Type-safe: Full typing support with generics
✅ Web-ready: Native FastTaskAPI integration, plus helpers for httpx and FastAPI
✅ Production-tested: Used in production AI/ML pipelines
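Automatic format detection of the kind listed above is commonly done by inspecting the first bytes of a file (its "magic number") instead of trusting the file extension. The following is a small sketch of that technique for a handful of well-known signatures. The table and the function `sniff_content_type` are illustrative assumptions and do not describe MediaToolkit's actual detection code.

```python
# Well-known leading byte signatures and the content type they indicate
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"fLaC", "audio/flac"),
    (b"OggS", "application/ogg"),  # Ogg container; may hold audio or video
    (b"ID3", "audio/mpeg"),  # MP3 with an ID3v2 tag; untagged MP3s start differently
]


def sniff_content_type(data):
    """Guess a content type from leading bytes, or return None if unknown.

    Hypothetical helper for illustration only.
    """
    for signature, content_type in SIGNATURES:
        if data.startswith(signature):
            return content_type
    # RIFF containers name their format in bytes 8 to 11
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "audio/wav"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None
```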
📋 Format Support Overview
| Category | Formats | Integration | Class | Description |
|---|---|---|---|---|
| Images | jpg, jpeg, png, gif, bmp, tiff, tif, jfif, ico, webp, avif, heic, heif, svg | Deep | ImageFile | OpenCV-powered processing, format conversion, channel detection and more. |
| Audio | wav, mp3, ogg, flac, aac, m4a, wma, opus, aiff | Deep | AudioFile | FFmpeg/PyAV-powered, format conversions, sample rate conversion, streaming, metadata extraction. |
| Video | mp4, avi, mov, mkv, webm, flv, wmv, 3gp, ogv, m4v | Deep | VideoFile | Hardware-accelerated encoding/decoding, frame extraction, audio extraction. |
| 3D Models | obj, glb, gltf, dae, fbx, 3ds, ply, stl, step, iges, x3d, blend | Shallow | MediaFile | Basic file handling, no specialized 3D processing yet. |
| Documents | pdf, txt, html, htm, json, js, css, xml, csv | Shallow | MediaFile | Text and document formats. Basic file operations. |
| Archives | zip, 7z, tar, gz | Shallow | MediaFile | Archive and compressed file formats. Basic file operations. |
| Data | npy, npz, pkl, pickle | Shallow | MediaFile | Python data serialization formats. Basic file operations. |
Deep Integration: Specialized classes with advanced processing capabilities, format conversion, and media-specific operations.
Shallow Integration: Basic MediaFile class with universal file operations, automatic format detection, and standard conversions.
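The deep/shallow split in the table can be read as a lookup from file extension to class name, with MediaFile as the fallback. The sketch below encodes the deep-integration rows of the table that way. The mapping structure and the function `class_name_for` are hypothetical; the library itself also uses content detection, which this extension-only example does not cover.

```python
from pathlib import Path

# Extensions taken from the deep-integration rows of the table above
DEEP_CLASSES = {
    "ImageFile": {"jpg", "jpeg", "png", "gif", "bmp", "tiff", "tif", "jfif",
                  "ico", "webp", "avif", "heic", "heif", "svg"},
    "AudioFile": {"wav", "mp3", "ogg", "flac", "aac", "m4a", "wma", "opus", "aiff"},
    "VideoFile": {"mp4", "avi", "mov", "mkv", "webm", "flv", "wmv", "3gp", "ogv", "m4v"},
}


def class_name_for(file_name):
    """Return the class name the table assigns to a file name's extension.

    Hypothetical helper for illustration only.
    """
    extension = Path(file_name).suffix.lower().lstrip(".")
    for class_name, extensions in DEEP_CLASSES.items():
        if extension in extensions:
            return class_name
    # Everything else gets shallow integration through the generic class
    return "MediaFile"
```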
🤝 Contributing
We welcome contributions! Key areas:
- Performance optimizations
- New format support
- Documentation & examples
- Test coverage
- Platform-specific enhancements
📄 License
MIT License - see LICENSE for details.
Join the intelligence revolution. Join socaity.ai