Web-ready standardized file processing and serialization. Read, load and convert to standard file types with a common interface.

MediaToolkit

Ultra-Fast Python Media Processing • FFmpeg • OpenCV • PyAV

⚡ Lightning-fast • 🛠️ Simple API • 🔄 Any Format • 🌐 Web-ready • 🖥️ Cross-platform


MediaToolkit is a high-performance Python library for processing images, audio, and video with a unified, developer-friendly API. Built on FFmpeg (PyAV) and OpenCV for production-grade speed and reliability.

Perfect for: AI/ML pipelines, web services, batch processing, media automation, computer vision, and audio analysis.

📦 Installation

pip install media-toolkit

Note: Audio/video processing requires FFmpeg. The PyAV wheels usually ship with the FFmpeg libraries, so no extra step is needed; if they are missing on your platform, install FFmpeg manually from ffmpeg.org.
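
To check whether the pieces mentioned in the note are present on a machine, a small probe like the one below can help. It is a sketch that uses only the standard library; it assumes PyAV is importable under the module name `av` and that a manually installed FFmpeg exposes an `ffmpeg` executable on `PATH`. The helper name `ffmpeg_status` is made up for this example.

```python
import importlib.util
import shutil


def ffmpeg_status() -> dict:
    """Report whether PyAV and a system-wide ffmpeg binary can be found."""
    return {
        # PyAV is imported as `av`; find_spec returns None if it is not installed.
        "pyav_installed": importlib.util.find_spec("av") is not None,
        # shutil.which returns the executable path, or None if it is not on PATH.
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
    }


print(ffmpeg_status())
```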

⚡ Quick Start

One API for all media types - load from files, URLs, bytes, base64, or numpy arrays:

from media_toolkit import ImageFile, AudioFile, VideoFile, media_from_any

# Load any file; smart content detection picks the matching media class
audio = media_from_any("media/my_favorite_song.mp3")  # -> AudioFile

# Load from any source
image = ImageFile().from_any("https://example.com/image.jpg")
audio = AudioFile().from_file("audio.wav")
video = VideoFile().from_file("video.mp4")
image_from_b64 = ImageFile().from_base64("data:image/png;base64,...")

# Convert to any format
image_array = image.to_np_array()      # → numpy array (H, W, C)
audio_array = audio.to_np_array()      # → numpy array (samples, channels)
image_base64 = image.to_base64()       # → base64 string
video_bytes = video.to_bytes_io()      # → BytesIO object
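
The base64 input above is written as a data URI (`data:image/png;base64,...`). As an illustration of what such a string contains, here is a minimal sketch that splits a data URI into its MIME type and raw bytes using only the standard library. It is not MediaToolkit's implementation; `split_data_uri` is a hypothetical helper.

```python
import base64
from typing import Optional, Tuple


def split_data_uri(value: str) -> Tuple[Optional[str], bytes]:
    """Return (mime_type, raw_bytes) for a data URI or a plain base64 string."""
    mime_type = None
    if value.startswith("data:"):
        # The header looks like "data:image/png;base64"; the payload follows the comma.
        header, _, value = value.partition(",")
        mime_type = header[len("data:"):].split(";")[0] or None
    return mime_type, base64.b64decode(value)


# Encode the 8-byte PNG signature as a stand-in for real image data.
payload = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
mime, raw = split_data_uri(f"data:image/png;base64,{payload}")
print(mime, len(raw))
```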

Batch Processing

from media_toolkit import MediaList, AudioFile

# Process multiple files efficiently
audio_files = MediaList([
    "song1.wav",
    "https://example.com/song2.mp3",
    b"raw_audio_bytes..."
])

for audio in audio_files:
    audio.save(f"converted_{audio.file_name}.mp3")  # Auto-convert on save
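
The README does not state whether `file_name` includes the original extension. If it does, the pattern above would yield names such as `converted_song1.wav.mp3`. A possible way to derive clean output names from paths or URLs, sketched with the standard library only (`converted_name` is a hypothetical helper, not part of MediaToolkit):

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def converted_name(source: str, new_suffix: str = ".mp3", prefix: str = "converted_") -> str:
    """Build an output file name from a path or URL, replacing the extension."""
    # For URLs, keep only the path component so query strings are ignored.
    path = urlparse(source).path if "://" in source else source
    stem = PurePosixPath(path).stem  # "song1.wav" -> "song1"
    return f"{prefix}{stem}{new_suffix}"


for src in ["song1.wav", "https://example.com/song2.mp3"]:
    print(converted_name(src))
```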

🖼️ Image Processing

OpenCV-powered image operations:

from media_toolkit import ImageFile
import cv2

# Load and process
img = ImageFile().from_any("image.png")
image_array = img.to_np_array()  # → (H, W, C) uint8 array

# Apply transformations (flip code 0 flips vertically, around the x-axis)
flipped = cv2.flip(image_array, 0)

# Save processed image
ImageFile().from_np_array(flipped).save("flipped.jpg")
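
To make the `(H, W, C)` layout and the flip direction concrete, the following sketch reproduces both flips with plain NumPy slicing on a tiny synthetic array. It assumes only that NumPy is installed; no image file or OpenCV is needed.

```python
import numpy as np

# A tiny 2x3 "image" with 3 channels, shape (H, W, C), dtype uint8.
image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# Reversing the first axis (rows) flips vertically, like cv2.flip(image, 0).
flipped_vertical = image[::-1, :, :]

# Reversing the second axis (columns) flips horizontally, like cv2.flip(image, 1).
flipped_horizontal = image[:, ::-1, :]

print(image.shape, flipped_vertical.shape, flipped_horizontal.shape)
```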

🎵 Audio Processing

FFmpeg/PyAV-powered audio operations:

from media_toolkit import AudioFile

# Load audio
audio = AudioFile().from_file("input.wav")

# Get numpy array for ML/analysis
audio_array = audio.to_np_array()  # → (samples, channels) float32 in [-1, 1] range

# Inspect metadata
print(f"Sample rate: {audio.sample_rate} Hz; Channels: {audio.channels}; Duration: {audio.duration}")

# Format conversion (automatic re-encoding)
audio.save("output.mp3")   # MP3
audio.save("output.flac")  # FLAC (lossless)
audio.save("output.m4a")   # M4A (AAC codec)

# Create audio from numpy
new_audio = AudioFile().from_np_array(
    audio_array,
    sample_rate=audio.sample_rate,
    audio_format="wav"
)

Supported formats: WAV, MP3, FLAC, AAC, M4A, OGG, Opus, WMA, AIFF
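
For readers who want to see where sample rate, channel count, and duration come from, here is a standalone sketch using Python's built-in `wave` module instead of MediaToolkit. It writes a short sine tone to an in-memory WAV file and reads the header back; the constants are arbitrary example values.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16_000  # Hz
DURATION_S = 0.5
N_FRAMES = int(SAMPLE_RATE * DURATION_S)

# Write a mono 440 Hz sine wave as 16-bit PCM into an in-memory WAV file.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_out:
    wav_out.setnchannels(1)
    wav_out.setsampwidth(2)  # 2 bytes = 16-bit samples
    wav_out.setframerate(SAMPLE_RATE)
    samples = (
        int(32767 * 0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
        for n in range(N_FRAMES)
    )
    wav_out.writeframes(b"".join(struct.pack("<h", s) for s in samples))

# Read the header back and derive the duration from frame count and sample rate.
buffer.seek(0)
with wave.open(buffer, "rb") as wav_in:
    sample_rate = wav_in.getframerate()
    channels = wav_in.getnchannels()
    duration = wav_in.getnframes() / sample_rate

print(f"Sample rate: {sample_rate} Hz; Channels: {channels}; Duration: {duration:.2f} s")
```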

🎬 Video Processing

High-performance video operations:

from media_toolkit import VideoFile
import cv2

video = VideoFile().from_file("input.mp4")

# Extract audio track
audio = video.extract_audio("audio.mp3")

# Process frames
for i, frame in enumerate(video.to_stream()):
    if i >= 300:  # First 300 frames
        break
    # frame is a numpy array (H, W, C); my_processing_function is a placeholder for your own code
    processed = my_processing_function(frame)
    cv2.imwrite(f"frame_{i:04d}.png", processed)

# Create video from images
images = [f"frame_{i:04d}.png" for i in range(300)]
modified_video = VideoFile().from_files(images, frame_rate=30, audio_file="audio.mp3")
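
The numbers in the example fit together as follows: 300 frames at 30 frames per second give a 10 second clip, and the zero-padded names keep the frames in order when sorted as strings. A small worked sketch in plain Python:

```python
FRAME_RATE = 30    # frames per second, as in the example above
FRAME_COUNT = 300  # number of frames written to disk

# {i:04d} pads the index with zeros to four digits, so names sort correctly.
frame_names = [f"frame_{i:04d}.png" for i in range(FRAME_COUNT)]

# Clip length in seconds.
duration_s = FRAME_COUNT / FRAME_RATE

# Timestamp (in seconds) at which each frame is shown.
timestamps = [i / FRAME_RATE for i in range(FRAME_COUNT)]

print(frame_names[0], frame_names[-1], duration_s)
```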

🌐 Web & API Integration

Native FastTaskAPI Support

Built-in integration with FastTaskAPI for simplified file handling:

from fast_task_api import FastTaskAPI, ImageFile, VideoFile

app = FastTaskAPI()

@app.task_endpoint("/process")
def process_media(image: ImageFile, video: VideoFile) -> VideoFile:
    # Inputs are converted and validated automatically
    # (my_ai_inference stands in for your own model call)
    modified_video = my_ai_inference(image, video)
    # Any media object can be returned directly
    return modified_video

FastAPI Integration

from fastapi import FastAPI, UploadFile, File
from media_toolkit import ImageFile

app = FastAPI()

@app.post("/process-image")
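async def process_image(file: UploadFile = File(...)):
    # Wrap the uploaded file in an ImageFile
    image = ImageFile().from_any(file)
    # Return something JSON-serializable, e.g. the decoded image shape
    return {"shape": list(image.to_np_array().shape)}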

HTTP Client Usage

import httpx
from media_toolkit import ImageFile

image = ImageFile().from_file("photo.jpg")

# Send to API
files = {"file": image.to_httpx_send_able_tuple()}
response = httpx.post("https://api.example.com/upload", files=files)
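
The exact return value of `to_httpx_send_able_tuple()` is not documented in this README. httpx accepts multipart entries of the form `(filename, file object, content type)`, so a hand-built equivalent could look like the sketch below. `as_upload_tuple` is a hypothetical helper written with the standard library, not MediaToolkit code.

```python
import io
import mimetypes
from typing import BinaryIO, Tuple


def as_upload_tuple(file_name: str, content: bytes) -> Tuple[str, BinaryIO, str]:
    """Build a (filename, file object, content type) tuple for a multipart upload."""
    # Guess the MIME type from the extension; fall back to a generic binary type.
    content_type, _ = mimetypes.guess_type(file_name)
    return file_name, io.BytesIO(content), content_type or "application/octet-stream"


name, file_obj, content_type = as_upload_tuple("photo.jpg", b"abc")
print(name, content_type)
```

A tuple like this can be passed as `files={"file": as_upload_tuple(...)}` to `httpx.post`.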

📋 Advanced Features

Container Classes

MediaList - Type-safe batch processing:

from media_toolkit import MediaList, ImageFile

images = MediaList[ImageFile]()
images.extend(["img1.jpg", "img2.png", "https://example.com/img3.jpg"])

# Lazy loading - files loaded on access
for img in images:
    img.save(f"processed_{img.file_name}")
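
How lazy loading can work in a typed container is sketched below. This is an illustration of the idea, not MediaList's actual implementation: `LazyList` and `fake_loader` are invented for the example, and the "loading" step is simulated with a string operation.

```python
from typing import Callable, Dict, Generic, Iterator, List, TypeVar

T = TypeVar("T")


class LazyList(Generic[T]):
    """Stores sources and only calls `loader` when an item is first accessed."""

    def __init__(self, loader: Callable[[str], T]) -> None:
        self._loader = loader
        self._sources: List[str] = []
        self._cache: Dict[int, T] = {}

    def extend(self, sources: List[str]) -> None:
        self._sources.extend(sources)  # nothing is loaded here

    def __len__(self) -> int:
        return len(self._sources)

    def __getitem__(self, index: int) -> T:
        # Load on first access, then serve the cached object.
        if index not in self._cache:
            self._cache[index] = self._loader(self._sources[index])
        return self._cache[index]

    def __iter__(self) -> Iterator[T]:
        return (self[i] for i in range(len(self)))


loaded: List[str] = []  # records which sources were actually loaded


def fake_loader(source: str) -> str:
    loaded.append(source)
    return source.upper()


images = LazyList[str](fake_loader)
images.extend(["img1.jpg", "img2.png"])
print(len(images), loaded)
```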

MediaDict - Key-value media storage:

from media_toolkit import MediaDict, ImageFile

media_db = MediaDict()
media_db["profile"] = "profile.jpg"
media_db["banner"] = "https://example.com/banner.png"

# Export to JSON
json_data = media_db.to_json()
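
The JSON layout produced by `to_json()` is not described here. One common way to put binary media into JSON is to base64-encode each value; the sketch below shows that round trip with the standard library. The helper names `media_to_json` and `media_from_json` are hypothetical.

```python
import base64
import json
from typing import Dict


def media_to_json(items: Dict[str, bytes]) -> str:
    """Serialize a mapping of name -> raw bytes as JSON with base64 values."""
    encoded = {key: base64.b64encode(data).decode("ascii") for key, data in items.items()}
    return json.dumps(encoded, sort_keys=True)


def media_from_json(text: str) -> Dict[str, bytes]:
    """Inverse of media_to_json: decode the base64 values back to bytes."""
    return {key: base64.b64decode(value) for key, value in json.loads(text).items()}


original = {"profile": b"\xff\xd8\xff", "banner": b"\x89PNG"}
json_text = media_to_json(original)
restored = media_from_json(json_text)
print(json_text)
```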

Streaming for Large Files

from media_toolkit import AudioFile, VideoFile

# Memory-efficient processing; process_chunk and process_frame are placeholders for your own functions
audio = AudioFile().from_file("large_audio.wav")
for chunk in audio.to_stream():
    process_chunk(chunk)  # Process in chunks

video = VideoFile().from_file("large_video.mp4")
stream = video.to_stream()
for frame in stream:
    process_frame(frame)  # Frame-by-frame processing

# Iterate over the audio frames of the video stream
for av_frame in stream.audio_frames():
    pass
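
The principle behind streaming is that only one chunk is held in memory at a time. A generic sketch of that pattern over any binary file object, using only the standard library (`iter_chunks` is a hypothetical helper and unrelated to MediaToolkit's `to_stream()` internals):

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes until the stream ends."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        yield chunk


# 10,000 bytes of dummy data stand in for a large file.
data = io.BytesIO(b"x" * 10_000)
sizes = [len(chunk) for chunk in iter_chunks(data, chunk_size=4096)]
print(sizes)
```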

🚀 Performance

MediaToolkit leverages industry-standard libraries for maximum performance:

  • FFmpeg (PyAV): Professional-grade audio/video codec support
  • OpenCV: Optimized computer vision operations
  • Streaming: Memory-efficient processing of large files
  • Hardware acceleration: GPU support where available

Benchmarks:

  • Audio conversion: ~100x faster than librosa/pydub
  • Image processing: Near-native OpenCV speed
  • Video processing: Hardware-accelerated encoding/decoding, with decoding above 500 FPS on consumer-grade hardware.
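
To check throughput figures like these on your own hardware, a simple timing harness is enough. The sketch below measures items processed per second for any iterable; `measure_fps` is a hypothetical helper, and the dummy workload only demonstrates the call. With MediaToolkit, the iterable would be something like `video.to_stream()`.

```python
import time
from typing import Callable, Iterable


def measure_fps(frames: Iterable, process: Callable[[object], object]) -> float:
    """Return processed items per second for one pass over `frames`."""
    count = 0
    start = time.perf_counter()
    for frame in frames:
        process(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


# Dummy workload: 10,000 integers instead of decoded video frames.
fps = measure_fps(range(10_000), lambda frame: frame * 2)
print(f"{fps:.0f} items/s")
```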

🔧 Key Features

  • Universal input: Files, URLs, bytes, base64, numpy arrays, BytesIO, Starlette upload files, soundfile
  • Automatic format detection: Smart content-type inference
  • Seamless conversion: Change formats on save
  • Type-safe: Full typing support with generics
  • Web-ready: Native FastTaskAPI integration, extra features for httpx and FastAPI
  • Production-tested: Used in production AI/ML pipelines
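
Content-type inference is often done by inspecting the first bytes of a file ("magic numbers"). The sketch below shows the idea for a handful of formats. It is an illustration under that assumption, not MediaToolkit's detection logic, and `sniff_media_type` is a hypothetical helper.

```python
from typing import Optional


def sniff_media_type(data: bytes) -> Optional[str]:
    """Guess a MIME type from the first bytes of a file, or None if unknown."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "audio/wav"
    if data.startswith(b"ID3"):
        return "audio/mpeg"  # MP3 with an ID3v2 tag
    if data[4:8] == b"ftyp":
        return "video/mp4"  # ISO base media file (MP4 family)
    return None


print(sniff_media_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
```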

📋 Format Support Overview

| Category | Formats | Integration | Class | Description |
|---|---|---|---|---|
| Images | jpg, jpeg, png, gif, bmp, tiff, tif, jfif, ico, webp, avif, heic, heif, svg | Deep | ImageFile | OpenCV-powered processing, format conversion, channel detection and more. |
| Audio | wav, mp3, ogg, flac, aac, m4a, wma, opus, aiff | Deep | AudioFile | FFmpeg/PyAV-powered format conversion, sample rate conversion, streaming, metadata extraction. |
| Video | mp4, avi, mov, mkv, webm, flv, wmv, 3gp, ogv, m4v | Deep | VideoFile | Hardware-accelerated encoding/decoding, frame extraction, audio extraction. |
| 3D Models | obj, glb, gltf, dae, fbx, 3ds, ply, stl, step, iges, x3d, blend | Shallow | MediaFile | Basic file handling, no specialized 3D processing yet. |
| Documents | pdf, txt, html, htm, json, js, css, xml, csv | Shallow | MediaFile | Text and document formats. Basic file operations. |
| Archives | zip, 7z, tar, gz | Shallow | MediaFile | Archive and compressed file formats. Basic file operations. |
| Data | npy, npz, pkl, pickle | Shallow | MediaFile | Python data serialization formats. Basic file operations. |

Deep Integration: Specialized classes with advanced processing capabilities, format conversion, and media-specific operations.

Shallow Integration: Basic MediaFile class with universal file operations, automatic format detection, and standard conversions.
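
The split between deep and shallow integration can be pictured as a lookup from file extension to class name, with `MediaFile` as the fallback. The sketch below uses the extensions from the overview above. It is extension-based only and therefore a simplification: the library itself is described as detecting types from content. `class_name_for` is a hypothetical helper.

```python
from pathlib import PurePosixPath

# Extensions with deep integration, taken from the format overview.
DEEP_CLASSES = {
    "ImageFile": {"jpg", "jpeg", "png", "gif", "bmp", "tiff", "tif", "jfif",
                  "ico", "webp", "avif", "heic", "heif", "svg"},
    "AudioFile": {"wav", "mp3", "ogg", "flac", "aac", "m4a", "wma", "opus", "aiff"},
    "VideoFile": {"mp4", "avi", "mov", "mkv", "webm", "flv", "wmv", "3gp", "ogv", "m4v"},
}


def class_name_for(path: str) -> str:
    """Return the class name for a file, falling back to the generic MediaFile."""
    extension = PurePosixPath(path).suffix.lower().lstrip(".")
    for class_name, extensions in DEEP_CLASSES.items():
        if extension in extensions:
            return class_name
    return "MediaFile"  # shallow integration: 3D models, documents, archives, data


for example in ["photo.JPG", "song.flac", "clip.webm", "model.glb"]:
    print(example, "->", class_name_for(example))
```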

🤝 Contributing

We welcome contributions! Key areas:

  • Performance optimizations
  • New format support
  • Documentation & examples
  • Test coverage
  • Platform-specific enhancements

📄 License

MIT License - see LICENSE for details.


Join the intelligence revolution. Join socaity.ai


Download files

Source Distribution: media_toolkit-0.2.22.tar.gz (57.1 kB)

Built Distribution: media_toolkit-0.2.22-py3-none-any.whl (63.7 kB)
