media-toolkit

Web-ready standardized file processing and serialization. Read, load and convert to standard file types with a common interface.

One file API. Any media. Fast, typed, web-ready.

Load images, audio, and video from files, URLs, bytes, base64, or numpy arrays, and convert formats on save.
Built on FFmpeg (PyAV) and OpenCV for production-grade speed.

Install

pip install media-toolkit

Audio and video processing requires FFmpeg. The PyAV wheels on PyPI normally bundle the FFmpeg libraries, so a separate install is rarely needed. If it is, install FFmpeg manually from ffmpeg.org.

Quick start

One API for all media types. Load from files, URLs, bytes, base64, or numpy arrays:

from media_toolkit import ImageFile, AudioFile, VideoFile, media_from_any

# Load any file and convert it to the correct type, with smart content detection
audio = media_from_any("media/my_favorite_song.mp3")  # returns AudioFile

# Load from any source
image = ImageFile().from_any("https://example.com/image.jpg")
audio = AudioFile().from_file("audio.wav")
video = VideoFile().from_file("video.mp4")
img = ImageFile().from_base64("data:image/png;base64,...")

# Convert to any format
image_array = image.to_np_array()      # numpy array (H, W, C)
audio_array = audio.to_np_array()      # numpy array (samples, channels)
image_base64 = image.to_base64()       # base64 string
video_bytes = video.to_bytes_io()      # BytesIO object
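
The base64 input above is written as a data URI. As a rough illustration of what such a string contains, the sketch below splits one into its MIME type and raw bytes using only the standard library. `split_data_uri` is a hypothetical helper for this example; it is not part of media_toolkit and says nothing about how the library parses its input.

```python
import base64
from typing import Tuple


def split_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a base64 data URI into (mime_type, decoded_bytes).

    Hypothetical helper for illustration; not part of media_toolkit.
    """
    header, sep, payload = uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a 'data:<mime>;base64,<payload>' string")
    # An empty media type defaults to text/plain (RFC 2397).
    mime_type = header[len("data:"):-len(";base64")] or "text/plain"
    return mime_type, base64.b64decode(payload, validate=True)


# Round trip with a tiny payload: the 8-byte PNG signature.
png_signature = b"\x89PNG\r\n\x1a\n"
uri = "data:image/png;base64," + base64.b64encode(png_signature).decode("ascii")
mime_type, raw = split_data_uri(uri)
```

Here `mime_type` carries the declared type and `raw` holds the decoded bytes, which is the information a loader needs before it can pick a decoder.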

Batch processing

from media_toolkit import MediaList, AudioFile

# Process multiple files efficiently
audio_files = MediaList([
    "song1.wav",
    "https://example.com/song2.mp3",
    b"raw_audio_bytes..."
])

for audio in audio_files:
    audio.save(f"converted_{audio.file_name}.mp3")  # Auto-convert on save
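
The README does not say whether `file_name` includes the original extension. If it does, the f-string above would yield names such as `converted_song1.wav.mp3`. The following standard-library sketch shows one way to derive a clean output name from a path or URL; `converted_name` is a hypothetical helper, not a media_toolkit function.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def converted_name(source: str, new_ext: str, prefix: str = "converted_") -> str:
    """Build an output file name by swapping the extension of a path or URL."""
    # For URLs, only the path component carries the file name.
    path = urlparse(source).path if "://" in source else source
    stem = PurePosixPath(path).stem or "media"
    return f"{prefix}{stem}.{new_ext.lstrip('.')}"


names = [
    converted_name("song1.wav", "mp3"),
    converted_name("https://example.com/song2.mp3?download=1", "flac"),
]
```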

Image processing

OpenCV-powered image operations:

from media_toolkit import ImageFile
import cv2

# Load and process
img = ImageFile().from_any("image.png")
image_array = img.to_np_array()  # (H, W, C) uint8 array

# Apply transformations
flipped = cv2.flip(image_array, 0)

# Save processed image
ImageFile().from_np_array(flipped).save("flipped.jpg")
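
In `cv2.flip`, a flip code of 0 mirrors the image around the x-axis (top and bottom swap), a positive code mirrors around the y-axis, and a negative code does both. The pure-Python sketch below reproduces that convention on a small nested list so the effect is easy to see; it is an illustration only and is far slower than OpenCV.

```python
from typing import List, Sequence

Row = List[int]


def flip(image: Sequence[Row], flip_code: int) -> List[Row]:
    """Mirror a 2D grid using the same flip-code convention as cv2.flip."""
    if flip_code == 0:   # around the x-axis: top and bottom swap
        return [list(row) for row in reversed(image)]
    if flip_code > 0:    # around the y-axis: left and right swap
        return [list(reversed(row)) for row in image]
    return [list(reversed(row)) for row in reversed(image)]  # both axes


grid = [[1, 2], [3, 4]]
vertical = flip(grid, 0)
horizontal = flip(grid, 1)
both = flip(grid, -1)
```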

Audio processing

FFmpeg/PyAV-powered audio operations:

from media_toolkit import AudioFile

# Load audio
audio = AudioFile().from_file("input.wav")

# Get numpy array for ML and analysis
audio_array = audio.to_np_array()  # (samples, channels) float32 in [-1, 1] range

# Inspect metadata
print(f"Sample rate: {audio.sample_rate} Hz; Channels: {audio.channels}; Duration: {audio.duration}")

# Format conversion (automatic re-encoding)
audio.save("output.mp3")   # MP3
audio.save("output.flac")  # FLAC (lossless)
audio.save("output.m4a")   # AAC

# Create audio from numpy
new_audio = AudioFile().from_np_array(
    audio_array,
    sample_rate=audio.sample_rate,
    audio_format="wav"
)

Supported formats: WAV, MP3, FLAC, AAC, M4A, OGG, Opus, WMA, AIFF.
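
To make the float32 `[-1, 1]` sample convention concrete, here is a minimal standard-library sketch that scales float samples to 16-bit PCM and writes them into an in-memory WAV file. The helper names are made up for this example, and the code does not reflect how media_toolkit or PyAV encode audio internally.

```python
import io
import math
import struct
import wave
from typing import List


def float_to_pcm16(samples: List[float]) -> bytes:
    """Convert floats in [-1, 1] to little-endian signed 16-bit PCM."""
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def write_wav(samples: List[float], sample_rate: int, channels: int = 1) -> bytes:
    """Encode interleaved float samples as an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(float_to_pcm16(samples))
    return buffer.getvalue()


# One second of a 440 Hz sine tone, mono.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
wav_bytes = write_wav(tone, sample_rate)

# Read the header back to confirm the duration.
with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
    duration = wav.getnframes() / wav.getframerate()
```

Values outside `[-1, 1]` are clipped rather than wrapped, which avoids the loud artifacts that integer overflow would cause.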

Video processing

High-performance video operations:

from media_toolkit import VideoFile
import cv2

video = VideoFile().from_file("input.mp4")

# Extract audio track
audio = video.extract_audio("audio.mp3")

# Process frames
for i, frame in enumerate(video.to_stream()):
    if i >= 300:  # First 300 frames
        break
    # frame is a numpy array (H, W, C); my_processing_function is a placeholder for your own transform
    processed = my_processing_function(frame)
    cv2.imwrite(f"frame_{i:04d}.png", processed)

# Create video from images
images = [f"frame_{i:04d}.png" for i in range(300)]
modified_video = VideoFile().from_files(images, frame_rate=30, audio_file="audio.mp3")

Web and API integration

Native APIPod support

Built-in integration with APIPod for simplified file handling:

from apipod import APIPod, ImageFile, VideoFile

app = APIPod()

@app.endpoint("/process")
def process_media(image: ImageFile, video: VideoFile) -> VideoFile:
    # Automatic type conversion and validation; my_ai_inference is a placeholder for your own model call
    modified_video = my_ai_inference(image, video)
    # Any media can be returned automatically
    return modified_video

FastAPI integration

from fastapi import FastAPI, UploadFile, File
from media_toolkit import ImageFile

app = FastAPI()

@app.post("/process-image")
async def process_image(file: UploadFile = File(...)):
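    image = ImageFile().from_any(file)
    # Return something JSON-serializable; base64 is one option, adapt to your API
    return {"image_base64": image.to_base64()}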

HTTP client usage

import httpx
from media_toolkit import ImageFile

image = ImageFile().from_file("photo.jpg")

# Send to API
files = {"file": image.to_httpx_send_able_tuple()}
response = httpx.post("https://api.example.com/upload", files=files)
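
The README does not show what `to_httpx_send_able_tuple()` returns. httpx itself accepts multipart entries in the form `(filename, file object, content type)`, so the sketch below builds such a tuple with the standard library, on the assumption that the library's helper produces something of the same shape. `as_upload_tuple` is a hypothetical name.

```python
import io
import mimetypes
from typing import BinaryIO, Tuple


def as_upload_tuple(file_name: str, data: bytes) -> Tuple[str, BinaryIO, str]:
    """Build a (filename, file object, content type) tuple for a multipart upload."""
    content_type, _ = mimetypes.guess_type(file_name)
    return file_name, io.BytesIO(data), content_type or "application/octet-stream"


# Placeholder bytes; only the JPEG start-of-image marker is real.
upload = as_upload_tuple("photo.jpg", b"\xff\xd8\xff\xe0 not a real image")
files = {"file": upload}  # same shape as the files= argument used above
```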

Advanced features

Container classes

MediaList for type-safe batch processing:

from media_toolkit import MediaList, ImageFile

images = MediaList[ImageFile]()
images.extend(["img1.jpg", "img2.png", "https://example.com/img3.jpg"])

# Lazy loading, files are loaded on access
for img in images:
    img.save(f"processed_{img.file_name}")
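
The lazy loading mentioned in the comment above can be pictured with a small generic container that stores sources and calls a loader only on first access. This is a sketch of the general pattern under that assumption; `LazyList` and `fake_load` are invented for the example and are not the MediaList implementation.

```python
from typing import Callable, Dict, Generic, Iterator, List, TypeVar

T = TypeVar("T")


class LazyList(Generic[T]):
    """Holds sources and only loads an item the first time it is accessed."""

    def __init__(self, loader: Callable[[str], T]) -> None:
        self._loader = loader
        self._sources: List[str] = []
        self._cache: Dict[int, T] = {}

    def extend(self, sources: List[str]) -> None:
        self._sources.extend(sources)  # nothing is loaded here

    def __len__(self) -> int:
        return len(self._sources)

    def __getitem__(self, index: int) -> T:
        if index not in self._cache:
            self._cache[index] = self._loader(self._sources[index])
        return self._cache[index]

    def __iter__(self) -> Iterator[T]:
        return (self[i] for i in range(len(self)))


loaded: List[str] = []


def fake_load(source: str) -> str:
    """Stand-in loader that records which sources were actually read."""
    loaded.append(source)
    return source.upper()


items: LazyList[str] = LazyList(fake_load)
items.extend(["img1.jpg", "img2.png"])
loaded_before_access = list(loaded)  # snapshot taken before any item is read
first = items[0]                     # triggers a single load
```

Caching by index means a second pass over the list does not reload anything, at the cost of keeping loaded items in memory.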

MediaDict for key-value media storage:

from media_toolkit import MediaDict, ImageFile

media_db = MediaDict()
media_db["profile"] = "profile.jpg"
media_db["banner"] = "https://example.com/banner.png"

# Export to JSON
json_data = media_db.to_json()

Streaming for large files

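from media_toolkit import AudioFile, VideoFile

# Memory-efficient processing; process_chunk and process_frame are placeholders for your own functions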
audio = AudioFile().from_file("large_audio.wav")
for chunk in audio.to_stream():
    process_chunk(chunk)  # Process in chunks

video = VideoFile().from_file("large_video.mp4")
stream = video.to_stream()
for frame in stream:
    process_frame(frame)  # Frame-by-frame processing

# Iterate over the audio frames of the video stream
for av_frame in stream.audio_frames():
    pass
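
The memory benefit of streaming comes from holding only one chunk at a time. A minimal standard-library sketch of that idea, using a generator over any binary file object, might look like this; the chunk size and the `iter_chunks` name are arbitrary choices for the example and are unrelated to what `to_stream()` yields.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield successive chunks so only one chunk is held in memory at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# An in-memory stand-in for a large file.
source = io.BytesIO(b"x" * 150_000)
sizes = [len(chunk) for chunk in iter_chunks(source)]
```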

Performance

MediaToolkit leverages industry-standard libraries for maximum performance:

FFmpeg (PyAV): professional-grade audio and video codec support
OpenCV: optimized computer vision operations
Streaming: memory-efficient processing of large files
Hardware acceleration: GPU support where available

Benchmarks:

Audio conversion: roughly 100x faster than librosa and pydub
Image processing: near-native OpenCV speed
Video processing: hardware-accelerated encoding and decoding, over 500 FPS for video decoding on consumer-grade hardware

Key features

Universal input: files, URLs, bytes, base64, numpy arrays, BytesIO, Starlette upload files, soundfile
Automatic format detection: smart content-type inference
Seamless conversion: change formats on save
Type-safe: full typing support with generics
Web-ready: native APIPod integration, plus extras for httpx and FastAPI
Production-tested: used in production AI and ML pipelines
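
Content-type inference of the kind listed above is commonly done by inspecting the first bytes of a file. The sketch below checks a few well-known signatures; it is a simplified illustration under that assumption, and the README does not say which detection method media_toolkit actually uses.

```python
def sniff_media_type(data: bytes) -> str:
    """Guess a coarse media category from the first bytes of a file."""
    if data.startswith(b"\x89PNG\r\n\x1a\n") or data.startswith(b"\xff\xd8\xff"):
        return "image"  # PNG or JPEG
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "image"
    if data[:4] == b"RIFF":
        # RIFF is a container; bytes 8..12 name the payload type.
        kinds = {b"WAVE": "audio", b"WEBP": "image", b"AVI ": "video"}
        return kinds.get(data[8:12], "unknown")
    if data[:4] in (b"fLaC", b"OggS") or data[:3] == b"ID3":
        return "audio"  # simplification: Ogg can also carry video
    if data[4:8] == b"ftyp":
        return "video"  # simplification: M4A, HEIC and AVIF share this box
    return "unknown"


samples = {
    "png": sniff_media_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8),
    "wav": sniff_media_type(b"RIFF\x00\x00\x00\x00WAVEfmt "),
    "mp4": sniff_media_type(b"\x00\x00\x00\x18ftypmp42"),
    "txt": sniff_media_type(b"hello world"),
}
```

A production detector would read the `ftyp` brand and the Ogg codec headers to separate the ambiguous cases noted in the comments.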

Format support overview

Category	Formats	Integration	Class	Description
Images	`jpg`, `jpeg`, `png`, `gif`, `bmp`, `tiff`, `tif`, `jfif`, `ico`, `webp`, `avif`, `heic`, `heif`, `svg`	Deep	`ImageFile`	OpenCV-powered processing, format conversion, channel detection and more.
Audio	`wav`, `mp3`, `ogg`, `flac`, `aac`, `m4a`, `wma`, `opus`, `aiff`	Deep	`AudioFile`	FFmpeg/PyAV-powered, format conversion, sample rate conversion, streaming, metadata extraction.
Video	`mp4`, `avi`, `mov`, `mkv`, `webm`, `flv`, `wmv`, `3gp`, `ogv`, `m4v`	Deep	`VideoFile`	Hardware-accelerated encoding/decoding, frame extraction, audio extraction.
3D Models	`obj`, `glb`, `gltf`, `dae`, `fbx`, `3ds`, `ply`, `stl`, `step`, `iges`, `x3d`, `blend`	Shallow	`MediaFile`	Basic file handling, no specialized 3D processing yet.
Documents	`pdf`, `txt`, `html`, `htm`, `json`, `js`, `css`, `xml`, `csv`	Shallow	`MediaFile`	Text and document formats, basic file operations.
Archives	`zip`, `7z`, `tar`, `gz`	Shallow	`MediaFile`	Archive and compressed file formats, basic file operations.
Data	`npy`, `npz`, `pkl`, `pickle`	Shallow	`MediaFile`	Python data serialization formats, basic file operations.

Deep integration: specialized classes with advanced processing, format conversion, and media-specific operations.

Shallow integration: basic MediaFile class with universal file operations, automatic format detection, and standard conversions.
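
The split between deep and shallow integration amounts to a lookup from file extension to class. As a hedged sketch, the mapping below copies the extensions from the "Deep" rows of the table and falls back to `MediaFile` for everything else. It returns class names as strings and is not the library's dispatch code, which may also inspect file content.

```python
from pathlib import PurePosixPath

# Extensions copied from the "Deep" rows of the format table.
DEEP_CLASSES = {
    "ImageFile": "jpg jpeg png gif bmp tiff tif jfif ico webp avif heic heif svg",
    "AudioFile": "wav mp3 ogg flac aac m4a wma opus aiff",
    "VideoFile": "mp4 avi mov mkv webm flv wmv 3gp ogv m4v",
}
EXTENSION_TO_CLASS = {
    ext: class_name for class_name, exts in DEEP_CLASSES.items() for ext in exts.split()
}


def class_name_for(file_name: str) -> str:
    """Return the class name the table associates with a file extension."""
    ext = PurePosixPath(file_name).suffix.lower().lstrip(".")
    return EXTENSION_TO_CLASS.get(ext, "MediaFile")  # shallow formats fall back


picked = [class_name_for(name) for name in ("clip.MP4", "song.flac", "model.glb")]
```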

Contributing

Contributions are welcome. Key areas:

Performance optimizations
New format support
Documentation and examples
Test coverage
Platform-specific enhancements

License

MIT License, see LICENSE for details.

Made with ❤️ by SocAIty

Remember: Existence is pain to a Meseex, but task completion brings them joy!

Release history

Version	Released
0.2.23 (this version)	Jul 3, 2026
0.2.22	Nov 24, 2025
0.2.21	Nov 15, 2025
0.2.20	Nov 15, 2025
0.2.19	Oct 29, 2025
0.2.18	Oct 2, 2025
0.2.17	Sep 18, 2025
0.2.16	Sep 12, 2025
0.2.15	Sep 11, 2025
0.2.13	Sep 6, 2025
0.2.12	Sep 6, 2025
0.2.11	Sep 6, 2025
0.2.11.dev0 (pre-release)	Sep 5, 2025
0.2.9	Aug 12, 2025
0.2.8	Jul 2, 2025
0.2.7	Jun 26, 2025
0.2.6	Jun 18, 2025
0.2.5	Jun 18, 2025
0.2.4	Jun 10, 2025
0.2.3	Apr 1, 2025
0.2.2	Mar 23, 2025
0.2.1	Feb 14, 2025
0.2.0	Feb 12, 2025
0.1.9	Feb 10, 2025
0.1.8	Feb 6, 2025
0.1.7	Feb 5, 2025
0.1.6	Jan 31, 2025
0.1.5	Jan 23, 2025
0.1.3	Jan 22, 2025
0.1.2	Jan 22, 2025
0.1.1	Aug 28, 2024
0.1.1.dev1 (pre-release)	Aug 27, 2024
0.0.9	Jul 29, 2024
0.0.8	Jul 29, 2024
0.0.6	Jul 16, 2024
0.0.5	Jul 11, 2024
0.0.3	Jul 8, 2024
0.0.2	Jul 4, 2024
0.0.1	Jul 2, 2024
0.0.0	Jun 26, 2024

Download files

Source Distribution

media_toolkit-0.2.23.tar.gz (55.9 kB)

Uploaded Jul 3, 2026 Source

Built Distribution

media_toolkit-0.2.23-py3-none-any.whl (63.3 kB)

Uploaded Jul 3, 2026 Python 3

File details

Details for the file media_toolkit-0.2.23.tar.gz.

File metadata

Download URL: media_toolkit-0.2.23.tar.gz
Upload date: Jul 3, 2026
Size: 55.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for media_toolkit-0.2.23.tar.gz
Algorithm	Hash digest
SHA256	`72581b33562bae5c9a06fa273fc0cad64693cc4ebdb639781b0ef3da41dfe0a5`
MD5	`667f80c864d1a3fb3ac70f86f0b8db2a`
BLAKE2b-256	`a65ea2e943233850d552fda28e884b55ff042ec7afa2d2ec5b1dcd91135c74cc`

File details

Details for the file media_toolkit-0.2.23-py3-none-any.whl.

File metadata

Download URL: media_toolkit-0.2.23-py3-none-any.whl
Upload date: Jul 3, 2026
Size: 63.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for media_toolkit-0.2.23-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2c7f9fd5f5895cf9c324ee7a20c77f487b826ebc966e17994e229e28ed12fb26`
MD5	`0ca95ffe6319e0c0d08d11c1fe236700`
BLAKE2b-256	`cf30f329cff8e7fce6412e6a93d6c3dd1d4a5299d4445a6c2784d9e84390856a`
