HW accelerated video reading for ML Inference (CUDA version).

These details have not been verified by PyPI

Project links

Homepage

Project description

CeLux

CeLux is a high-performance Python library for video processing, built on FFmpeg. It targets fast decoding of full HD video and decodes frames directly into PyTorch tensors; measured throughput is listed under Benchmarks.

The name CeLux is derived from the Latin words celer (swift) and lux (light), reflecting its commitment to speed and efficiency.

🚀 Features

⚡ Ultra-Fast Video Decoding: Achieve lightning-fast decode times for full HD videos using hardware acceleration.
🔗 Direct Decoding to Tensors: Decode video frames directly into PyTorch tensors for immediate processing.
🖥️ Hardware Acceleration Support: Utilize CUDA for GPU-accelerated decoding, significantly improving performance.
🔄 Easy Integration: Seamlessly integrates with existing Python workflows, making it easy to incorporate into your projects.

📦 Installation

CeLux offers two installation options tailored to your system's capabilities:

CPU-Only Version: For systems without CUDA-capable GPUs.
CUDA (GPU) Version: For systems with NVIDIA GPUs supporting CUDA.

🖥️ CPU-Only Installation

Install the CPU version of CeLux using pip:

pip install celux

Note: The CPU version only supports CPU operations. Attempting to use GPU features with this version will result in an error.

🖥️ CUDA (GPU) Installation

Install the CUDA version of CeLux using pip:

pip install celux-cuda

Note: The CUDA version requires a CUDA-capable GPU and a CUDA-enabled PyTorch installation.

🔄 Both Packages Import as `celux`

Regardless of the installation choice, both packages are imported using the same module name:

import celux  # or: import celux as cx

This design ensures a seamless transition between CPU and CUDA versions without changing your import statements.
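
Because both distributions share one module name, the import alone does not reveal which one is installed. A minimal sketch, assuming the distribution names match the pip package names above (`celux` and `celux-cuda`), that checks the installed package metadata using only the standard library. The helper name `installed_celux_distribution` is hypothetical and not part of CeLux:

```python
from importlib import metadata
from typing import Optional


def installed_celux_distribution() -> Optional[str]:
    """Return "celux-cuda", "celux", or None if neither is installed."""
    # Check the CUDA build first so it wins if both happen to be present.
    for name in ("celux-cuda", "celux"):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        return name
    return None


print(installed_celux_distribution())
```

A check like this can run at startup to decide whether requesting `device="cuda"` is worth attempting.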

📚 Getting Started

🎉 Quick Start

Here's a simple example demonstrating how to use CeLux to read video frames and process them:

import torch
import celux as cx

def process_frame(frame):
    # Implement your frame processing logic here
    pass

# Use CUDA only when PyTorch can see a GPU (this also requires celux-cuda)
device = "cuda" if torch.cuda.is_available() else "cpu"

with cx.VideoReader(
    "path/to/input/video.mp4",
    device=device     # "cpu" or "cuda"
) as reader:
    for frame in reader:
        # Frame is a PyTorch tensor in HWC format
        process_frame(frame)

Parameters:

device (str): Device to use. Can be "cpu" or "cuda".
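
The value passed as `device` depends on the PyTorch build as well as the CeLux build. A hedged sketch of a fallback helper: `pick_device` is a hypothetical name, and it relies only on `torch.cuda.is_available()`, returning "cpu" when PyTorch is missing or has no CUDA support:

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-enabled PyTorch is usable, else "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no CUDA runtime to query.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

This does not check which CeLux package is installed, so with the CPU-only package a "cuda" result would still lead to an error.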

📜 Detailed Usage

CeLux allows you to efficiently decode and process video frames with ease. Below are some common operations:

Initialize VideoReader

reader = cx.VideoReader(
    "path/to/video.mp4",
    device="cuda",        # Use "cpu" or "cuda"
)

Iterate Through Frames

for frame in reader:
    # Your processing logic
    pass
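
Most PyTorch models expect batched, channels-first float input rather than a single HWC `uint8` frame. A minimal sketch of that conversion, assuming frames are `uint8` HWC tensors in [0, 255] as the 0.3.9 changelog entry describes. `to_model_input` is a hypothetical helper, and the [0, 1] scaling is an assumption about what the model expects:

```python
import torch


def to_model_input(frame: torch.Tensor) -> torch.Tensor:
    """Convert one HWC uint8 frame to a 1xCxHxW float32 tensor in [0, 1]."""
    # permute reorders HWC -> CHW; unsqueeze adds the batch dimension.
    chw = frame.permute(2, 0, 1)
    return chw.unsqueeze(0).to(torch.float32) / 255.0


# Stand-in for a decoded full HD frame: a 3-channel tensor filled with 255.
dummy = torch.full((1080, 1920, 3), 255, dtype=torch.uint8)
batch = to_model_input(dummy)
print(batch.shape, batch.dtype)
```

The result stays on the same device as the input frame, so a frame decoded on the GPU can go to a GPU model without a host copy.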

Access Video Properties

properties = reader.get_properties()
print(properties)
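
The changelog mentions the `codec`, `bit_depth`, and `has_audio` entries in the returned dictionary. A small sketch of reading them defensively, assuming the result behaves like a plain mapping. `describe_video` is a hypothetical helper, and the sample values are written by hand rather than captured from CeLux:

```python
from typing import Any, Mapping


def describe_video(props: Mapping[str, Any]) -> str:
    """Build a one-line summary from a properties mapping."""
    # .get() keeps this working on versions that lack a given key.
    codec = props.get("codec", "unknown")
    bit_depth = props.get("bit_depth", "unknown")
    audio = "with audio" if props.get("has_audio") else "no audio"
    return f"codec={codec}, bit_depth={bit_depth}, {audio}"


# Hand-written sample values, not output captured from CeLux.
sample = {"codec": "h264", "bit_depth": 8, "has_audio": False}
print(describe_video(sample))  # codec=h264, bit_depth=8, no audio
```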

🛠️ Building from Source

While CeLux is easily installable via pip, you might want to build it from source for customization or contributing purposes.

Clone the Repository:

git clone https://github.com/Trentonom0r3/celux.git
cd celux

Install Dependencies:

Ensure all prerequisites are installed. You can use vcpkg for managing dependencies on Windows.

Configure the Project with CMake:

cmake -B build -S . -DCMAKE_BUILD_TYPE=Release

Windows users: if using vcpkg, include the toolchain file:

cmake -B build -S . -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=<path_to_vcpkg>/scripts/buildsystems/vcpkg.cmake

Build the Project:
```
cmake --build build --config Release
```
Install the Package:
```
cmake --install build
```
Set Up Environment Variables:

Ensure FFmpeg binaries and other dependencies are in your system's PATH. On Unix systems, you might need to set LD_LIBRARY_PATH or DYLD_LIBRARY_PATH.
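
A quick way to inspect these settings from Python, using only the standard library. This is a diagnostic sketch rather than part of the build, and `check_runtime_environment` is a hypothetical helper name:

```python
import os
import shutil


def check_runtime_environment() -> dict:
    """Report where ffmpeg resolves on PATH and the library search paths."""
    return {
        # shutil.which returns None when the executable is not on PATH.
        "ffmpeg": shutil.which("ffmpeg"),
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", ""),
        "DYLD_LIBRARY_PATH": os.environ.get("DYLD_LIBRARY_PATH", ""),
    }


for key, value in check_runtime_environment().items():
    print(f"{key}: {value or '(not set)'}")
```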

🤝 Contributing

We welcome contributions! Follow these steps to contribute:

Fork the Repository:

Click the "Fork" button at the top right of the repository page.

Clone Your Fork:

git clone https://github.com/your-username/celux.git
cd celux

Create a New Branch:

git checkout -b feature/your-feature-name

Make Your Changes:

Implement your feature or bugfix.

Commit Your Changes:

git commit -am "Add your commit message here"

Push to Your Fork:

git push origin feature/your-feature-name

Submit a Pull Request:

Go to the original repository and click on "Pull Requests," then "New Pull Request."

📈 Changelog

Version 0.4.1 (2024-10-24)

- Refactored slightly; moved tests/python into tests.
- Added a new test that downloads videos of various bit depths and codec types.
- Added new dictionary keys to VideoReader.get_properties():
  - codec: The name of the codec being used.
  - bit_depth: The bit depth of the video.

Version 0.4.0 (2024-10-23)

- Moved to FFmpeg static libraries!
  - Startup times are improved. All libs that can be static are static.
- Adjusted logging to flow a little better and not overcrowd the console unless desired.
  - Logging now gives more detail on codecs. The decoder selects the best codec for the video.
- Need to investigate whether NVDEC is bottlenecked or whether maximum performance has been reached.
  - Curiously, the CPU benchmarks at 1859 fps and the GPU at 1809 fps.

Version 0.3.9 (2024-10-21)

Pre-Release Update:
- Prep for 0.4.0 release.
  - 0.4.x release will be characterized by new codec and pixel format support!
  - Removed d_type and buffer_size arguments from VideoReader and VideoWriter.
    - Output and Input tensors are now, by standard, UINT8, HWC format, [0,255].
  - Standardized to YUV420P for now.
  - Swapped custom CUDA kernels for nppi.
  - various cleanup and small refactorings.

Version 0.3.8 (2024-10-21)

Pre-Release Update:
- Removed Buffering from VideoWriter, resulting in INSANE performance gains.
- Fixed threading issue with VideoWriter, now properly utilizes available threads.
- Removed sync method from VideoWriter.
  - Synchronization can be manually handled by the user or by letting the VideoWriter do so on destruction.
- Updated Benchmarks to reflect new version.

Version 0.3.7 (2024-10-21)

Pre-Release Update:
- Fixed remaining issues with VideoWriter class.
  - Both cpu and cuda arguments NOW work properly.
- A few small bug fixes regarding synchronization and memory management.

Version 0.3.6 (2024-10-19)

Pre-Release Update:
- Fixed VideoWriter class.
  - Both cpu and cuda arguments now work properly.
- Encoder Functionality:
  - Enabled encoder support for both CPU and CUDA backends.
  - Users can now encode videos directly from PyTorch tensors.
- Update Github Actions, add tests.

Version 0.3.5 (2024-10-19)

Pre-Release Update:
- (somewhat) Fixed VideoWriter class. Working on cuda for now, but cpu still has incorrect output.
- Added VideoWriter, and LogLevel definitions to .pyi stub file.
- Adjusted github actions to publish to pypi.

Version 0.3.4.1 (2024-10-19)

Pre-Release Update:
- Added logging utility for debugging purposes.
```
import celux
celux.set_log_level(celux.LogLevel.debug)
```

Version 0.3.3 (2024-10-19)

Pre-Release Update:
- Added buffer_size and stream arguments.
  - Choose Pre-Decoded Frame buffer size, and pass your own cuda stream.
- Some random cleanup and small refactorings.

Version 0.3.1 (2024-10-17)

Pre-Release Update:
- Adjusted Frame Range End in VideoReader to be exclusive to match cv2 behavior.
- Removed unnecessary error throws.
- Encoder Functionality: Now fully operational for both CPU and CUDA.
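
The exclusive end noted in this entry means a range such as [10, 20] covers frames 10 through 19. A pure-Python illustration of that half-open convention follows; it is not CeLux code, and `frames_in_range` is a hypothetical helper:

```python
def frames_in_range(start: int, end: int) -> range:
    """Frame indices covered by a half-open range [start, end)."""
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    return range(start, end)


indices = frames_in_range(10, 20)
print(len(indices), indices[0], indices[-1])  # 10 10 19
```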

Version 0.3.0 (2024-10-17)

Pre-Release Update:
- Renamed from ffmpy to CeLux.
- Created official pypi release.
- Refactored to split cpu and cuda backends.

Version 0.2.6 (2024-10-15)

Pre-Release Update:
- Removed Numpy support in favor of PyTorch tensors with GPU/CPU support.
- Added NV12ToBGR, BGRToNV12, and NV12ToNV12 conversion modules.
- Fixed several minor issues.
- Updated documentation and examples.

Version 0.2.2 (2024-10-14)

Pre-Release Update:
- Fixed several minor issues.
- Made VideoReader and VideoWriter callable.
- Created BGR conversion modules.
- Added frame range (in/out) arguments.
```
with VideoReader('input.mp4')([10, 20]) as reader:
    for frame in reader:
        print(f"Processing frame {frame}")
```

Version 0.2.1 (2024-10-13)

Pre-Release Update:
- Adjusted Python bindings to use snake_case.
- Added .pyi stub files to .whl.
- Adjusted dtype arguments to (uint8, float32, float16).
- Added GitHub Actions for new releases.
- Added HW Accel Encoder support, direct encoding from numpy/tensors.
- Added has_audio property to VideoReader.get_properties().

Version 0.1.1 (2024-10-06)

Pre-Release Update:
- Implemented support for multiple data types (uint8, float, half).
- Provided example usage and basic documentation.

📄 License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). See the LICENSE file for details.

🙏 Acknowledgments

FFmpeg: The backbone of video processing in CeLux.
PyTorch: For tensor operations and CUDA support.
Vcpkg: Simplifies cross-platform dependency management.
@NevermindNilas: For assistance with testing, API suggestions, and more.

📈 Benchmarks

🖥️ System Specifications

| Specification | Details |
| --- | --- |
| Processor | Intel64 Family 6 Model 154 Stepping 3, GenuineIntel |
| Architecture | AMD64 |
| Python Version | 3.12.7 (CPython) |
| Python Build | tags/v3.12.7:0b05ead Oct 1 2024 03:06:41 |
| Operating System | Windows 11 |
| CPU Brand | 12th Gen Intel(R) Core(TM) i7-12700H |
| CPU Frequency | 2.3000 GHz |
| L2 Cache Size | 11776 KB |
| L3 Cache Size | 24576 KB |
| Number of Cores | 20 |
| GPU #1 | NVIDIA GeForce RTX 3060 Laptop GPU (6.00 GB) |

| Benchmark | Mean Time (s) | Std Dev (s) | FPS |
| --- | --- | --- | --- |
| Test Video Reader Cpu Benchmark | 7.70 | 0.05 | 1859.41 |
| Test Video Reader Cuda Benchmark | 7.91 | 0.02 | 1809.11 |
| Test Video Writer Benchmark | 33.15 | 0.61 | 431.88 |
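
The figures above can be cross-checked with a little arithmetic. The sketch below assumes FPS was computed as frames divided by mean time (the source does not state this). Under that assumption all three rows imply roughly 14,300 frames, and the CPU reader comes out about 2.7% faster than the CUDA reader:

```python
# (mean time in seconds, frames per second) copied from the benchmark table.
results = {
    "reader_cpu": (7.70, 1859.41),
    "reader_cuda": (7.91, 1809.11),
    "writer": (33.15, 431.88),
}

# Assumption: FPS = frames / mean time, so frames = FPS * mean time.
frames = {name: fps * secs for name, (secs, fps) in results.items()}
for name, count in frames.items():
    print(f"{name}: ~{count:.0f} frames")

# Relative gap between CPU and CUDA reader throughput.
cpu_fps, cuda_fps = results["reader_cpu"][1], results["reader_cuda"][1]
gap = (cpu_fps - cuda_fps) / cpu_fps
print(f"CPU reader is {gap:.1%} faster than the CUDA reader")
```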

📊 Benchmark Visualizations

Figures: FPS Comparison; Mean Time Comparison.

❓ FAQ

Q: What video formats are supported?

A: CeLux aims to support all video formats and codecs supported by FFmpeg. However, hardware-accelerated decoding is currently available for specific codecs like H.264 and HEVC. These are the only codecs tested so far.
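
One way to act on this is to request the GPU only for the codecs tested so far. A hedged sketch: `preferred_device` and `TESTED_HW_CODECS` are hypothetical names, and the short codec names "h264" and "hevc" follow FFmpeg's naming, which is an assumption about what the `codec` property contains:

```python
# Codecs the FAQ lists as tested with hardware decoding, using FFmpeg's
# short codec names (an assumption about the reported codec string).
TESTED_HW_CODECS = {"h264", "hevc"}


def preferred_device(codec: str, cuda_available: bool) -> str:
    """Pick "cuda" only for tested codecs when CUDA is usable."""
    if cuda_available and codec.lower() in TESTED_HW_CODECS:
        return "cuda"
    return "cpu"


print(preferred_device("hevc", cuda_available=True))  # cuda
print(preferred_device("vp9", cuda_available=True))   # cpu
```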

Q: How do I report a bug or request a feature?

A: Please open an issue on the GitHub Issues page with detailed information about the bug or feature request.

🚤 Roadmap

Support for Additional Codecs:
- Expand the range of supported video codecs.
Audio Processing:
- Introduce capabilities for audio extraction and processing.
Performance Enhancements:
- Further optimize decoding performance and memory usage.
Cross-Platform Support:
- Improve compatibility with different operating systems and hardware configurations.

Release history

| Version | Released |
| --- | --- |
| 0.5.7 | Jan 24, 2025 |
| 0.5.6.1 | Jan 23, 2025 |
| 0.5.6 | Jan 22, 2025 |
| 0.5.3 | Nov 10, 2024 |
| 0.5.2 | Nov 7, 2024 |
| 0.5.1.2 | Nov 5, 2024 |
| 0.5.1.1 | Nov 5, 2024 |
| 0.5.1 | Nov 4, 2024 |
| 0.5.0 | Nov 3, 2024 |
| 0.4.5.5 | Oct 30, 2024 |
| 0.4.5 | Oct 30, 2024 |
| 0.4.4 | Oct 29, 2024 |
| 0.4.3.5 | Oct 29, 2024 |
| 0.4.3 | Oct 29, 2024 |
| 0.4.2 | Oct 28, 2024 |
| 0.4.1 (this version) | Oct 24, 2024 |
| 0.4.0 | Oct 24, 2024 |
| 0.3.9 | Oct 23, 2024 |
| 0.3.8 | Oct 22, 2024 |
| 0.3.7 | Oct 21, 2024 |
| 0.3.6 | Oct 20, 2024 |
| 0.3.5 | Oct 19, 2024 |
| 0.3.4.1 | Oct 19, 2024 |
| 0.3.4 | Oct 19, 2024 |
| 0.3.3 | Oct 19, 2024 |
| 0.3.1 | Oct 17, 2024 |
| 0.3.0 | Oct 17, 2024 |
| 0.2.9 | Oct 17, 2024 |
| 0.2.8 | Oct 17, 2024 |
| 0.2.7 | Oct 17, 2024 |
| 0.2.6 | Oct 17, 2024 |

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

celux_cuda-0.4.1-py3-none-any.whl (12.6 MB)

Uploaded Oct 24, 2024 Python 3

File details

Details for the file celux_cuda-0.4.1-py3-none-any.whl.

File metadata

Download URL: celux_cuda-0.4.1-py3-none-any.whl
Upload date: Oct 24, 2024
Size: 12.6 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for celux_cuda-0.4.1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `c9ce13e3d6c65118a36ed032a9fe357b66d9b83b23eaceeedfcf7dc57b0a9f2f` |
| MD5 | `d9152647a236534b9185caca6c67243a` |
| BLAKE2b-256 | `3731d271480da7e79a05820af3510f2905592680c4deb4e03e5f03d404b7eee0` |
