Analyze videos and images using machine learning methods, EXIF data, and other file-based information.
Media Analyzer
Media Analyzer is a Python library designed to analyze media files, providing insights into their content and metadata. It supports various functionalities, including image classification, captioning, optical character recognition (OCR), and facial recognition.
Features
- Image Classification: Identify objects, activities, animals, and events present in images.
- Image Captioning: Generate descriptive captions for images using models like BLIP and LLM-based captioners.
- Optical Character Recognition (OCR): Extract text from images to identify documents, receipts, menus, and more.
- Facial Recognition: Detect faces in images and provide details such as age, sex, and facial landmarks.
Installation
To install Media Analyzer, use pip:
pip install media-analyzer
Requirements
The following tools must be installed and available on your PATH:
- ExifTool: https://exiftool.org/
- Tesseract OCR: https://tesseract-ocr.github.io/tessdoc/Installation.html
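Before running an analysis, it can help to confirm that both tools are reachable. The snippet below is a minimal sketch using only the standard library. The executable names `exiftool` and `tesseract` are assumptions based on the usual install names, and `missing_tools` is a hypothetical helper, not part of Media Analyzer.

```python
import shutil

# Assumption: ExifTool is usually installed as "exiftool" and
# Tesseract OCR as "tesseract"; adjust if your system differs.
REQUIRED_TOOLS = ("exiftool", "tesseract")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of the tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

If a tool is reported missing, install it from the links above and make sure its install directory is on PATH.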
Usage
Here's a basic example of how to use Media Analyzer:
from media_analyzer import MediaAnalyzer
from pathlib import Path
analyzer = MediaAnalyzer()
media_file = Path("image.jpg")
result = analyzer.photo(media_file)
# Access analysis results
print(result.image_data)
print(result.frame_data)
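To analyze several images at once, the same `photo` call can be wrapped in a small loop. This is a sketch rather than part of the library: `analyze_folder` and `IMAGE_SUFFIXES` are hypothetical names, and the suffix filter is only illustrative because the supported formats are not listed here.

```python
from pathlib import Path

# Assumption: an illustrative filter only; the formats Media Analyzer
# actually supports are not stated in this README.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def analyze_folder(analyzer, folder):
    """Call analyzer.photo() on each image in folder and map path -> result."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        # Skip subdirectories and files that do not look like images.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            results[path] = analyzer.photo(path)
    return results
```

With the library installed, `analyze_folder(MediaAnalyzer(), Path("photos"))` would return one result per image. Passing the analyzer in as an argument keeps a single instance in use for the whole folder.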
Configuration
The AnalyzerSettings class allows you to customize various aspects of the analysis:
- media_languages: List of languages for OCR to consider.
- captions_provider: The provider for image captioning (e.g., 'BLIP', 'LLM').
- enable_text_summary: Enable or disable text summarization.
- enable_document_summary: Enable or disable document summarization.
- document_detection_threshold: Confidence threshold for document detection.
- face_detection_threshold: Confidence threshold for face detection.
- enabled_file_modules: List of file modules to enable (e.g., exif data, gps, weather detection).
- enabled_visual_modules: List of visual modules to enable (e.g., 'classification', 'captioning', 'ocr', 'facial_recognition').
Full docs can be found at https://ruurdbijlsma.github.io/media-analyzer/media_analyzer.html#MediaAnalyzer.
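The settings listed above can be collected and sanity-checked before they are handed to the library. The sketch below is self-contained and does not import Media Analyzer. `build_settings_kwargs` is a hypothetical helper, the default values are placeholders, and both the 0 to 1 threshold range and the Tesseract-style language code `eng` are assumptions that should be checked against the full docs.

```python
def build_settings_kwargs(
    languages=("eng",),        # assumption: Tesseract-style language codes
    captions_provider="BLIP",  # 'BLIP' and 'LLM' are the examples given above
    face_threshold=0.7,        # placeholder value, not a library default
    document_threshold=0.5,    # placeholder value, not a library default
):
    """Collect AnalyzerSettings field values into a dict after basic checks."""
    thresholds = {
        "face_detection_threshold": face_threshold,
        "document_detection_threshold": document_threshold,
    }
    # Assumption: confidence thresholds are fractions between 0 and 1.
    for name, value in thresholds.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    return {
        "media_languages": list(languages),
        "captions_provider": captions_provider,
        "enable_text_summary": False,
        "enable_document_summary": False,
        "enabled_visual_modules": ["classification", "captioning", "ocr"],
        **thresholds,
    }


if __name__ == "__main__":
    for key, value in build_settings_kwargs().items():
        print(f"{key} = {value!r}")
```

If `AnalyzerSettings` accepts these field names as keyword arguments, which the list above suggests but does not confirm, `AnalyzerSettings(**build_settings_kwargs())` would create the settings object. How it is then passed to `MediaAnalyzer` is covered in the full docs.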