CLI tool for OCR-ing video frames on macOS
ocrvid
CLI tool to extract text from videos using OCR on macOS.
[!NOTE] Currently, this tool is only tested on, and only works on, macOS 13 or later.
[!CAUTION] This tool is still in an early stage of development. Current v0.x releases are not stable and may introduce breaking changes.
Installation
Install this tool using pip:
pip install ocrvid
Usage
Usage: ocrvid [OPTIONS] COMMAND [ARGS]...
Options:
--version Show the version and exit.
--help Show this message and exit.
Commands:
detect Run OCR on a single picture, and print the results as json
langs Show supported recognition languages
props Show properties of video file
run Run OCR on a video, and save result as a json file
Run OCR on a video
Use the ocrvid run subcommand to run OCR on a video file:
Usage: ocrvid run [OPTIONS] INPUT_VIDEO
Run OCR on a video, and save result as a json file
Options:
-o, --output FILE Path to output json file. By default, if you run
`ocrvid run some/video.mp4` then the output file
will be `./video.json`
-fd, --frames-dir DIRECTORY If passed, then save video frames to this
directory. By default, frames are not saved.
-fs, --frame-step INTEGER Number of frames to skip between each frame to be
processed. By default, 100 which means every 100
frames, 1 frame will be processed.
-bs, --by-second FLOAT If passed, then process 1 frame every N seconds.
This option relies on fps metadata of the video.
-l, --langs TEXT Prefered languages to detect, ordered by
priority. See avalable languages run by `ocrvid
langs`. If not passed, language is auto detected.
--help Show this message and exit.
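The relationship between `--frame-step` and `--by-second` can be sketched as below. This is only an illustration of the arithmetic implied by the help text, not the tool's actual implementation: the function names `step_from_seconds` and `frame_indices` are hypothetical, and the rounding rule is an assumption.

```python
def step_from_seconds(fps: float, seconds: float) -> int:
    """Convert "1 frame every N seconds" into a frame step.

    Assumption: the step is fps * seconds rounded to the nearest
    whole frame, and never smaller than 1.
    """
    return max(1, round(fps * seconds))


def frame_indices(total_frames: int, frame_step: int = 100) -> list[int]:
    """Return the indices of the frames that would be processed.

    With the default step of 100, frames 0, 100, 200, ... are selected.
    """
    return list(range(0, total_frames, frame_step))


if __name__ == "__main__":
    # A 30 fps video sampled once every 2 seconds gives a step of 60 frames.
    step = step_from_seconds(fps=30.0, seconds=2.0)
    print(step, frame_indices(total_frames=250, frame_step=step))
```

Because `--by-second` relies on the fps metadata of the video, a file with missing or wrong fps metadata would yield a different step than expected.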
For example, run against the test video file at tests/video/pexels-eva-elijas.mp4 in this repo:
ocrvid run tests/video/pexels-eva-elijas.mp4
Then pexels-eva-elijas.json is generated in the current directory. It looks like this:
{
"video_file":"tests/video/pexels-eva-elijas.mp4",
"frames":[
{
"frame_index":0,
"results":[
{
"text":"INSPIRING WORDS",
"confidence":1.0,
"bbox":[
0.17844826551211515,
0.7961793736859821,
0.3419540405273438,
0.10085802570754931
]
},
{
"text":"\"Foar kills more dre",
"confidence":1.0,
"bbox":[
0.0724226723609706,
0.6839455987759758,
0.4780927975972494,
0.14592710683043575
]
},
{
"text":"than failure ever",
"confidence":1.0,
"bbox":[
0.018455287246445035,
0.6549868414269003,
0.45329265594482426,
0.14363905857426462
]
},
{
"text":"IZY KASSEM",
"confidence":0.5,
"bbox":[
-0.015967150208537523,
0.6675747977206025,
0.23065692583719888,
0.08114868486431293
]
},
{
"text":"Entrepreneur",
"confidence":1.0,
"bbox":[
0.01941176222542875,
0.1353812367971159,
0.9058370590209961,
0.26137274083956863
]
}
]
},
...
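A minimal sketch of reading the generated file from Python, assuming the layout shown above (`frames`, `frame_index`, `results`, `text`, `confidence`). The helper names `load_results` and `collect_text` and the 0.9 confidence threshold are illustrative choices, not part of ocrvid.

```python
import json
from pathlib import Path


def load_results(path: str) -> dict:
    """Load the JSON file written by `ocrvid run`."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def collect_text(data: dict, min_confidence: float = 0.9) -> list[tuple[int, str]]:
    """Return (frame_index, text) pairs whose confidence meets the threshold."""
    lines = []
    for frame in data["frames"]:
        for result in frame["results"]:
            if result["confidence"] >= min_confidence:
                lines.append((frame["frame_index"], result["text"]))
    return lines
```

For example, `collect_text(load_results("pexels-eva-elijas.json"))` would drop the low-confidence "IZY KASSEM" entry from the sample above and keep the other four.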
Show supported languages
Run ocrvid langs to list the languages available for recognition.
The results may vary depending on the macOS version you are running.
On macOS version:
platform.mac_ver()[0]='14.2.1'
Result of ocrvid langs:
en-US
fr-FR
it-IT
de-DE
es-ES
pt-BR
zh-Hans
zh-Hant
yue-Hans
yue-Hant
ko-KR
ja-JP
ru-RU
uk-UA
th-TH
vi-VT
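If you want to check for a language from a script, something like the following could work. It assumes `ocrvid langs` prints one language code per line, as in the listing above; the helper names `parse_langs` and `supported_langs` are hypothetical.

```python
import subprocess


def parse_langs(output: str) -> list[str]:
    """Split the command output into language codes, ignoring blank lines."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def supported_langs() -> list[str]:
    """Run `ocrvid langs` and return the reported language codes."""
    result = subprocess.run(
        ["ocrvid", "langs"], capture_output=True, text=True, check=True
    )
    return parse_langs(result.stdout)
```

A caller could then test `"ja-JP" in supported_langs()` before passing that code to `--langs`.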
How can I run OCR on YouTube videos?
Take a look at yt-dlp.
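One possible workflow is to download the video with yt-dlp first and then pass the local file to ocrvid run. The sketch below assumes both commands are on your PATH; the `-f mp4` format selector, the file name `video.mp4`, and the helper names are illustrative choices, not requirements of ocrvid.

```python
import subprocess


def build_commands(url: str, video_path: str = "video.mp4") -> list[list[str]]:
    """Build the two commands: download with yt-dlp, then OCR with ocrvid."""
    # Assumption: a single-file mp4 format is available for this URL.
    download = ["yt-dlp", "-f", "mp4", "-o", video_path, url]
    ocr = ["ocrvid", "run", video_path]
    return [download, ocr]


def download_and_ocr(url: str, video_path: str = "video.mp4") -> None:
    """Run both commands in order, stopping at the first failure."""
    for command in build_commands(url, video_path):
        subprocess.run(command, check=True)
```

With the default output rule described earlier, running this with `video.mp4` would be expected to leave the OCR result in `./video.json`.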
Development
To contribute to this tool, first check out the code. Then create a new virtual environment:
cd ocrvid
python -m venv venv
source venv/bin/activate
Now install the dependencies and test dependencies:
pip install -e '.[test]'
To run the tests:
make test
File details
Details for the file ocrvid-0.5.3.tar.gz.
File metadata
- Download URL: ocrvid-0.5.3.tar.gz
- Upload date:
- Size: 15.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9ad428d034df18a62e68ce7d85cfeb5ab573f0bc1636c0db48fc5dad17a49b9f |
| MD5 | d127bcf57229df42332367f93d531712 |
| BLAKE2b-256 | 762a1c6aa1e56e110d7d7aa0c120d13f63019cee124b0556224158e538ed43b6 |
File details
Details for the file ocrvid-0.5.3-py3-none-any.whl.
File metadata
- Download URL: ocrvid-0.5.3-py3-none-any.whl
- Upload date:
- Size: 13.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a4d2475547a796934f2416fc59678b86331eeff81dbfd52495880f9693c9cd7 |
| MD5 | b08008238661db37939a8ac33f4d59b8 |
| BLAKE2b-256 | b76f8521404c0c54ec4e05e673b6a020d25ae2bc4c9a425e01aece7fc534cd11 |