ocrvid

CLI tool to extract text from videos using OCR on macOS.

[!NOTE] Currently, this tool has only been tested on, and only works on, macOS 13 or later.

[!CAUTION] This tool is still at an early stage of development. The current v0.x releases are not stable and may introduce breaking changes.

Installation

Install this tool using pip:

pip install ocrvid

Usage

Usage: ocrvid [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  detect  Run OCR on a single picture, and print the results as json
  langs   Show supported recognition languages
  props   Show properties of video file
  run     Run OCR on a video, and save result as a json file

Run OCR on a video

Use the ocrvid run subcommand to run OCR on a video file:

Usage: ocrvid run [OPTIONS] INPUT_VIDEO

  Run OCR on a video, and save result as a json file

Options:
  -o, --output FILE            Path to output json file. By default, if you run
                               `ocrvid run some/video.mp4` then the output file
                               will be `./video.json`
  -fd, --frames-dir DIRECTORY  If passed, then save video frames to this
                               directory. By default, frames are not saved.
  -fs, --frame-step INTEGER    Number of frames to skip between each frame to be
                               processed. By default, 100 which means every 100
                               frames, 1 frame will be processed.
  -bs, --by-second FLOAT       If passed, then process 1 frame every N seconds.
                               This option relies on fps metadata of the video.
  -l, --langs TEXT             Prefered languages to detect, ordered by
                               priority. See avalable languages run by `ocrvid
                               langs`. If not passed, language is auto detected.
  --help                       Show this message and exit.
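The two sampling options, --frame-step and --by-second, can be related through the video's frame rate. The sketch below is only an illustration under stated assumptions: both helper names are hypothetical, and rounding to the nearest whole frame is a guess, not necessarily what ocrvid does internally.

```python
def frame_step_from_seconds(fps: float, seconds: float) -> int:
    """Convert "1 frame every N seconds" into a frame step (hypothetical helper)."""
    if fps <= 0 or seconds <= 0:
        raise ValueError("fps and seconds must be positive")
    # Assumption: round to the nearest whole frame, and never go below 1.
    return max(1, round(fps * seconds))


def sampled_frame_indices(total_frames: int, frame_step: int = 100) -> list[int]:
    """Indices of the frames processed when taking 1 frame every `frame_step` frames."""
    if frame_step < 1:
        raise ValueError("frame_step must be at least 1")
    return list(range(0, total_frames, frame_step))


# A 30 fps video sampled every 2 seconds corresponds to a step of 60 frames.
print(frame_step_from_seconds(30.0, 2.0))
# With the default step of 100, a 250-frame video yields frames 0, 100 and 200.
print(sampled_frame_indices(250))
```

The frame rate itself comes from the video's metadata, which is why --by-second depends on that metadata being present and correct.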

For example, run against the test video file at tests/video/pexels-eva-elijas.mp4 in this repo:

ocrvid run tests/video/pexels-eva-elijas.mp4

Then pexels-eva-elijas.json is generated in the current directory, which looks like this:

{
    "video_file":"tests/video/pexels-eva-elijas.mp4",
    "frames":[
        {
            "frame_index":0,
            "results":[
                {
                    "text":"INSPIRING WORDS",
                    "confidence":1.0,
                    "bbox":[
                        0.17844826551211515,
                        0.7961793736859821,
                        0.3419540405273438,
                        0.10085802570754931
                    ]
                },
                {
                    "text":"\"Foar kills more dre",
                    "confidence":1.0,
                    "bbox":[
                        0.0724226723609706,
                        0.6839455987759758,
                        0.4780927975972494,
                        0.14592710683043575
                    ]
                },
                {
                    "text":"than failure ever",
                    "confidence":1.0,
                    "bbox":[
                        0.018455287246445035,
                        0.6549868414269003,
                        0.45329265594482426,
                        0.14363905857426462
                    ]
                },
                {
                    "text":"IZY KASSEM",
                    "confidence":0.5,
                    "bbox":[
                        -0.015967150208537523,
                        0.6675747977206025,
                        0.23065692583719888,
                        0.08114868486431293
                    ]
                },
                {
                    "text":"Entrepreneur",
                    "confidence":1.0,
                    "bbox":[
                        0.01941176222542875,
                        0.1353812367971159,
                        0.9058370590209961,
                        0.26137274083956863
                    ]
                }
            ]
        },
...
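As a minimal sketch of how this output might be consumed, the Python snippet below collects the recognized text per frame and drops low-confidence results. It relies only on the keys visible in the sample above (frames, frame_index, results, text, confidence); the function name texts_by_frame and the 0.9 threshold are illustrative assumptions, and the embedded sample is shortened with rounded bbox values.

```python
import json

# Shortened sample mirroring the structure shown above (bbox values rounded).
SAMPLE = """
{
    "video_file": "tests/video/pexels-eva-elijas.mp4",
    "frames": [
        {
            "frame_index": 0,
            "results": [
                {"text": "INSPIRING WORDS", "confidence": 1.0,
                 "bbox": [0.18, 0.80, 0.34, 0.10]},
                {"text": "IZY KASSEM", "confidence": 0.5,
                 "bbox": [-0.02, 0.67, 0.23, 0.08]}
            ]
        }
    ]
}
"""


def texts_by_frame(data: dict, min_confidence: float = 0.0) -> dict[int, list[str]]:
    """Map each frame index to the texts whose confidence meets the threshold."""
    return {
        frame["frame_index"]: [
            result["text"]
            for result in frame["results"]
            if result["confidence"] >= min_confidence
        ]
        for frame in data["frames"]
    }


data = json.loads(SAMPLE)
print(texts_by_frame(data, min_confidence=0.9))  # {0: ['INSPIRING WORDS']}
```

For a real run, replace json.loads(SAMPLE) with json.load on the generated file, for example pexels-eva-elijas.json.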

Show supported languages

You can run ocrvid langs to show the languages supported for recognition. Results may vary depending on the macOS version you are running.

On macOS version:

platform.mac_ver()[0]='14.2.1'

Result of ocrvid langs:

en-US
fr-FR
it-IT
de-DE
es-ES
pt-BR
zh-Hans
zh-Hant
yue-Hans
yue-Hant
ko-KR
ja-JP
ru-RU
uk-UA
th-TH
vi-VT

How can I run OCR on YouTube videos?

Take a look at yt-dlp.
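One possible workflow, sketched below, is to download the video with the yt-dlp command-line tool and then pass the local file to ocrvid run. This is an assumption about how the two tools could be chained, not a documented feature of ocrvid: the helper names, the video.mp4 path, and the 1-second interval are illustrative, and the sketch assumes both commands are installed and on PATH.

```python
import subprocess


def build_commands(
    url: str, video_path: str = "video.mp4", every_seconds: float = 1.0
) -> list[list[str]]:
    """Build the two command lines: download first, then OCR (hypothetical helper)."""
    # yt-dlp: -f selects a format (here, a single-file mp4), -o sets the output path.
    download = ["yt-dlp", "-f", "mp4", "-o", video_path, url]
    # ocrvid: process 1 frame every `every_seconds` seconds of the downloaded file.
    ocr = ["ocrvid", "run", "--by-second", str(every_seconds), video_path]
    return [download, ocr]


def download_and_ocr(url: str) -> None:
    """Run both commands in order, stopping at the first failure."""
    for command in build_commands(url):
        subprocess.run(command, check=True)
```

Calling download_and_ocr with a video URL would, under these assumptions, leave video.json in the current directory, following the default output naming described for ocrvid run.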

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd ocrvid
python -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

make test
