A CLI tool for OCR using the Nougat model

MLX Nougat

MLX Nougat is a CLI tool for OCR using the Nougat model.

Installation

  1. Install ImageMagick with Homebrew:

    brew install imagemagick
    
  2. Configure environment variables for ImageMagick:

    Add the following lines to your shell configuration file (e.g., ~/.bashrc, ~/.zshrc):

    export MAGICK_HOME=$(brew --prefix imagemagick)
    export PATH=$MAGICK_HOME/bin:$PATH
    export DYLD_LIBRARY_PATH=$MAGICK_HOME/lib:$DYLD_LIBRARY_PATH
    

    After adding these lines, reload your shell configuration or restart your terminal.

  3. Install MLX Nougat:

    git clone git@github.com:mzbac/mlx-nougat.git
    cd mlx-nougat
    pip install .
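
Before running the tool, it can help to confirm that the variables from step 2 are actually visible in the current shell. The function below is a minimal sketch for that purpose; `check_magick_env` is a hypothetical helper name, not something MLX Nougat ships, and it only inspects the three variables exported above.

```shell
# Sanity-check the ImageMagick variables exported in step 2.
# check_magick_env is a hypothetical helper, not part of MLX Nougat.
check_magick_env() {
  if [ -z "${MAGICK_HOME:-}" ]; then
    echo "MAGICK_HOME is not set"
    return 1
  fi
  if [ ! -d "$MAGICK_HOME" ]; then
    echo "MAGICK_HOME does not point to a directory: $MAGICK_HOME"
    return 1
  fi
  # Wrap the lists in colons so each entry can be matched as a whole.
  case ":$PATH:" in
    *":$MAGICK_HOME/bin:"*) ;;
    *) echo "PATH does not include $MAGICK_HOME/bin"; return 1 ;;
  esac
  case ":${DYLD_LIBRARY_PATH:-}:" in
    *":$MAGICK_HOME/lib:"*) ;;
    *) echo "DYLD_LIBRARY_PATH does not include $MAGICK_HOME/lib"; return 1 ;;
  esac
  echo "ImageMagick environment looks OK"
}
```

After reloading the shell configuration, run `check_magick_env`; any message other than the final "looks OK" line names the variable that still needs attention.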
    

Usage

After installation, you can use MLX Nougat from the command line:

mlx_nougat --input <path_to_image_or_pdf_or_url> [--output <output_file>] [--model <model_name_or_path>]

Arguments

  • --input: (Required) Path to the input image or PDF file, or a URL to an image or PDF.
  • --output: (Optional) Path to save the output text file. If omitted, the output is printed to the console.
  • --model: (Optional) Name or path of the Nougat model to use. Default is "facebook/nougat-small".

Examples

  1. Process a local image:

    mlx_nougat --input path/to/your/image.png --output results.txt
    
  2. Process a local PDF:

    mlx_nougat --input path/to/your/document.pdf --output results.txt
    
  3. Process a remote image:

    mlx_nougat --input https://example.com/image.jpg --output results.txt
    
  4. Process a remote PDF:

    mlx_nougat --input https://example.com/document.pdf --output results.txt
    
  5. Use a different model:

    mlx_nougat --input path/to/your/image.png --model facebook/nougat-base --output results.txt
    
  6. Use a quantized model:

    mlx_nougat --input path/to/your/document.pdf --model mzbac/nougat-small-8bit-mlx
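
The examples above each process a single file. To run the same command over a folder of PDFs, a small wrapper loop is one option. The sketch below is an assumption-laden illustration, not a feature of MLX Nougat: `ocr_dir` and the `NOUGAT_CMD` override are hypothetical names introduced here, and only the documented `--input` and `--output` flags are passed.

```shell
# Run the documented CLI once per PDF in a directory.
# ocr_dir is a hypothetical helper, not part of MLX Nougat.
# Usage: ocr_dir <input_dir> <output_dir>
# NOUGAT_CMD is an assumed override hook (handy for dry runs);
# it defaults to the mlx_nougat command described above.
ocr_dir() {
  in_dir=$1
  out_dir=$2
  cmd=${NOUGAT_CMD:-mlx_nougat}
  mkdir -p "$out_dir" || return 1
  for pdf in "$in_dir"/*.pdf; do
    # With no matches the glob stays literal, so skip missing paths.
    [ -e "$pdf" ] || continue
    name=$(basename "$pdf" .pdf)
    # Write one text file per PDF; report failures and keep going.
    $cmd --input "$pdf" --output "$out_dir/$name.txt" || echo "failed: $pdf" >&2
  done
}
```

For example, `ocr_dir ./papers ./ocr_out` would write `./ocr_out/<name>.txt` for each PDF in `./papers`. Setting `NOUGAT_CMD=echo` first prints the commands' arguments instead of running the model, which is a cheap way to preview what will be processed.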
    

TODOs

  • Support quantized models to improve performance.

Acknowledgements

This project is built upon several open-source projects and research works:

  • Nougat: The original Nougat model developed by Facebook AI Research.
  • faster-nougat: An optimized implementation of Nougat, which inspired this MLX-based version.
  • MLX: The machine learning framework developed by Apple, used for efficient model inference in this project.
  • Transformers: Hugging Face's state-of-the-art natural language processing library.
