MLX Nougat

MLX Nougat is a command-line tool that runs OCR on images and PDFs using the Nougat model.

Installation

  1. Install ImageMagick:

    brew install imagemagick
    
  2. Configure environment variables for ImageMagick:

    Add the following lines to your shell configuration file (e.g., ~/.bashrc, ~/.zshrc):

    export MAGICK_HOME="$(brew --prefix imagemagick)"
    export PATH="$MAGICK_HOME/bin:$PATH"
    export DYLD_LIBRARY_PATH="$MAGICK_HOME/lib:$DYLD_LIBRARY_PATH"
    

    After adding these lines, reload your shell configuration or restart your terminal.

  3. Install MLX Nougat:

    git clone https://github.com/mzbac/mlx-nougat.git
    cd mlx-nougat
    pip install .
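
The three exports from step 2 are easy to get wrong, for example when the shell configuration was edited but not reloaded. Below is a minimal sketch of a sanity check, assuming a bash or zsh session. check_magick_env is a hypothetical helper name, not something mlx-nougat provides, and it only inspects the variables named in step 2.

```shell
# Sketch only: check_magick_env is a hypothetical helper, not part of mlx-nougat.
# Run it in the same shell session that will run mlx_nougat.
check_magick_env() {
  local magick_home="${MAGICK_HOME:-}"
  local rc=0

  # MAGICK_HOME must point at an existing directory.
  if [ -z "$magick_home" ] || [ ! -d "$magick_home" ]; then
    echo "MAGICK_HOME is unset or is not a directory" >&2
    return 1
  fi

  # PATH must contain $MAGICK_HOME/bin as one of its entries.
  case ":${PATH}:" in
    *":${magick_home}/bin:"*) ;;
    *) echo "PATH does not include ${magick_home}/bin" >&2; rc=1 ;;
  esac

  # DYLD_LIBRARY_PATH must contain $MAGICK_HOME/lib as one of its entries.
  case ":${DYLD_LIBRARY_PATH:-}:" in
    *":${magick_home}/lib:"*) ;;
    *) echo "DYLD_LIBRARY_PATH does not include ${magick_home}/lib" >&2; rc=1 ;;
  esac

  return "$rc"
}
```

Calling check_magick_env prints one line per problem to standard error and returns a non-zero status if any of the three settings is missing.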
    

Usage

After installation, you can use MLX Nougat from the command line:

mlx_nougat --input <path_to_image_or_pdf_or_url> [--output <output_file>] [--model <model_name_or_path>]

Arguments

  • --input: (Required) Path to the input image or PDF file, or a URL to an image or PDF.
  • --output: (Optional) Path to save the output text file. If not provided, the output will be printed to the console.
  • --model: (Optional) Name or path of the Nougat model to use. Default is "facebook/nougat-small".

Examples

  1. Process a local image:

    mlx_nougat --input path/to/your/image.png --output results.txt
    
  2. Process a local PDF:

    mlx_nougat --input path/to/your/document.pdf --output results.txt
    
  3. Process a remote image:

    mlx_nougat --input https://example.com/image.jpg --output results.txt
    
  4. Process a remote PDF:

    mlx_nougat --input https://example.com/document.pdf --output results.txt
    
  5. Use a different model:

    mlx_nougat --input path/to/your/image.png --model facebook/nougat-base --output results.txt
    
  6. Use a quantized model:

    mlx_nougat --input path/to/your/document.pdf --model mzbac/nougat-small-8bit-mlx
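
To apply the same command to many files, the single-file examples above can be wrapped in a loop. This is a sketch under assumptions: ocr_dir is a hypothetical helper (not part of mlx-nougat), it is written for bash, and it relies only on the --input, --output, and --model flags documented above. Each PDF is handled by a separate mlx_nougat invocation.

```shell
# Sketch only: ocr_dir is a hypothetical helper, not part of mlx-nougat.
# Usage: ocr_dir <input_dir> <output_dir> [model]
ocr_dir() {
  local in_dir="$1" out_dir="$2" model="${3:-}"
  local pdf name

  mkdir -p "$out_dir" || return 1

  for pdf in "$in_dir"/*.pdf; do
    # In bash an unmatched pattern stays literal; skip it if no PDFs exist.
    [ -e "$pdf" ] || continue

    # Name each output file after its PDF: report.pdf -> report.txt
    name=$(basename "$pdf" .pdf)

    if [ -n "$model" ]; then
      mlx_nougat --input "$pdf" --output "$out_dir/$name.txt" --model "$model"
    else
      mlx_nougat --input "$pdf" --output "$out_dir/$name.txt"
    fi
  done
}
```

For example, with placeholder directory names:

    ocr_dir ./papers ./ocr_out mzbac/nougat-small-8bit-mlx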
    

TODOs

  • Support quantized models to improve performance.

Acknowledgements

This project is built upon several open-source projects and research works:

  • Nougat: The original Nougat model developed by Facebook AI Research.
  • faster-nougat: An optimized implementation of Nougat, which inspired this MLX-based version.
  • MLX: The machine learning framework developed by Apple, used for efficient model inference in this project.
  • Transformers: Hugging Face's state-of-the-art natural language processing library.

Download files

Source Distribution

mlx_nougat-0.1.1.tar.gz (14.3 kB)

Uploaded Source

Built Distribution

mlx_nougat-0.1.1-py3-none-any.whl (15.2 kB)

Uploaded Python 3

File details

Details for the file mlx_nougat-0.1.1.tar.gz.

File metadata

  • Download URL: mlx_nougat-0.1.1.tar.gz
  • Upload date:
  • Size: 14.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.13

File hashes

Hashes for mlx_nougat-0.1.1.tar.gz:

  • SHA256: 6c545bd15a619db1b6bc138822f43bdad57a2497cbd6a2ee81c3328db6d212bf
  • MD5: 3dcfcc9c880fe880d4caf02ea60f69b9
  • BLAKE2b-256: 34d0bcc74326f4ecf025c13f10d8eb225e96865dcc47d8edf3d90d6cf347bb31

File details

Details for the file mlx_nougat-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: mlx_nougat-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 15.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.13

File hashes

Hashes for mlx_nougat-0.1.1-py3-none-any.whl:

  • SHA256: e5a8b4940e9b842df77183d94ce8345a73f91eb2d7d9a833fcab43707e4b2d77
  • MD5: 94530d0d6aa6147802b23f11c408ba51
  • BLAKE2b-256: b5e9295a2ffbc03f8f2f2722422a3230c11899a0f43bc06e415841c800582488
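
The SHA256 digests listed above can be compared against a downloaded file before installing it. The following is a minimal sketch, assuming either sha256sum (common on Linux) or shasum (shipped with macOS) is available. verify_sha256 is a hypothetical helper name introduced here for illustration.

```shell
# Sketch only: verify_sha256 is a hypothetical helper.
# Usage: verify_sha256 <file> <expected_sha256_hex>
verify_sha256() {
  local file="$1" expected="$2" actual

  # Use whichever SHA-256 tool is installed.
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi

  # Succeed only if the computed digest matches the expected one exactly.
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}
```

For the source distribution, with the file in the current directory:

    verify_sha256 mlx_nougat-0.1.1.tar.gz 6c545bd15a619db1b6bc138822f43bdad57a2497cbd6a2ee81c3328db6d212bf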
