Mistral AI OCR

A simple script that uses the Mistral AI OCR API to extract Markdown text from a PDF or image file.

Usage

Install the Requirements

To install the necessary requirements, run the following command:

pip install mistral-ai-ocr

Typical Usage

mistral-ai-ocr paper.pdf
mistral-ai-ocr paper.pdf --dpi 200
mistral-ai-ocr paper.pdf --api-key jrWjJE5lFketfB2sA6vvhQK2SoHQ6R39
mistral-ai-ocr paper.pdf -o revision
mistral-ai-ocr paper.pdf -e
mistral-ai-ocr paper.pdf -m FULL
mistral-ai-ocr page74.jpg -e
mistral-ai-ocr -j paper.json
mistral-ai-ocr -j paper.json -m TEXT_NO_PAGES -n

Arguments

| Argument | Description |
| --- | --- |
| input | PDF or image file |
| -d DPI, --dpi DPI | DPI (dots per inch) setting for the PDF to image conversion. Defaults to 600 |
| -k API_KEY, --api-key API_KEY | Mistral API key, can be set via the MISTRAL_API_KEY environment variable |
| -o OUTPUT, --output OUTPUT | Output directory path. If not set, a directory will be created in the current working directory using the same stem (filename without extension) as the input file |
| -j JSON_OCR_RESPONSE, --json-ocr-response JSON_OCR_RESPONSE | Path from which to load a pre-existing JSON OCR response (any input file will be ignored) |
| -m MODE, --mode MODE | Mode of operation: either the name or numerical value of the mode. Defaults to FULL_NO_PAGES |
| -s PAGE_SEPARATOR, --page-separator PAGE_SEPARATOR | Page separator to use when writing the Markdown file. Defaults to \n |
| -n, --no-json | Do not write the JSON OCR response to a file. By default, the response is written |
| -e, --load-dot-env | Load the .env file from the current directory using python-dotenv, to retrieve the Mistral API key |
| -E, --load-path-dot-env | Load the .env file from the specified path using python-dotenv, to retrieve the Mistral API key. Defaults to ~/.mistral_ai_ocr.env |
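
The default output location described for -o can be sketched in a few lines of Python. This is only an illustration of the documented rule (current working directory plus the input file's stem); the function name `default_output_dir` is hypothetical and is not taken from the package's source.

```python
from pathlib import Path


def default_output_dir(input_file: str, cwd: str = ".") -> Path:
    """Hypothetical helper: <cwd>/<stem of input_file>, as described for -o."""
    # Path.stem drops the directory part and the final extension:
    # "docs/paper.pdf" -> "paper", "page74.jpg" -> "page74".
    return Path(cwd) / Path(input_file).stem
```

Under this reading, running the tool on docs/paper.pdf from /tmp would write to /tmp/paper unless -o is given.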

Modes

| Value | Name |
| --- | --- |
| 0 | FULL |
| 1 | FULL_ALT |
| 2 | FULL_NO_DIR |
| 3 | FULL_NO_PAGES |
| 4 | TEXT |
| 5 | TEXT_NO_PAGES |
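
Because -m accepts either the name or the numerical value, the table above maps naturally onto an integer enum. The following is a minimal sketch of that idea, not the package's actual code: the `Mode` class and `parse_mode` function are hypothetical names, and accepting lowercase names is an assumption made here for convenience.

```python
from enum import IntEnum


class Mode(IntEnum):
    # Values and names copied from the Modes table.
    FULL = 0
    FULL_ALT = 1
    FULL_NO_DIR = 2
    FULL_NO_PAGES = 3
    TEXT = 4
    TEXT_NO_PAGES = 5


def parse_mode(value: str) -> Mode:
    """Hypothetical parser: accept a mode name ("TEXT") or its number ("4")."""
    text = value.strip()
    if text.isdigit():
        # Mode(...) raises ValueError for numbers outside 0..5.
        return Mode(int(text))
    try:
        # Assumption: names are matched case-insensitively.
        return Mode[text.upper()]
    except KeyError:
        raise ValueError(f"unknown mode: {value!r}") from None
```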

Given the input file paper.pdf, the directory structure for each mode is shown below:

0 - FULL

Structure

paper
├── full
│   ├── image1.png
│   ├── image2.png
│   ├── image3.png
│   └── paper.md
├── page_0
│   ├── image1.png
│   └── paper.md
├── page_1
│   ├── image2.png
│   └── paper.md
└── page_2
    ├── image3.png
    └── paper.md

1 - FULL_ALT

Structure

paper
├── image1.png
├── image2.png
├── image3.png
├── paper.md
├── page_0
│   ├── image1.png
│   └── paper.md
├── page_1
│   ├── image2.png
│   └── paper.md
└── page_2
    ├── image3.png
    └── paper.md

2 - FULL_NO_DIR

Structure

paper
├── image1.png
├── image2.png
├── image3.png
├── paper.md
├── paper0.md
├── paper1.md
└── paper2.md

3 - FULL_NO_PAGES (default)

Structure

paper
├── image1.png
├── image2.png
├── image3.png
└── paper.md

4 - TEXT

Structure

paper
├── paper.md
├── paper0.md
├── paper1.md
└── paper2.md

5 - TEXT_NO_PAGES

Structure

paper
└── paper.md
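
The trees above can be condensed into one function that lists the Markdown files each mode is expected to produce. This is a sketch derived only from the example structures shown for paper.pdf; the function name `markdown_files` is hypothetical, images are left out, and the output directory is assumed to be the default (same name as the stem).

```python
from pathlib import PurePosixPath


def markdown_files(mode: str, stem: str, pages: int) -> list[str]:
    """Hypothetical helper: Markdown paths per mode, read off the trees above."""
    root = PurePosixPath(stem)  # default output directory shares the stem
    if mode == "FULL":
        # Combined file lives in full/, each page in its own page_<i>/ directory.
        paths = [root / "full" / f"{stem}.md"]
        paths += [root / f"page_{i}" / f"{stem}.md" for i in range(pages)]
    elif mode == "FULL_ALT":
        # Combined file at the top level, pages still in page_<i>/ directories.
        paths = [root / f"{stem}.md"]
        paths += [root / f"page_{i}" / f"{stem}.md" for i in range(pages)]
    elif mode in ("FULL_NO_DIR", "TEXT"):
        # Everything at the top level; per-page files are <stem><i>.md.
        paths = [root / f"{stem}.md"]
        paths += [root / f"{stem}{i}.md" for i in range(pages)]
    elif mode in ("FULL_NO_PAGES", "TEXT_NO_PAGES"):
        # Only the combined file.
        paths = [root / f"{stem}.md"]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [str(p) for p in paths]
```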

By default, the JSON response from the Mistral AI OCR API is saved in the output directory. To disable JSON output, use the -n or --no-json argument. To try a different mode without making additional API calls, pass an existing JSON response with -j instead of the original input file.
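
Reusing a saved response makes it cheap to compare modes side by side. The sketch below builds one command per mode using only the -j, -m, -o and -n flags from the Arguments table; the function names and the `out_<mode>` directory naming are hypothetical choices for this example.

```python
import subprocess


def rerun_command(json_path: str, mode: str, output_dir: str) -> list[str]:
    """Build a mistral-ai-ocr command that reuses a saved JSON response."""
    # -j loads the saved response, -m picks the mode, -o sets the output
    # directory, and -n skips writing the JSON response again.
    return ["mistral-ai-ocr", "-j", json_path, "-m", mode, "-o", output_dir, "-n"]


def rerun_all(json_path: str, modes: list[str]) -> None:
    """Run the command once per mode, each into its own directory."""
    for mode in modes:
        # Hypothetical naming scheme: out_full, out_text, ...
        command = rerun_command(json_path, mode, f"out_{mode.lower()}")
        subprocess.run(command, check=True)
```

Calling `rerun_all("paper.json", ["FULL", "TEXT"])` would then produce out_full and out_text, assuming mistral-ai-ocr is installed and on the PATH.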

Mistral AI API Key

To obtain an API key, you need a Mistral AI account. Then visit https://admin.mistral.ai/organization/api-keys and click the Create new key button.

To avoid passing -e every time, create the .env file at $HOME/.mistral_ai_ocr.env (where $HOME is your home directory). The script then loads it automatically at startup.

For example, for a user called vavilov, the path would look like this:

  • Linux

    /home/vavilov/.mistral_ai_ocr.env  
    
  • macOS

    /Users/vavilov/.mistral_ai_ocr.env  
    
  • Windows

    C:\Users\vavilov\.mistral_ai_ocr.env  
    

The content of the file will look something like this:

MISTRAL_API_KEY=jrWjJE5lFketfB2sA6vvhQK2SoHQ6R39
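
To check that the file is in the expected place and contains the key, a small standard-library script is enough. Note that the tool itself relies on python-dotenv; the simplified KEY=VALUE parser below is only a stand-in for verification, and the names `default_env_path` and `read_api_key` are hypothetical.

```python
from pathlib import Path


def default_env_path(home=None) -> Path:
    """Path of the automatically loaded file: <home>/.mistral_ai_ocr.env."""
    base = Path(home) if home else Path.home()
    return base / ".mistral_ai_ocr.env"


def read_api_key(env_file):
    """Return the MISTRAL_API_KEY value from a KEY=VALUE file, or None."""
    path = Path(env_file)
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "MISTRAL_API_KEY":
            # Drop optional surrounding quotes.
            return value.strip().strip("'\"")
    return None
```

For instance, `read_api_key(default_env_path())` returns None when the file is missing or does not define the key.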
