DeepSeek OCR Ollama

This is a simple script that uses the Ollama API to extract Markdown text from a PDF or image file with the DeepSeek-OCR model.

Usage

Install the Requirements

To install the necessary requirements, run the following command:

pip install deepseek-ocr-ollama

The script requires a running Ollama installation with the deepseek-ocr model pulled:

ollama pull deepseek-ocr

Typical Usage

deepseek-ocr-ollama paper.pdf
deepseek-ocr-ollama paper.pdf --dpi 200
deepseek-ocr-ollama paper.pdf -o revision
deepseek-ocr-ollama paper.pdf -e
deepseek-ocr-ollama paper.pdf -m FULL
deepseek-ocr-ollama page74.jpg -e
OLLAMA_HOST=http://gauss:11434 deepseek-ocr-ollama receipt.pdf -e
deepseek-ocr-ollama -j paper.json
deepseek-ocr-ollama -j paper.json -m TEXT_NO_PAGES -n
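The commands above can also be driven from Python when several files need processing. The following is a minimal sketch, not part of the package: `build_command`, `run_batch`, and the `SUPPORTED` extension set are hypothetical names chosen for illustration, and only flags listed in this README are used.

```python
from pathlib import Path
import subprocess

# Assumption: an illustrative set of extensions; the README only says "PDF or image file".
SUPPORTED = {".pdf", ".png", ".jpg", ".jpeg"}


def build_command(input_file, dpi=None, mode=None, output=None):
    """Build the argument list for one deepseek-ocr-ollama run."""
    cmd = ["deepseek-ocr-ollama", str(input_file)]
    if dpi is not None:
        cmd += ["--dpi", str(dpi)]
    if mode is not None:
        cmd += ["--mode", str(mode)]
    if output is not None:
        cmd += ["--output", str(output)]
    return cmd


def run_batch(folder, **options):
    """Run the CLI once per supported file in `folder`, stopping on the first failure."""
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in SUPPORTED:
            subprocess.run(build_command(path, **options), check=True)
```

Calling `run_batch("scans", dpi=200, mode="TEXT_NO_PAGES")` would then process every supported file in a `scans` folder with the same settings.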

Arguments

Short	Long	Description
	input	input PDF or image file
-d DPI	--dpi DPI	DPI (dots per inch) setting for the PDF to image conversion. Defaults to 600
-o OUTPUT	--output OUTPUT	output directory path. If not set, a directory will be created in the current working directory using the same stem (filename without extension) as the input file
-j JSON_OCR_RESPONSE	--json-ocr-response JSON_OCR_RESPONSE	path from which to load a pre-existing JSON OCR response (any input file will be ignored)
-m MODE	--mode MODE	mode of operation: either the name or numerical value of the mode. Defaults to FULL_NO_PAGES
-s PAGE_SEPARATOR	--page-separator PAGE_SEPARATOR	page separator to use when writing the Markdown file. Defaults to `\n`
-n	--no-json	do not write the JSON OCR response to a file. By default, the response is written
-e	--load-dot-env	load the .env file from the current directory using `python-dotenv`, to retrieve the Ollama environment variables
-E LOAD_PATH_DOT_ENV	--load-path-dot-env LOAD_PATH_DOT_ENV	load the .env file from the specified path using `python-dotenv`, to retrieve the Ollama environment variables. Defaults to ~/.deepseek_ocr_ollama.env
-M MODEL_NAME	--model-name MODEL_NAME	name of the Ollama model to use for OCR. Defaults to `deepseek-ocr`
-H HINT	--hint HINT	hint to provide to the OCR model to improve recognition accuracy. Ignored if raw prompt is set. The hint is a short instruction that will be mixed in with the main prompt
-R RAW_PROMPT	--raw-prompt RAW_PROMPT	raw prompt to provide to the OCR model, overriding the default prompt. Hint is ignored if this is set
-V VERBOSE	--verbose VERBOSE	verbosity level: 0 = silent, 1 = normal, 2 = debug. Defaults to 1
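To get a feel for the `--dpi` setting, it helps to estimate how large a rasterized page becomes. The sketch below uses only the standard relation between PDF points (72 per inch) and pixels; `rendered_size` is a hypothetical helper, and the exact dimensions the script produces may differ depending on how it rasterizes pages.

```python
POINTS_PER_INCH = 72  # PDF page sizes are expressed in points, 72 per inch


def rendered_size(width_pt, height_pt, dpi):
    """Return the (width, height) in pixels of a page rasterized at `dpi`."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


US_LETTER = (612, 792)  # 8.5 x 11 inches, in points

print(rendered_size(*US_LETTER, 600))  # (5100, 6600)
print(rendered_size(*US_LETTER, 200))  # (1700, 2200)
```

A Letter page at the default 600 DPI holds nine times as many pixels as the same page at 200 DPI, which is one reason to lower the value for large documents.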

Modes

Value	Name
0	FULL
1	FULL_ALT
2	FULL_NO_DIR
3	FULL_NO_PAGES
4	TEXT
5	TEXT_NO_PAGES
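Since `--mode` accepts either the name or the numerical value, the table above maps naturally onto an integer enumeration. This is an illustrative sketch rather than the package's actual code: `Mode` and `parse_mode` are hypothetical names, and the case-insensitive name matching is an assumption.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Mirror of the mode table: value on the right, name on the left."""
    FULL = 0
    FULL_ALT = 1
    FULL_NO_DIR = 2
    FULL_NO_PAGES = 3
    TEXT = 4
    TEXT_NO_PAGES = 5


def parse_mode(value):
    """Accept a name ('TEXT'), a numeric string ('4') or an int (4)."""
    if isinstance(value, int):
        return Mode(value)
    text = str(value).strip()
    if text.isdigit():
        return Mode(int(text))
    # Assumption: names are matched case-insensitively.
    return Mode[text.upper()]
```

An unknown number raises `ValueError` and an unknown name raises `KeyError`, so a caller can report an invalid `--mode` value early.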

Given the input file paper.pdf, the directory structure for each mode is shown below:

0 - `FULL`

Structure

paper
├── full
│   ├── image1.png
│   ├── image2.png
│   ├── image3.png
│   └── paper.md
├── page_0
│   ├── image1.png
│   └── paper.md
├── page_1
│   ├── image2.png
│   └── paper.md
└── page_2
    ├── image3.png
    └── paper.md

1 - `FULL_ALT`

Structure

paper
├── image1.png
├── image2.png
├── image3.png
├── paper.md
├── page_0
│   ├── image1.png
│   └── paper.md
├── page_1
│   ├── image2.png
│   └── paper.md
└── page_2
    ├── image3.png
    └── paper.md

2 - `FULL_NO_DIR`

Structure

paper
├── image1.png
├── image2.png
├── image3.png
├── paper.md
├── paper0.md
├── paper1.md
└── paper2.md

3 - `FULL_NO_PAGES` (default)

Structure

paper
├── image1.png
├── image2.png
├── image3.png
└── paper.md

4 - `TEXT`

Structure

paper
├── paper.md
├── paper0.md
├── paper1.md
└── paper2.md

5 - `TEXT_NO_PAGES`

Structure

paper
└── paper.md
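The trees above follow a regular pattern, which the sketch below encodes for the Markdown files only (image names depend on the document, so they are left out). `expected_markdown_files` is a hypothetical helper derived from the listings in this section, not a function provided by the package.

```python
from pathlib import PurePosixPath


def expected_markdown_files(mode, stem, pages):
    """List the Markdown files the trees above show for `mode`, given `pages` pages."""
    root = PurePosixPath(stem)
    name = f"{stem}.md"
    # Only FULL places the combined document in a "full" subdirectory.
    combined = root / "full" / name if mode == "FULL" else root / name
    files = [combined]
    if mode in ("FULL", "FULL_ALT"):
        # One directory per page, each holding a file named after the input.
        files += [root / f"page_{i}" / name for i in range(pages)]
    elif mode in ("FULL_NO_DIR", "TEXT"):
        # Per-page files sit next to the combined document.
        files += [root / f"{stem}{i}.md" for i in range(pages)]
    elif mode not in ("FULL_NO_PAGES", "TEXT_NO_PAGES"):
        raise ValueError(f"unknown mode: {mode}")
    return [str(path) for path in files]
```

For example, `expected_markdown_files("TEXT", "paper", 3)` returns `paper/paper.md` followed by `paper/paper0.md` through `paper/paper2.md`, matching the `TEXT` tree.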

By default, the JSON response from the DeepSeek-OCR model is saved in the output directory. To disable JSON output, use the -n or --no-json argument. To try a different mode without making additional OCR calls, pass an existing JSON response with -j instead of the original input file.
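As a sketch of that workflow, the snippet below builds one command per mode from a single saved response. The helper name `rerender_commands` and the `out_<mode>` output directory names are assumptions made for illustration; the flags themselves (`-j`, `-m`, `-o`, `-n`) are the ones documented in the Arguments table.

```python
import shlex


def rerender_commands(json_path, modes, write_json=False):
    """Build one deepseek-ocr-ollama command per mode, all reusing `json_path`."""
    commands = []
    for mode in modes:
        cmd = [
            "deepseek-ocr-ollama",
            "-j", str(json_path),
            "-m", mode,
            "-o", f"out_{mode.lower()}",  # hypothetical directory name, one per mode
        ]
        if not write_json:
            cmd.append("-n")  # the response already exists, so skip writing it again
        commands.append(cmd)
    return commands


for cmd in rerender_commands("paper.json", ["TEXT", "FULL_NO_DIR"]):
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argument lists can be passed to `subprocess.run` directly.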

Ollama's Environment Variables

The connection to the Ollama server can be configured with the environment variables read by the Ollama Python library:

OLLAMA_HOST : Ollama server host
OLLAMA_API_KEY : Used as Bearer authorization token

To avoid passing -e on every run, you can create a .env file at $HOME/.deepseek_ocr_ollama.env (where $HOME is your home directory). It is then loaded automatically when the script starts.

For example, for a user called vavilov, the path would look like this:

Linux

/home/vavilov/.deepseek_ocr_ollama.env

macOS

/Users/vavilov/.deepseek_ocr_ollama.env

Windows

C:\Users\vavilov\.deepseek_ocr_ollama.env

and the content will be something like this:

OLLAMA_HOST=http://gauss:11434
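The per-user file can also be created programmatically. This is a minimal sketch using only the standard library; `default_env_path`, `env_file_content`, and `write_env_file` are hypothetical helpers, and the host value is the example one from this section.

```python
from pathlib import Path

ENV_FILE_NAME = ".deepseek_ocr_ollama.env"


def default_env_path(home=None):
    """Return the per-user .env location; `home` defaults to the current user's home."""
    base = Path(home) if home is not None else Path.home()
    return base / ENV_FILE_NAME


def env_file_content(host, api_key=None):
    """Render the KEY=value lines for the two variables described above."""
    lines = [f"OLLAMA_HOST={host}"]
    if api_key:
        lines.append(f"OLLAMA_API_KEY={api_key}")
    return "\n".join(lines) + "\n"


def write_env_file(host, api_key=None, home=None):
    """Write the .env file and return its path."""
    path = default_env_path(home)
    path.write_text(env_file_content(host, api_key), encoding="utf-8")
    return path
```

Calling `write_env_file("http://gauss:11434")` writes the single `OLLAMA_HOST` line shown above to the current user's home directory on any of the three platforms.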

Project details

Release history

Version	Date
1.3 (this version)	Dec 30, 2025
1.2	Dec 30, 2025
1.1	Dec 30, 2025
1.0	Dec 29, 2025

Download files

Source Distribution

deepseek_ocr_ollama-1.3.tar.gz (10.4 kB)

Uploaded Dec 30, 2025 Source

Built Distribution

deepseek_ocr_ollama-1.3-py3-none-any.whl (10.3 kB)

Uploaded Dec 30, 2025 Python 3

File details

Details for the file deepseek_ocr_ollama-1.3.tar.gz.

File metadata

Download URL: deepseek_ocr_ollama-1.3.tar.gz
Upload date: Dec 30, 2025
Size: 10.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for deepseek_ocr_ollama-1.3.tar.gz
Algorithm	Hash digest
SHA256	`657e6fa923d9ada4dbc8b674a65d9b9228480f7572b970516733078318457858`
MD5	`fb12ce36eae7c21db60db5183d6eac47`
BLAKE2b-256	`3d68b13bae0f69daaee2b2b5b86b3368f3c66e6b40ebb766e6e93564b94186a5`

File details

Details for the file deepseek_ocr_ollama-1.3-py3-none-any.whl.

File metadata

Download URL: deepseek_ocr_ollama-1.3-py3-none-any.whl
Upload date: Dec 30, 2025
Size: 10.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for deepseek_ocr_ollama-1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c64a52bbc25e1158a667f7a5dd85940501b7dd6dd3159ad8d4b5aac39b8a30a8`
MD5	`7bd2ed4c7ff96b9c54714d3930ef587c`
BLAKE2b-256	`cdecdfa64e55aa339783fa8b013ca2d76040ac739778df292f528d018661bc71`
