Modular version of the Docling package: SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications.
Docling Slim
Lightweight SDK for parsing documents with minimal dependencies and opt-in extras
Docling Slim is a minimal-dependency version of Docling that allows you to install only the components you need. It provides the core document processing functionality with ~50MB of base dependencies, and you can add specific features through optional extras.
When to Use Docling Slim
- Use docling (recommended): if you want the full-featured experience with all standard capabilities.
- Use docling-slim: if you need fine-grained control over dependencies or want to minimize installation size.
For Most Users: Use the Main Docling Package
We recommend most users install the full-featured docling package instead:
pip install docling
The docling package includes all standard features, the CLI tools, and is the easiest way to get started. Visit the main Docling documentation for complete guides and examples.
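As a rough illustration of what getting started looks like once the package is installed, the sketch below converts a single document to Markdown. It assumes the `DocumentConverter` class from `docling.document_converter`, as described in the main Docling documentation; the helper name `convert_to_markdown` and the example file name are hypothetical, and the formats you can actually convert depend on which components are installed.

```python
def convert_to_markdown(source: str) -> str:
    """Convert a local file path or URL to Markdown text using Docling.

    The import is done lazily so that this module can still be loaded
    on a machine where docling is not installed.
    """
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    # The converted document can be exported to several formats;
    # Markdown is used here as a simple, readable target.
    return result.document.export_to_markdown()


# Hypothetical usage (requires docling and a real input file):
#     print(convert_to_markdown("report.pdf"))
```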
Installation
With Specific Features
# PDF support with local models
pip install "docling-slim[format-pdf,models-local]"
# Office formats only
pip install "docling-slim[format-office]"
# PDF + CLI
pip install "docling-slim[format-pdf,cli]"
# Docling service client for using the Docling Serve API
pip install "docling-slim[service-client]"
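When installation is scripted, the requirement string can be built from a list of extras instead of being typed by hand. The following is a minimal sketch, not part of Docling itself: the helper names `requirement_spec` and `pip_command` are hypothetical, and the name check is only a loose approximation of what pip accepts for extra names.

```python
import re
import sys

# Loose check for extra names: letters, digits, '.', '_' and '-',
# starting and ending with a letter or digit.
_EXTRA_NAME = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$")


def requirement_spec(extras, package="docling-slim"):
    """Return a requirement string such as 'docling-slim[cli,format-pdf]'."""
    unique = sorted(set(extras))
    for name in unique:
        if not _EXTRA_NAME.match(name):
            raise ValueError(f"invalid extra name: {name!r}")
    if not unique:
        return package
    return f"{package}[{','.join(unique)}]"


def pip_command(extras):
    """Build an argv list for pip; no shell is involved, so no quoting is needed."""
    return [sys.executable, "-m", "pip", "install", requirement_spec(extras)]


print(requirement_spec(["format-pdf", "cli"]))
```

Passing the result of `pip_command` to `subprocess.run` as a list avoids the shell entirely, which sidesteps the bracket-globbing problem some shells have with unquoted extras.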
Available Extras
Convenience Bundles
| Extra | Description | Use Case |
|---|---|---|
| standard | All standard features (same as docling package) | Full-featured usage |
| all | All available extras | Complete installation |
CLI
| Extra | Description | Use Case |
|---|---|---|
| cli | Command-line interface (typer, rich) | CLI tools (docling, docling-tools) |
Core Components
| Extra | Description | Use Case |
|---|---|---|
| convert-core | Core conversion components (numpy, pillow, scipy) | Basic document conversion |
| extract-core | Structured information extraction | Data extraction from documents |
Format Support
PDF Formats
| Extra | Description | Use Case |
|---|---|---|
| format-pdf | PDF parsing (pypdfium2 + docling-parse) | PDF documents |
| format-pdf-pypdfium2 | PDF rendering only | Lightweight PDF support |
| format-pdf-docling | Advanced PDF parsing | Complex PDF layouts |
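Because each extra pulls in specific third-party libraries, a quick way to see which extras are usable in the current environment is to probe for those libraries. The sketch below does this with the standard library only. The mapping is an assumption derived from the libraries named in the tables above; in particular, the import names `PIL` (for pillow) and `docling_parse` (for docling-parse) are assumed, and the helper names are hypothetical.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to top-level import names.
EXTRA_MODULES = {
    "cli": ["typer", "rich"],
    "convert-core": ["numpy", "PIL", "scipy"],
    "format-pdf": ["pypdfium2", "docling_parse"],
}


def missing_modules(modules):
    """Return the top-level modules from `modules` that cannot be found."""
    return [name for name in modules if find_spec(name) is None]


def extras_status(mapping=EXTRA_MODULES):
    """Map each extra to the list of its modules that are not importable."""
    return {extra: missing_modules(mods) for extra, mods in mapping.items()}


for extra, missing in extras_status().items():
    state = "ok" if not missing else "missing " + ", ".join(missing)
    print(f"{extra}: {state}")
```

An empty list for an extra means all of its assumed modules were found; it does not guarantee that the installed versions match what Docling requires.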
Office Formats (office = docx + pptx + xlsx)
| Extra | Description | Use Case |
|---|---|---|
| format-office | All Office formats | Microsoft Office documents |
| format-docx | Microsoft Word documents | .docx files |
| format-pptx | Microsoft PowerPoint | .pptx files |
| format-xlsx | Microsoft Excel | .xlsx files |
Web Formats (web = html + markdown)
| Extra | Description | Use Case |
|---|---|---|
| format-web | HTML and Markdown | Web content |
| format-html | HTML parsing | Web pages and HTML files |
| format-markdown | Markdown parsing | .md files |
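The two bundle definitions given in the headings above (office = docx + pptx + xlsx, web = html + markdown) can be modeled as a simple lookup when you want to reason about which individual format extras a selection covers. This is an illustrative sketch only: `BUNDLES` and `expand_extras` are hypothetical names, and the package's own metadata remains the authoritative definition of each bundle.

```python
# Bundle composition as stated in the section headings above.
BUNDLES = {
    "format-office": ["format-docx", "format-pptx", "format-xlsx"],
    "format-web": ["format-html", "format-markdown"],
}


def expand_extras(extras):
    """Replace bundle extras with their components, keeping order and dropping duplicates."""
    expanded = []
    for name in extras:
        # Non-bundle extras expand to themselves.
        for component in BUNDLES.get(name, [name]):
            if component not in expanded:
                expanded.append(component)
    return expanded


print(expand_extras(["format-office", "format-docx", "cli"]))
```

This makes it easy to spot redundant selections, such as requesting both format-office and format-docx.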
Other Formats
| Extra | Description | Use Case |
|---|---|---|
| format-latex | LaTeX documents | .tex files |
| format-xml-xbrl | XBRL financial reports | Financial documents |
| format-html-render | HTML rendering with Playwright | Dynamic web content |
| format-audio | Audio transcription (Whisper) | .wav, .mp3 files |
OCR Engines
| Extra | Description | Use Case |
|---|---|---|
| feat-ocr-rapidocr | RapidOCR (lightweight) | Fast OCR |
| feat-ocr-rapidocr-onnx | RapidOCR with ONNX runtime | Optimized OCR |
| feat-ocr-easyocr | EasyOCR | Multi-language OCR |
| feat-ocr-tesserocr | Tesseract OCR | High-accuracy OCR |
| feat-ocr-mac | macOS native OCR | macOS only |
Models
| Extra | Description | Use Case |
|---|---|---|
| models-local | Local PyTorch models | GPU/CPU inference |
| models-remote | Remote model serving (Triton) | Production deployments |
| models-onnxruntime | ONNX Runtime acceleration | Optimized inference |
| models-vlm-inline | Vision Language Models | Image understanding, inline processing |
Other Features
| Extra | Description | Use Case |
|---|---|---|
| feat-chunking | Document chunking | RAG applications |
| service-client | Docling service client | Remote processing |
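The tables above reflect one release, and the set of extras may change between versions. For an installed copy, the declared extras can be read from the distribution metadata with the standard library. The sketch below is a minimal example; the function name `declared_extras` is hypothetical, and it assumes the distribution is registered under the name docling-slim.

```python
from importlib.metadata import PackageNotFoundError, metadata


def declared_extras(distribution="docling-slim"):
    """Return the sorted extras declared by an installed distribution.

    Returns an empty list if the distribution is not installed or
    declares no extras.
    """
    try:
        meta = metadata(distribution)
    except PackageNotFoundError:
        return []
    # Each declared extra appears as a 'Provides-Extra' metadata field.
    return sorted(meta.get_all("Provides-Extra") or [])


for extra in declared_extras():
    print(extra)
```

If nothing is printed, the package is either not installed in the active environment or its metadata lists no extras.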
License
MIT License - See LICENSE