Modular version of the Docling package: SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications.

Docling Slim

Lightweight SDK for parsing documents with minimal dependencies and opt-in extras

Docling Slim is a minimal-dependency version of Docling that lets you install only the components you need. It provides the core document processing functionality with about 50 MB of base dependencies, and you add specific features through optional extras.

When to Use Docling Slim

  • Use docling (recommended): If you want the full-featured experience with all standard capabilities
  • Use docling-slim: If you need fine-grained control over dependencies or want to minimize installation size

For Most Users: Use the Main Docling Package

We recommend most users install the full-featured docling package instead:

pip install docling

The docling package includes all standard features and the CLI tools, and it is the easiest way to get started. Visit the main Docling documentation for complete guides and examples.

Installation

With Specific Features

# Quote the requirement so shells such as zsh do not expand the square brackets

# PDF support with local models
pip install "docling-slim[format-pdf,models-local]"

# Office formats only
pip install "docling-slim[format-office]"

# PDF + CLI
pip install "docling-slim[format-pdf,cli]"

# Docling service client for using the Docling Serve API
pip install "docling-slim[service-client]"
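
After installing, it can be useful to confirm which extras the installed distribution declares. The following is a minimal sketch using only the standard library; it assumes the distribution is registered under the name docling-slim, and the helper name declared_extras is hypothetical. It reports what the package metadata declares, not which optional dependencies are actually present.

```python
from importlib.metadata import PackageNotFoundError, metadata


def declared_extras(dist_name: str) -> list[str]:
    """Return the extras declared by an installed distribution, sorted.

    Returns an empty list when the distribution is not installed.
    """
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        return []
    # "Provides-Extra" appears once per extra in the core metadata.
    return sorted(meta.get_all("Provides-Extra") or [])


# Example call (assumed distribution name):
#   declared_extras("docling-slim")
```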

Available Extras

Convenience Bundles

Extra | Description | Use Case
standard | All standard features (same as docling package) | Full-featured usage
all | All available extras | Complete installation

CLI

Extra | Description | Use Case
cli | Command-line interface (typer, rich) | CLI tools (docling, docling-tools)

Core Components

Extra | Description | Use Case
convert-core | Core conversion components (numpy, pillow, scipy) | Basic document conversion
extract-core | Structured information extraction | Data extraction from documents

Format Support

PDF Formats

Extra | Description | Use Case
format-pdf | PDF parsing (pypdfium2 + docling-parse) | PDF documents
format-pdf-pypdfium2 | PDF rendering only | Lightweight PDF support
format-pdf-docling | Advanced PDF parsing | Complex PDF layouts

Office Formats (office = docx + pptx + xlsx)

Extra | Description | Use Case
format-office | All Office formats | Microsoft Office documents
format-docx | Microsoft Word documents | .docx files
format-pptx | Microsoft PowerPoint | .pptx files
format-xlsx | Microsoft Excel | .xlsx files

Web Formats (web = html + markdown)

Extra | Description | Use Case
format-web | HTML and Markdown | Web content
format-html | HTML parsing | Web pages and HTML files
format-markdown | Markdown parsing | .md files

Other Formats

Extra | Description | Use Case
format-latex | LaTeX documents | .tex files
format-xml-xbrl | XBRL financial reports | Financial documents
format-html-render | HTML rendering with Playwright | Dynamic web content
format-audio | Audio transcription (Whisper) | .wav, .mp3 files
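
The format extras above map fairly directly onto file types, so the install requirement for a given set of input files can be derived mechanically. The sketch below is illustrative only: the helper requirement_for and the table EXTRA_BY_SUFFIX are hypothetical names, the suffix mapping is read off the tables in this section (with .pdf and .html assumed as the obvious suffixes), and unknown suffixes are simply ignored.

```python
from pathlib import Path

# Suffix-to-extra mapping taken from the format tables above (an assumption
# for .pdf and .html, which the tables describe without naming a suffix).
EXTRA_BY_SUFFIX = {
    ".pdf": "format-pdf",
    ".docx": "format-docx",
    ".pptx": "format-pptx",
    ".xlsx": "format-xlsx",
    ".html": "format-html",
    ".md": "format-markdown",
    ".tex": "format-latex",
    ".wav": "format-audio",
    ".mp3": "format-audio",
}


def requirement_for(paths: list[str], package: str = "docling-slim") -> str:
    """Build a pip requirement string covering the file types in `paths`."""
    extras = set()
    for p in paths:
        suffix = Path(p).suffix.lower()
        if suffix in EXTRA_BY_SUFFIX:
            extras.add(EXTRA_BY_SUFFIX[suffix])
    if not extras:
        return package
    # Sort for a stable, reproducible requirement string.
    return f"{package}[{','.join(sorted(extras))}]"


# Example: requirement_for(["report.pdf", "slides.pptx"])
```

The resulting string can be passed, quoted, to pip install.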

OCR Engines

Extra | Description | Use Case
feat-ocr-rapidocr | RapidOCR (lightweight) | Fast OCR
feat-ocr-rapidocr-onnx | RapidOCR with ONNX runtime | Optimized OCR
feat-ocr-easyocr | EasyOCR | Multi-language OCR
feat-ocr-tesserocr | Tesseract OCR | High-accuracy OCR
feat-ocr-mac | macOS native OCR | macOS only
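
Because each OCR engine is opt-in, code that runs in several environments may want to check which backends are importable before choosing one. A minimal sketch with the standard library follows. The helper name importable is hypothetical, and the module names in the example comment (easyocr, tesserocr) are assumptions about what the corresponding extras install; they are not taken from this page.

```python
from importlib.util import find_spec


def importable(modules: list[str]) -> dict[str, bool]:
    """Report, for each top-level module name, whether it can be imported.

    find_spec only locates the module; it does not execute it.
    """
    return {name: find_spec(name) is not None for name in modules}


# Example (module names are assumptions, not confirmed by this page):
#   importable(["easyocr", "tesserocr"])
```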

Models

Extra | Description | Use Case
models-local | Local PyTorch models | GPU/CPU inference
models-remote | Remote model serving (Triton) | Production deployments
models-onnxruntime | ONNX Runtime acceleration | Optimized inference
models-vlm-inline | Vision Language Models | Image understanding, inline processing

Other features

Extra | Description | Use Case
feat-chunking | Document chunking | RAG applications
service-client | Docling service client | Remote processing

License

MIT License - See LICENSE

Download files

Source Distribution

docling_slim-2.92.0.tar.gz (387.4 kB view details)

Uploaded Source

Built Distribution

docling_slim-2.92.0-py3-none-any.whl (503.5 kB view details)

Uploaded Python 3
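
The wheel file name above encodes the distribution, version, and compatibility tags. As a rough illustration, the sketch below splits such a name into its fields following the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The names WheelName and parse_wheel_name are hypothetical, and the parser does no validation beyond counting fields.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between version and Python tag.
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


# Example: parse_wheel_name("docling_slim-2.92.0-py3-none-any.whl")
```

For this release the tags are py3, none, and any, which indicates a pure-Python wheel with no platform restriction.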

File details

Details for the file docling_slim-2.92.0.tar.gz.

File metadata

  • Download URL: docling_slim-2.92.0.tar.gz
  • Upload date:
  • Size: 387.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for docling_slim-2.92.0.tar.gz
Algorithm | Hash digest
SHA256 | f54a2159a46cf00f4738888594c5a81372048d8a0d1a15dd279a7390641a04fa
MD5 | cf1ac6ad6ae070ab10523ba4ac03e587
BLAKE2b-256 | 3c45ff4d565ccf69694b2ab3b0fcc7cd403243b8650c735bb1a42a48ba82bcaf
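
A downloaded archive can be checked against the SHA256 digest listed above before it is installed. The following is a minimal sketch using hashlib; the helper names sha256_of and matches_sha256 are hypothetical, and the file path in the example comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example (assumes the file is in the current directory):
#   matches_sha256(
#       "docling_slim-2.92.0.tar.gz",
#       "f54a2159a46cf00f4738888594c5a81372048d8a0d1a15dd279a7390641a04fa",
#   )
```

pip can perform the same check itself when requirements are pinned with --hash entries.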

Provenance

The following attestation bundles were made for docling_slim-2.92.0.tar.gz:

Publisher: pypi.yml on docling-project/docling

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file docling_slim-2.92.0-py3-none-any.whl.

File metadata

  • Download URL: docling_slim-2.92.0-py3-none-any.whl
  • Upload date:
  • Size: 503.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for docling_slim-2.92.0-py3-none-any.whl
Algorithm | Hash digest
SHA256 | e6b7ea5b955c5c47e9bc36f828268b2b78c32f544842036d43d529746783e380
MD5 | 74e9679c2c3f6d48d7e52772f6055fbc
BLAKE2b-256 | 9112bdf2579d85e43e5944a146c1c4eab708103784a043fb12d793b524c1cd02

Provenance

The following attestation bundles were made for docling_slim-2.92.0-py3-none-any.whl:

Publisher: pypi.yml on docling-project/docling

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
