Modular version of the Docling package: SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications.

Docling Slim

Lightweight SDK for parsing documents with minimal dependencies and opt-in extras

Docling Slim is a minimal-dependency version of Docling that allows you to install only the components you need. It provides the core document processing functionality with ~50MB of base dependencies, and you can add specific features through optional extras.

When to Use Docling Slim

  • Use docling (recommended): If you want the full-featured experience with all standard capabilities
  • Use docling-slim: If you need fine-grained control over dependencies or want to minimize installation size

For Most Users: Use the Main Docling Package

We recommend most users install the full-featured docling package instead:

pip install docling

The docling package includes all standard features, the CLI tools, and is the easiest way to get started. Visit the main Docling documentation for complete guides and examples.

Installation

With Specific Features

# PDF support with local models
pip install docling-slim[format-pdf,models-local]

# Office formats only
pip install docling-slim[format-office]

# PDF + CLI
pip install docling-slim[format-pdf,cli]

# Docling service client for using the Docling Serve API
pip install docling-slim[service-client]
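The install commands above all follow the same pattern: the package name followed by a comma-separated list of extras in square brackets. A minimal Python sketch of that pattern follows. The helper names `requirement` and `pip_command` are hypothetical and not part of Docling; the extra names come from this README.

```python
import sys


def requirement(package: str, extras: list[str]) -> str:
    """Build a requirement string such as 'docling-slim[format-pdf,cli]'."""
    if not extras:
        return package
    return f"{package}[{','.join(extras)}]"


def pip_command(extras: list[str]) -> list[str]:
    """Return an argument list that installs docling-slim with the given extras.

    The list is meant for subprocess.run(); because no shell parses it,
    the square brackets need no quoting.
    """
    return [sys.executable, "-m", "pip", "install", requirement("docling-slim", extras)]


# Example: the "PDF + CLI" combination shown above.
print(" ".join(pip_command(["format-pdf", "cli"])))
```

When typing the commands by hand in a shell that treats square brackets as glob characters (zsh, for example), quoting the requirement avoids a "no matches found" error: pip install "docling-slim[format-pdf,cli]".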

Available Extras

Convenience Bundles

| Extra | Description | Use Case |
|---|---|---|
| standard | All standard features (same as docling package) | Full-featured usage |
| all | All available extras | Complete installation |

CLI

| Extra | Description | Use Case |
|---|---|---|
| cli | Command-line interface (typer, rich) | CLI tools (docling, docling-tools) |

Core Components

| Extra | Description | Use Case |
|---|---|---|
| convert-core | Core conversion components (numpy, pillow, scipy) | Basic document conversion |
| extract-core | Structured information extraction | Data extraction from documents |

Format Support

PDF Formats

| Extra | Description | Use Case |
|---|---|---|
| format-pdf | PDF parsing (pypdfium2 + docling-parse) | PDF documents |
| format-pdf-pypdfium2 | PDF rendering only | Lightweight PDF support |
| format-pdf-docling | Advanced PDF parsing | Complex PDF layouts |
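Several of the extras listed so far name the third-party packages they pull in (typer and rich for cli; numpy, pillow, and scipy for convert-core; pypdfium2 and docling-parse for format-pdf). A rough way to see which of them are present in the current environment is to probe for their import names. This is only a sketch: the mapping below is hand-written from the tables, the import names `PIL` and `docling_parse` are assumptions about how those packages are imported, and the helper `missing` is hypothetical rather than part of Docling.

```python
from importlib.util import find_spec

# Extra -> top-level module names to probe. Package names are taken from the
# tables above; the import names (PIL for pillow, docling_parse for
# docling-parse) are assumptions.
EXTRA_MODULES = {
    "cli": ["typer", "rich"],
    "convert-core": ["numpy", "PIL", "scipy"],
    "format-pdf": ["pypdfium2", "docling_parse"],
}


def missing(module_names: list[str]) -> list[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


for extra, modules in EXTRA_MODULES.items():
    absent = missing(modules)
    status = "ok" if not absent else "missing: " + ", ".join(absent)
    print(f"{extra}: {status}")
```

The check only looks for importable modules; it does not tell you which extra was actually requested at install time.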

Office Formats (office = docx + pptx + xlsx)

| Extra | Description | Use Case |
|---|---|---|
| format-office | All Office formats | Microsoft Office documents |
| format-docx | Microsoft Word documents | .docx files |
| format-pptx | Microsoft PowerPoint | .pptx files |
| format-xlsx | Microsoft Excel | .xlsx files |

Web Formats (web = html + markdown)

| Extra | Description | Use Case |
|---|---|---|
| format-web | HTML and Markdown | Web content |
| format-html | HTML parsing | Web pages and HTML files |
| format-markdown | Markdown parsing | .md files |
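The Office and Web headings describe two bundles: office = docx + pptx + xlsx and web = html + markdown. The sketch below spells out that expansion so that a requested list of extras can be reduced to individual formats without duplicates. The `BUNDLES` mapping is copied from those headings by hand and the `expand` helper is hypothetical; the package's own metadata remains the authority on what each bundle installs.

```python
# Bundle -> member extras, as stated in the section headings above.
BUNDLES = {
    "format-office": ["format-docx", "format-pptx", "format-xlsx"],
    "format-web": ["format-html", "format-markdown"],
}


def expand(extras: list[str]) -> list[str]:
    """Replace bundle extras by their members, keeping first-seen order."""
    result: list[str] = []
    for extra in extras:
        # Non-bundle extras expand to themselves.
        for member in BUNDLES.get(extra, [extra]):
            if member not in result:
                result.append(member)
    return result


print(expand(["format-office", "format-docx", "format-web"]))
```

Under this mapping, requesting format-office together with format-docx is redundant, since the bundle already covers .docx files.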

Other Formats

| Extra | Description | Use Case |
|---|---|---|
| format-latex | LaTeX documents | .tex files |
| format-xml-xbrl | XBRL financial reports | Financial documents |
| format-html-render | HTML rendering with Playwright | Dynamic web content |
| format-audio | Audio transcription (Whisper) | .wav, .mp3 files |

OCR Engines

| Extra | Description | Use Case |
|---|---|---|
| feat-ocr-rapidocr | RapidOCR (lightweight) | Fast OCR |
| feat-ocr-rapidocr-onnx | RapidOCR with ONNX runtime | Optimized OCR |
| feat-ocr-easyocr | EasyOCR | Multi-language OCR |
| feat-ocr-tesserocr | Tesseract OCR | High-accuracy OCR |
| feat-ocr-mac | macOS native OCR | macOS only |

Models

| Extra | Description | Use Case |
|---|---|---|
| models-local | Local PyTorch models | GPU/CPU inference |
| models-remote | Remote model serving (Triton) | Production deployments |
| models-onnxruntime | ONNX Runtime acceleration | Optimized inference |
| models-vlm-inline | Vision Language Models | Image understanding, inline processing |

Other Features

| Extra | Description | Use Case |
|---|---|---|
| feat-chunking | Document chunking | RAG applications |
| service-client | Docling service client | Remote processing |
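The tables above are a snapshot; the extras a given release actually declares are recorded in its package metadata. Assuming docling-slim is installed in the current environment, the standard library can list them via the Provides-Extra metadata field. The helper name `declared_extras` is hypothetical, and the function returns an empty list when the distribution is not installed.

```python
from importlib.metadata import PackageNotFoundError, metadata


def declared_extras(dist_name: str = "docling-slim") -> list[str]:
    """Return the extras declared by an installed distribution, sorted by name."""
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return []
    # get_all() returns None when the field is absent.
    return sorted(meta.get_all("Provides-Extra") or [])


for extra in declared_extras():
    print(extra)
```

This lists what the distribution declares, not which extras were selected when it was installed.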

License

MIT License - See LICENSE

Download files

Source Distribution

docling_slim-2.98.0.tar.gz (408.9 kB)

Uploaded Source

Built Distribution

docling_slim-2.98.0-py3-none-any.whl (529.3 kB)

Uploaded Python 3

File details

Details for the file docling_slim-2.98.0.tar.gz.

File metadata

  • Download URL: docling_slim-2.98.0.tar.gz
  • Size: 408.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for docling_slim-2.98.0.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | 7be12e427cf69699435d48a37b5e53315f2609ca48a42e182582262c86ad9f57 |
| MD5 | 852cb953747b05acc6f98b0cd5b2a008 |
| BLAKE2b-256 | 63fc8b48e33ef2cab5b3dab9df53385909ee70ccf9e5232c4b4720c0f2035a96 |
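A published digest is only useful if the downloaded file is checked against it. The following is a minimal sketch of that check using the standard library, with the SHA256 value copied from the table above. The helper names `sha256_of` and `matches` are hypothetical, and the local file path in the usage comment is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for docling_slim-2.98.0.tar.gz.
EXPECTED_SHA256 = "7be12e427cf69699435d48a37b5e53315f2609ca48a42e182582262c86ad9f57"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the archive was saved to the current directory:
# matches("docling_slim-2.98.0.tar.gz", EXPECTED_SHA256)
```

Reading in chunks keeps memory use flat regardless of file size. The same approach works for the wheel by substituting its SHA256 digest.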

Provenance

The following attestation bundles were made for docling_slim-2.98.0.tar.gz:

Publisher: pypi.yml on docling-project/docling

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file docling_slim-2.98.0-py3-none-any.whl.

File metadata

  • Download URL: docling_slim-2.98.0-py3-none-any.whl
  • Size: 529.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for docling_slim-2.98.0-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 3c4d6632ac314ec817e33703bc32e7bfdfd411ddb0dccc749961915983698421 |
| MD5 | c926b0711c3e983aa39d544cf8dda071 |
| BLAKE2b-256 | 307b1882d691d0503ea33142dd339a1d4fed2a1675081fc1034aec703f61a410 |

Provenance

The following attestation bundles were made for docling_slim-2.98.0-py3-none-any.whl:

Publisher: pypi.yml on docling-project/docling

