Skip to main content

Docling → Chroma → Ollama: Simple RAG pipeline

Project description

📄 DocRAG LLM

Docling → Chroma → Ollama: Simple Local RAG Pipeline

PyPI
Python
License


🔎 What is DocRAG LLM?

DocRAG LLM is a local-first Retrieval-Augmented Generation (RAG) pipeline.
It connects Docling for parsing → ChromaDB for vector storage → Ollama for local LLM inference.

No cloud lock-in. No API costs. Just local docs → local vectors → local LLMs.


✨ Features

  • 🔍 Parse documents with Docling (PDF, DOCX, PPTX, HTML, etc.)
  • 📑 Intelligent chunking for retrieval
  • 🧠 Store embeddings in ChromaDB
  • 🤖 Answer questions using Ollama (default: llama3.2:1b)
  • 🛡️ Privacy-first → all local execution
  • 🖥️ Use as a CLI tool or Python library

📦 Installation

pip install docrag-llm

Requirements:

  • Python 3.10+
  • Ollama installed & running
  • Local models:
    ollama pull llama3.2:1b -- or any other model
    ollama pull nomic-embed-text
    

🚀 Quickstart

CLI – Ingest and Ask

# Ingest a document (default collection: demo)
python -m docrag.cli ingest https://arxiv.org/pdf/2508.20755

# Ask a question (default LLM: llama3.2:1b) you can always add the tag `-llm gpt-oss:20b` for better response assuming you have the computing power for it. 
python -m docrag.cli ask "Summarize in 1 paragraph with 5 bullet points" 
python -m docrag.cli ask "Summarize in 1 paragraph with 5 bullet points" -llm gpt-oss:20b

Python API - llama3.2:1b is just for testing, reccomend if you have the computing power to ues gpt-oss:20b you will get better results. change as needed, pull from ollama first.

from docrag import DocragSettings, RAGPipeline

cfg = DocragSettings(
    persist_path="./.chroma",
    collection="demo",
    embed_model="nomic-embed-text",
    llm_model="llama3.2:1b",
)

pipeline = RAGPipeline(cfg)

# Ingest
n_chunks = pipeline.ingest("https://arxiv.org/pdf/2508.20755")
print(f"Ingested {n_chunks} chunks")

# Ask
answer = pipeline.ask("Give a concise bullet summary of the paper's contributions.")
print(answer)

⚙️ Configuration

Both CLI & Python API let you customize:

  • persist_path → where ChromaDB stores vectors
  • collection → logical collection name
  • embed_model → embedding model (Ollama tag)
  • llm_model → LLM model (default: llama3.2:1b)
  • chunk_chars / chunk_overlap → chunking granularity

📊 Roadmap

  • model-check CLI → list installed Ollama models
  • Support multiple backends (Weaviate, Milvus)
  • Streaming output for long answers
  • Expanded test suite (large document regression cases)
  • Example notebooks & Hugging Face demo

🤝 Contributing

PRs and issues welcome!

pip install "docrag-llm[dev]"
ruff check .
pytest

📜 License

MIT License © 2025 Armando Medina


Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

docrag_llm-0.1.27.tar.gz (7.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

docrag_llm-0.1.27-py3-none-any.whl (10.2 kB view details)

Uploaded Python 3

File details

Details for the file docrag_llm-0.1.27.tar.gz.

File metadata

  • Download URL: docrag_llm-0.1.27.tar.gz
  • Upload date:
  • Size: 7.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for docrag_llm-0.1.27.tar.gz
Algorithm Hash digest
SHA256 527f0b5a770053f8d790f1984265dbcf25fd1929d58f1134357bb917cd1ca683
MD5 18e65dd7246b79f6c7181e28342d507c
BLAKE2b-256 f5766ee854c433c116bdbdd54fb6ecffec918f04cd137e7d49b7ee5a1b601a76

See more details on using hashes here.

File details

Details for the file docrag_llm-0.1.27-py3-none-any.whl.

File metadata

  • Download URL: docrag_llm-0.1.27-py3-none-any.whl
  • Upload date:
  • Size: 10.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for docrag_llm-0.1.27-py3-none-any.whl
Algorithm Hash digest
SHA256 b3412fb3baee2e6dae2d14aa30d87b55cfd84df58a5218e650c02b2b1bdec61f
MD5 f1ec9c7330771a2ca43a01b7d9ebf436
BLAKE2b-256 beffd8339966cb9b030a6ca334997c7213fd80b7554eba5f5ae93bf49dba39d7

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page