Local PDF Q&A with RAG using Ollama & LangChain
zenpdf
A peaceful CLI tool for chatting with your documents using local AI models. All processing happens on your machine: no cloud APIs are called and no data leaves your device.
Features
- Local-First - No cloud APIs, all processing on your machine
- Multi-Format - PDF, DOCX, and TXT support
- Streaming - Real-time AI responses as they're generated
- Persistent History - Chat history saved between sessions
- Source Attribution - Know which documents informed each answer
- Beautiful CLI - Rich terminal interface with colors and tables
- Fully Configurable - Customize models, chunk sizes, and more
Installation
pip install zenpdf
Quick Start
# 1. Make sure Ollama is running
ollama serve
# 2. Pull required models (if not already done)
ollama pull llama3.2:1b
ollama pull nomic-embed-text
# 3. Index a document
zenpdf index ./my-document.pdf
# 4. Ask questions!
zenpdf ask "What is this document about?"
# 5. Interactive mode
zenpdf interactive
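Steps 1 and 2 above can also be checked from a script. The following is a minimal sketch, not part of zenpdf: it assumes Ollama is listening on its default address (`http://localhost:11434`) and uses Ollama's `/api/tags` endpoint, which lists locally pulled models. The helper names `fetch_tags` and `missing_models` are hypothetical.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address
REQUIRED = ("llama3.2:1b", "nomic-embed-text")  # models named in the Quick Start


def missing_models(tags_payload: dict, required=REQUIRED) -> list:
    """Return the required models that are absent from an /api/tags response."""
    installed = {m["name"] for m in tags_payload.get("models", [])}
    # Ollama lists untagged pulls as "<name>:latest", so accept both forms.
    return [
        name for name in required
        if name not in installed and f"{name}:latest" not in installed
    ]


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> Optional[dict]:
    """Return the parsed /api/tags response, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None


if __name__ == "__main__":
    tags = fetch_tags()
    if tags is None:
        print("Ollama is not reachable; start it with `ollama serve`.")
    else:
        for name in missing_models(tags):
            print(f"Missing model, run: ollama pull {name}")
```

If the script prints nothing, the server is up and both models are available.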
Commands
Document Operations
| Command | Description |
|---|---|
| `zenpdf index <path>` | Index PDF/DOCX/TXT file or directory |
| `zenpdf list` | List indexed documents |
| `zenpdf remove <id>` | Remove document by ID |
| `zenpdf clear` | Clear all documents |
Query Operations
| Command | Description |
|---|---|
| `zenpdf ask "question?"` | Ask a question |
| `zenpdf ask "??" -k 6` | Ask with a custom number of retrieved chunks (k) |
| `zenpdf interactive` | Interactive Q&A mode |
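To illustrate what the `-k` option controls, here is a minimal sketch of top-k retrieval by cosine similarity. It is a stand-in for the vector search that Chroma performs, not zenpdf's actual code; the toy two-dimensional vectors and the function names `cosine` and `top_k` are assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunks, k=4):
    """chunks is a list of (text, vector) pairs; return the k most similar texts."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy embeddings; real ones come from the embedding model and have many dimensions.
chunks = [
    ("intro", [1.0, 0.0]),
    ("methods", [0.7, 0.7]),
    ("appendix", [0.0, 1.0]),
]
print(top_k([1.0, 0.1], chunks, k=2))  # ['intro', 'methods']
```

A larger k gives the model more context to draw on, at the cost of a longer prompt.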
Reference & History
| Command | Description |
|---|---|
| `zenpdf refs` | Show sources for last answer |
| `zenpdf history` | Show chat history |
| `zenpdf export <file>` | Export history (MD/JSON) |
Configuration
| Command | Description |
|---|---|
| `zenpdf config show` | Show all config |
| `zenpdf config model <name>` | Set LLM model |
| `zenpdf config embed <name>` | Set embedding model |
| `zenpdf config chunk-size <n>` | Set chunk size |
| `zenpdf config overlap <n>` | Set chunk overlap |
| `zenpdf config k <n>` | Set default retrieved chunks |
| `zenpdf config db-path <path>` | Set database path |
| `zenpdf config history-size <n>` | Set max history size |
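The chunk size and overlap settings interact, which is easiest to see in code. The sketch below is a simplified fixed-width character splitter, offered only as an illustration: zenpdf's real splitter comes from LangChain and may split on separators, and whether sizes are measured in characters is an assumption here. The function name `split_text` is hypothetical.

```python
def split_text(text, chunk_size=1000, chunk_overlap=100):
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(2500))
parts = split_text(sample)
print([len(p) for p in parts])  # [1000, 1000, 700]
```

With the defaults, each chunk repeats the final 100 characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk.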
Utilities
| Command | Description |
|---|---|
| `zenpdf status` | Show database status |
| `zenpdf reset` | Reset vector store |
| `zenpdf --version` | Show version |
| `zenpdf --help` | Show help |
Configuration
Default settings (view with zenpdf config show):
| Setting | Default | Description |
|---|---|---|
| `model` | `llama3.2:1b` | Ollama LLM model |
| `embed_model` | `nomic-embed-text` | Embedding model |
| `chunk_size` | `1000` | Text chunk size |
| `chunk_overlap` | `100` | Chunk overlap |
| `k` | `4` | Retrieved chunks |
| `db_path` | `./zenpdf_db` | Vector database path |
| `history_size` | `50` | Max chat history |
| `temperature` | `0.7` | LLM temperature |
Configuration is saved to .zenpdf_config.json in your working directory.
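For scripts that need to read the same settings, the following sketch layers a saved config file over the defaults listed above. It assumes `.zenpdf_config.json` is a flat JSON object whose keys match the setting names in the table; the actual file layout is not documented here and may differ. `merge_config` and `load_config` are hypothetical helpers, not zenpdf APIs.

```python
import json
from pathlib import Path

# Defaults copied from the settings table.
DEFAULTS = {
    "model": "llama3.2:1b",
    "embed_model": "nomic-embed-text",
    "chunk_size": 1000,
    "chunk_overlap": 100,
    "k": 4,
    "db_path": "./zenpdf_db",
    "history_size": 50,
    "temperature": 0.7,
}


def merge_config(overrides: dict) -> dict:
    """Return the defaults with known keys replaced by overrides; unknown keys are ignored."""
    config = dict(DEFAULTS)
    config.update({key: value for key, value in overrides.items() if key in DEFAULTS})
    return config


def load_config(directory: str = ".") -> dict:
    """Load .zenpdf_config.json from directory, falling back to defaults if it is absent."""
    path = Path(directory) / ".zenpdf_config.json"
    if not path.is_file():
        return dict(DEFAULTS)
    # Assumption: the file holds a flat JSON object.
    return merge_config(json.loads(path.read_text(encoding="utf-8")))


if __name__ == "__main__":
    for key, value in load_config().items():
        print(f"{key} = {value}")
```

Because the file lives in the working directory, running zenpdf from a different directory may pick up different settings.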
Requirements
- Python 3.11+
- Ollama installed and running
- Ollama models:
  - `llama3.2:1b` (or your preferred model)
  - `nomic-embed-text` (for embeddings)
Architecture
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Document   │────▶│   Splitter   │────▶│  ChromaDB   │
│   Loader    │     │   (Chunks)   │     │  (Vectors)  │
└─────────────┘     └──────────────┘     └──────┬──────┘
                                                │
                    ┌──────────────┐            │
                    │    Ollama    │◀───────────┘
                    │ (Embeddings) │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │  RAG Chain   │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │     LLM      │
                    │   (Ollama)   │
                    └──────────────┘
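The last two stages of the diagram, where retrieved chunks are combined with the question and passed to the LLM, can be sketched as follows. This is an illustrative outline under stated assumptions, not zenpdf's implementation: the prompt wording, the `Chunk` dataclass, and `build_prompt` are all hypothetical, and the real chain is built with LangChain.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str  # hypothetical field: the file the chunk was extracted from


def build_prompt(question, chunks):
    """Return (prompt, sources): a context-grounded prompt and the files that fed it."""
    context = "\n\n".join(f"[{i}] {c.text}" for i, c in enumerate(chunks, 1))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    # De-duplicate sources while keeping first-seen order, for source attribution.
    sources = list(dict.fromkeys(c.source for c in chunks))
    return prompt, sources


retrieved = [
    Chunk("Chroma stores the vectors.", "design.pdf"),
    Chunk("Ollama runs the models locally.", "setup.txt"),
    Chunk("Chunks overlap by 100 characters.", "design.pdf"),
]
prompt, sources = build_prompt("Where are vectors stored?", retrieved)
print(sources)  # ['design.pdf', 'setup.txt']
```

Keeping the source list alongside the prompt is one way a tool can later report which documents informed an answer.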
Tech Stack
- LangChain - LLM orchestration
- Chroma - Vector database
- Ollama - Local LLMs
- Click - CLI framework
- Rich - Terminal formatting
License
MIT License - see LICENSE
Contributing
Contributions welcome! Please open an issue or submit a PR.
Made with ❤️ for local AI