Universal capture → classify → route platform. Your AI-powered data pilot.
Project description
Arkiv
Universal capture → classify → route. Your AI-powered data pilot.
Arkiv takes any digital input — files, URLs, text — classifies it using AI, and routes it to the right destination. Think of it as an intelligent mail sorting facility for your digital life.
┌─────────┐
File ──────►│ │──► Archiv/Rechnungen
URL ──────►│ ARKIV │──► Leseliste/Artikel
Text ──────►│ │──► Code/Snippets
Mail ──────►│ classify │──► Review (unsicher)
│ route │──► Plugin: Webhook
└─────────┘──► Plugin: Custom
Features
- Universal Intake — Drop any file, paste text, pipe from stdin
- AI Classification — LLM-powered content understanding (Ollama, OpenAI, Anthropic, HuggingFace)
- Smart Routing — Category-based rules route items to folders, webhooks, or custom destinations
- Full-Text Search — SQLite FTS5 search across all processed items
- Plugin System — Extend with pip-installable plugins (powered by pluggy)
- Local-First — Your data stays on your machine. No cloud required.
- Filesystem Watcher — Auto-process files dropped into your inbox directory
Quick Start
# Install
pipx install "kurier @ git+https://github.com/HerrStolzier/kurier.git"
# Make sure Ollama is running with a model
ollama pull qwen2.5:7b
# Start Kurier
kurier
That's it. kurier launches the interactive TUI where you can classify files, search, monitor your inbox, and more — all from one interface.
Alternative install methods:
# With pip (requires a virtual environment) pip install "kurier @ git+https://github.com/HerrStolzier/kurier.git" # With uv uv pip install "kurier @ git+https://github.com/HerrStolzier/kurier.git"
CLI Commands
All features are also available as individual commands:
kurier # Interactive TUI (default)
kurier add invoice.pdf # Classify and route a file
kurier watch # Auto-process files in inbox
kurier search "Rechnung" # Hybrid keyword + semantic search
kurier status # Processing statistics
kurier undo # Undo last routing action
kurier export --format csv # Export all items as CSV
kurier doctor # Check system health
kurier init # Interactive setup wizard
Configuration
Kurier uses a TOML config file at ~/.config/kurier/config.toml:
[llm]
provider = "ollama"
model = "mistral"
base_url = "http://localhost:11434"
[embeddings]
model = "BAAI/bge-small-en-v1.5"
[routes.archiv]
type = "folder"
path = "~/Documents/Arkiv/Archiv"
categories = ["rechnung", "vertrag", "brief"]
confidence_threshold = 0.7
[routes.artikel]
type = "folder"
path = "~/Documents/Arkiv/Artikel"
categories = ["artikel", "paper", "tutorial"]
confidence_threshold = 0.6
LLM Providers
Arkiv supports any LLM provider via LiteLLM:
| Provider | Config |
|---|---|
| Ollama (local) | provider = "ollama", model = "qwen2.5:7b" |
| OpenAI | provider = "openai", model = "gpt-4o-mini" |
| Anthropic | provider = "anthropic", model = "claude-sonnet-4-5-20250514" |
| HuggingFace | provider = "huggingface", model = "meta-llama/..." |
REST API
Start the API server for external integrations, webhooks, and mobile capture:
# Install API dependencies
uv pip install "kurier[api] @ git+https://github.com/HerrStolzier/kurier.git"
# Start the server
kurier serve
# → http://127.0.0.1:8790/docs (Swagger UI)
Endpoints:
| Method | Endpoint | Description |
|---|---|---|
GET |
/health |
Health check |
POST |
/ingest/file |
Upload a file for classification |
POST |
/ingest/text |
Submit text for classification |
GET |
/search?q=... |
Hybrid keyword + semantic search |
GET |
/status |
Processing statistics |
GET |
/recent |
Recently processed items |
# Example: ingest a file via curl
curl -X POST http://localhost:8790/ingest/file -F "file=@invoice.pdf"
# Example: search
curl "http://localhost:8790/search?q=Telefonkosten&mode=auto"
Plugins
Arkiv is built to be extended. Plugins can:
- Pre-process content before classification
- Post-process classification results
- Add custom routes (Slack, Notion, webhooks, ...)
- React to routing events (notifications, logging, ...)
Writing a Plugin
# my_arkiv_plugin.py
from arkiv.plugins.spec import hookimpl
@hookimpl
def on_routed(path: str, destination: str, route_name: str) -> None:
"""Send a notification when a file is routed."""
print(f"Routed {path} → {destination}")
# pyproject.toml
[project.entry-points."arkiv.plugins"]
my-plugin = "my_arkiv_plugin"
See the Plugin Guide for details.
Architecture
src/arkiv/
├── cli.py # Typer CLI interface
├── core/
│ ├── config.py # TOML configuration
│ ├── classifier.py # LLM classification engine
│ ├── router.py # Route matching & execution
│ └── engine.py # Main pipeline orchestrator
├── db/
│ └── store.py # SQLite + FTS5 storage
├── inlets/
│ └── watch.py # Filesystem watcher
├── plugins/
│ ├── spec.py # Plugin hook specifications
│ └── manager.py # Plugin discovery (pluggy)
└── routes/ # Built-in route handlers
Roadmap
- Core pipeline: capture → classify → route
- CLI interface
- SQLite + FTS5 search
- Plugin system (pluggy)
- Filesystem watcher
- REST API inlet (FastAPI with auto-docs)
- Semantic search (FastEmbed + sqlite-vec hybrid search with RRF)
- Web dashboard (HTMX + Tailwind, no build step)
- Email inlet (IMAP fetch + .eml/.mbox import)
- Webhook route plugin (Slack, Discord, generic)
- OCR support (PyMuPDF + Tesseract)
- Self-audit system (duplicates, misclassifications, orphaned files)
- Interactive TUI (Textual)
- Undo & Export commands
- Retry logic with exponential backoff
- Transaction safety (pending → routed/failed)
- Custom categories via config
- Browser extension
Development
# Clone
git clone https://github.com/HerrStolzier/kurier.git
cd kurier
# Create venv and install in development mode
uv venv && source .venv/bin/activate
uv pip install -e ".[dev,api,ocr,tui]"
# Run tests
pytest
# Lint + type check
ruff check src/
mypy src/
License
MIT — see LICENSE.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file kurier-0.4.2.tar.gz.
File metadata
- Download URL: kurier-0.4.2.tar.gz
- Upload date:
- Size: 312.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.12 {"installer":{"name":"uv","version":"0.10.12","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a432755115601080c5146f0c087534974f83e6f9232ad129b83122e322ed9758
|
|
| MD5 |
d610a7fa457a71aa0575eb8ccbc8997c
|
|
| BLAKE2b-256 |
757dc312da07d35545b4eb9feba279bd274263d0fa321eb5dc0a611067a63fb8
|
File details
Details for the file kurier-0.4.2-py3-none-any.whl.
File metadata
- Download URL: kurier-0.4.2-py3-none-any.whl
- Upload date:
- Size: 94.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.12 {"installer":{"name":"uv","version":"0.10.12","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
88948f085a6eed1d97ba6ad2c641440dd8ac7688e6ae9b65c74a0f219eaa836b
|
|
| MD5 |
57afa255fbde15d4a2cd663029e9e69c
|
|
| BLAKE2b-256 |
da5fb49bf087d497d4e385f9bd6999261a79efd93b179d9dd0ed58586233aba0
|