RAGNARDoc
RAGNARDoc (RAG Native Automatic Reingestion for Documents) is a tool that runs natively on a developer workstation and automatically ingests local documents into various Retrieval Augmented Generation indexes. It is designed as a companion app for workstation RAG applications which would benefit from maintaining an up-to-date view of documents hosted natively on a user's workstation.
Quick Start
pip install ragnardoc
# Initialize ragnardoc on your system
ragnardoc init
# Add a directory to be ingested
ragnardoc add ~/Documents
# Run an ingestion
ragnardoc run
# Start as a background service
ragnardoc start & disown
Configuration
RAGNARDoc is configured with a YAML file. The default location is $HOME/.ragnardoc/config.yaml, which can be overridden with the RAGNARDOC_HOME environment variable. All default values can be found in config.yaml in the codebase.
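The lookup described above can be sketched roughly as follows. This is an illustration only: it assumes RAGNARDOC_HOME names the directory that holds config.yaml (the text does not say whether it is a directory or a full file path), and resolve_config_path is a hypothetical helper, not part of RAGNARDoc's API.

```python
import os
from pathlib import Path


def resolve_config_path(env=None):
    """Return the expected location of config.yaml.

    Assumption: RAGNARDOC_HOME, when set, replaces the default
    $HOME/.ragnardoc directory. This mirrors the description in the text
    and is not taken from RAGNARDoc's source code.
    """
    env = os.environ if env is None else env
    home = env.get("RAGNARDOC_HOME")
    # Fall back to the documented default directory when the variable is unset.
    base = Path(home).expanduser() if home else Path.home() / ".ragnardoc"
    return base / "config.yaml"


if __name__ == "__main__":
    print(resolve_config_path())
```

Passing the environment as a parameter keeps the helper easy to check without touching the real process environment.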
Configuring
To initialize your RAGNARDoc config, do the following:
mkdir -p ~/.ragnardoc
echo "scraping:
  roots:
    # Fill in with the list of directories to ingest
    - ~/Desktop
    - ~/Documents
" > ~/.ragnardoc/config.yaml
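For setups that script this step, the same file can be generated from Python. The sketch below assumes that roots is a list nested under scraping, as in the shell snippet above. The helper names render_config and write_config are hypothetical and exist only for this example.

```python
from pathlib import Path


def render_config(roots):
    """Render a minimal config with the given scraping roots as YAML text.

    Assumption: plain paths only. Paths containing YAML special characters
    (for example ': ' or ' #') would need quoting, which this sketch omits.
    """
    lines = ["scraping:", "  roots:"]
    lines += [f"    - {root}" for root in roots]
    return "\n".join(lines) + "\n"


def write_config(roots, config_dir):
    """Create config_dir if needed and write config.yaml inside it."""
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / "config.yaml"
    path.write_text(render_config(roots), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Print only; call write_config(...) with a real directory to save the file.
    print(render_config(["~/Desktop", "~/Documents"]), end="")
```

Note that write_config replaces any existing config.yaml in the target directory, just as the shell redirect does.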
Once done, you can add entries to your config.yaml to enable the supported ingestion plugins (see below).
Ingestion Plugins
RAGNARDoc operates with a plugin model for connecting to applications to ingest docs. Each plugin is responsible for connecting to a given app. RAGNARDoc's native ingestion capabilities are:
AnythingLLM Desktop
To configure a connection to AnythingLLM, follow these steps:
- Download and install the desktop app from their site: https://anythingllm.com/desktop
- In the app, go to settings (wrench icon in the bottom panel of the left-hand sidebar)
- Under Admin -> General Settings, toggle on Enable network discovery and wait for the app to reload
- Under Tools, select Developer API
- Create a new API Key
- Add the plugin to your config (default location $HOME/.ragnardoc/config.yaml)
ingestion:
  plugins:
    - type: anything-llm
      config:
        apikey: <YOUR API KEY>
Open WebUI
To configure a connection to Open WebUI, follow these steps:
- Follow the Getting Started guide to get Open WebUI running locally. TLDR:
pip install open_webui
# Run without login
WEBUI_AUTH=False open-webui serve
- Open the UI in a browser tab (http://localhost:8080 by default)
- Click on the user icon (top right) and select Settings
- Click Account on the left panel of the settings view
- Click Show (right side) for API keys
- Click + Create new secret key under API Key to create a new API Key
- Click the copy icon to copy the API key
- Add the plugin to your config (default location $HOME/.ragnardoc/config.yaml)
ingestion:
  plugins:
    - type: open-webui
      config:
        apikey: <YOUR API KEY>
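When both applications are in use, their entries presumably sit side by side in the same plugins list. The sketch below shows that shape as plain Python data. It assumes each entry carries only the type and config.apikey fields shown above; add_plugin is a hypothetical helper, not a RAGNARDoc function.

```python
import json


def add_plugin(config, plugin_type, apikey):
    """Return a copy of config with one ingestion plugin entry appended.

    The input dict is left unmodified. The entry layout (type plus
    config.apikey) follows the snippets in this section; any other fields
    a plugin may accept are not covered here.
    """
    ingestion = dict(config.get("ingestion", {}))
    plugins = list(ingestion.get("plugins", []))
    plugins.append({"type": plugin_type, "config": {"apikey": apikey}})
    ingestion["plugins"] = plugins
    return {**config, "ingestion": ingestion}


if __name__ == "__main__":
    cfg = {"scraping": {"roots": ["~/Documents"]}}
    cfg = add_plugin(cfg, "anything-llm", "<YOUR API KEY>")
    cfg = add_plugin(cfg, "open-webui", "<YOUR API KEY>")
    # Printed as JSON for inspection only; a YAML library would be needed
    # to write the result back to config.yaml in block style.
    print(json.dumps(cfg, indent=2))
```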
TODO
- Per-ingestor inclusion / exclusion
- Abstract scrapers to allow non-local scraping