RAG-demo

Chat with (a small portion of) Wikipedia

⚠️ RAG functionality is still under development. ⚠️

[Screenshot of the app]

Requirements

  1. The uv Python package manager
    • Install and update uv by following the docs.
    • As of 2026-01-25, I'm developing with uv version 0.9.26 and its new experimental --torch-backend option.
  2. A terminal emulator or web browser

Notes on terminal emulators

Some terminal emulators do not support every feature of this program. On macOS, consider using iTerm2 instead of the default Terminal.app (explanation). On Linux, you might want to try kitty, wezterm, alacritty, or ghostty instead of the terminal that came with your desktop environment (reason). Windows Terminal should be fine as far as I know.
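As a rough aid, the check below reads the TERM_PROGRAM environment variable, which Terminal.app sets to Apple_Terminal, and flags that case. The function name and the message are illustrative and not part of RAG-demo.

```python
import os


def is_apple_terminal(env=None):
    """Return True when TERM_PROGRAM identifies the default macOS Terminal.app."""
    env = os.environ if env is None else env
    return env.get("TERM_PROGRAM") == "Apple_Terminal"


if is_apple_terminal():
    print("Running in Terminal.app; consider iTerm2 for this app.")
```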

Optional dependencies

  1. Hugging Face login
  2. API key for your favorite LLM provider (support coming soon)
  3. Ollama installed on your system if you have a GPU
  4. Run RAG-demo on a more capable (bigger GPU) machine over SSH if you can. It is a terminal app after all.
  5. A C compiler if you want to build Llama.cpp from source.
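To see which of the optional command-line tools are already on your PATH, a small check like the following may help. It is only a sketch: the helper name and the list of candidate compiler names (cc, gcc, clang) are assumptions, and it does not check the Hugging Face login or API keys.

```python
import shutil

# Hypothetical mapping, not part of RAG-demo: label -> candidate executable names.
OPTIONAL_TOOLS = {
    "Ollama": ["ollama"],
    "C compiler": ["cc", "gcc", "clang"],
}


def find_optional_tools(which=shutil.which):
    """Return {label: path or None}, using the first candidate found on PATH."""
    found = {}
    for label, candidates in OPTIONAL_TOOLS.items():
        found[label] = next((path for path in map(which, candidates) if path), None)
    return found


for label, path in find_optional_tools().items():
    print(f"{label}: {path or 'not found'}")
```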

Run the latest version

Run in a terminal:

uvx --python=3.12 --torch-backend=auto --from=jehoctor-rag-demo@latest chat

Or run in a web browser:

uvx --python=3.12 --torch-backend=auto --from=jehoctor-rag-demo@latest textual serve chat

CUDA acceleration via Llama.cpp

If you have an NVIDIA GPU with CUDA and build tools installed, you might be able to get CUDA acceleration without installing Ollama. Because it uses uv run, this command is meant to be run from a clone of the repository.

CMAKE_ARGS="-DGGML_CUDA=on" uv run --extra=llamacpp chat
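The VAR=value prefix in that command is POSIX shell syntax. As a sketch for shells that lack it (PowerShell, for example), the same invocation can be launched from Python with CMAKE_ARGS set in the child process environment. The helper names are hypothetical; the command and the flag value are copied from the line above, and the script is assumed to be run from the project directory.

```python
import os
import subprocess


def cuda_build_env(base=None):
    """Copy the environment and add the CMake flag that requests a CUDA build."""
    env = dict(os.environ if base is None else base)
    env["CMAKE_ARGS"] = "-DGGML_CUDA=on"
    return env


def run_chat_with_cuda():
    """Launch the same uv command as above with CMAKE_ARGS set for the child process."""
    return subprocess.run(
        ["uv", "run", "--extra=llamacpp", "chat"],
        env=cuda_build_env(),
        check=False,
    )
```

Call run_chat_with_cuda() to start the app; nothing is launched when the module is merely imported.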

Metal acceleration via Llama.cpp (on Apple Silicon)

On an Apple Silicon machine, make sure uv runs an ARM interpreter, as this should cause it to install Llama.cpp with Metal support. Also include the llamacpp extra. Try this:

uvx --python-platform=aarch64-apple-darwin --torch-backend=auto --from='jehoctor-rag-demo[llamacpp]@latest' chat
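If you are unsure whether a given interpreter is a native ARM build, the standard library can tell you: on macOS, platform.machine() reports arm64 for a native Apple Silicon interpreter and x86_64 for one running under Rosetta. The function below is an illustrative check, not something RAG-demo ships.

```python
import platform


def is_native_apple_silicon(system=None, machine=None):
    """Return True for a macOS interpreter built for ARM (not x86_64 under Rosetta)."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


print("Native ARM interpreter on macOS:", is_native_apple_silicon())
```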

Ollama on Linux

Remember that you have to keep Ollama up to date manually on Linux. A recent version of Ollama (v0.11.10 or later) is required to run the embedding model we use. See this FAQ: https://docs.ollama.com/faq#how-can-i-upgrade-ollama.
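One way to check the installed version against that minimum is sketched below. It assumes that ollama --version prints text containing a version number in X.Y.Z form and takes the first such number it finds; the function names are hypothetical.

```python
import re
import subprocess

MIN_OLLAMA = (0, 11, 10)  # minimum version stated above


def parse_version(text):
    """Extract the first X.Y.Z number in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def is_new_enough(text, minimum=MIN_OLLAMA):
    """Return True when the version found in text is at least the minimum."""
    version = parse_version(text)
    return version is not None and version >= minimum


def installed_ollama_is_new_enough():
    """Ask the local ollama binary for its version; False if it is not installed."""
    try:
        result = subprocess.run(
            ["ollama", "--version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return False
    return is_new_enough(result.stdout + result.stderr)
```

Comparing tuples of integers avoids the pitfall of string comparison, where "0.11.9" would sort after "0.11.10".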

Project feature roadmap

  • ❌ RAG functionality
  • ✅ torch inference via the LangChain local Hugging Face inference integration
  • ✅ uv automatic torch backend selection (see the docs)
  • ❌ OpenAI integration
  • ❌ Anthropic integration

Run from the repository

First, clone this repository. Then, run one of the options below.

Run in a terminal:

uv run chat

Or run in a web browser:

uv run textual serve chat
