Local, privacy-first cited-answer + disagreement engine: cites its sources, then shows where they agree and contradict. RAG over a local Ollama model.

Gainsay

A local engine for intellectual due process.

Most answer engines are built to tell you what is true. Gainsay is built to show you how a conclusion was reached — and to leave you free to disagree with it. It runs entirely on your own machine, cites every claim back to the source it came from, and surfaces where those sources contradict one another. The promise is not a better answer; it is an answer you can inspect, reproduce, and contest — without having to trust anyone, including us.

Trust the process enough that you never have to trust the authority.

What Gainsay promises

These are guarantees, not features. They do not depend on any particular implementation, and the project binds itself to uphold them — see the Constitution:

The evidence is inspectable — every claim carries a tag back to the exact source it came from.
Disagreement is surfaced — where sources conflict, the conflict is shown, not smoothed away.
The weighing is visible — you can see why a conclusion currently prevails, not merely that it does.
Every conclusion is contestable — no verdict is final; enough is shown for you to challenge it.
The process is reproducible — the same question over the same evidence yields a path you can re-walk, locally.
Change is traceable — when the evidence shifts, the answer has a history, not only a present.

What Gainsay does not promise

It does not claim to tell you what is true, to be a final authority, or to be correct because it is confident. It can be wrong. The point is not that it never errs, but that when it does, the path to seeing and correcting the error stays open. Every conclusion stays challengeable.

How to verify these guarantees yourself

You should not have to take the promises on faith — that would defeat the purpose. Each is checkable:

Inspectable / surfaced / weighed — every answer prints its sources [W#] [B#] [S#] and a per-claim support-vs-contradiction panel. Read them; nothing is asserted that isn't traceable.
Contestable — run --verify and read where the engine flags its own answer as unsupported or contradicted by its sources.
Reproducible — ask the same question twice, or run --no-web for a fully-offline, local-only path you can re-walk.
Private — every web fetch is logged locally to ~/.gainsay/web-audit/; inspect exactly what touched the network. Nothing else leaves your machine.
Governed — read the Constitution and hold the project to it. Every change is reviewed against one question: does this strengthen or weaken the guarantees above?
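One way to carry out the privacy check is to list what sits in the audit directory after a run. The sketch below assumes only the ~/.gainsay/web-audit/ path given above. The file names and log format are not documented here, so it reports names, sizes, and modification times rather than parsing contents. list_audit_files is a hypothetical helper, not part of Gainsay.

```python
from datetime import datetime
from pathlib import Path

# Path taken from the text above; everything else here is illustrative.
AUDIT_DIR = Path.home() / ".gainsay" / "web-audit"


def list_audit_files(audit_dir: Path = AUDIT_DIR) -> list[tuple[str, int, str]]:
    """Return (relative path, size in bytes, modified time) for each audit file.

    Hypothetical helper: it makes no assumption about the log format.
    """
    if not audit_dir.is_dir():
        return []  # no audit directory yet, e.g. before any web-enabled run
    rows = []
    for path in sorted(audit_dir.rglob("*")):
        if path.is_file():
            stat = path.stat()
            modified = datetime.fromtimestamp(stat.st_mtime).isoformat(timespec="seconds")
            rows.append((str(path.relative_to(audit_dir)), stat.st_size, modified))
    return rows


if __name__ == "__main__":
    for name, size, modified in list_audit_files():
        print(f"{modified}  {size:>8} B  {name}")
```

Running it before and after a --no-web query is a quick way to confirm that an offline run added nothing to the log.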

How it works

question
   |
   v
decompose (optional, "deep" mode) ----> sub-questions
   |
   v
retrieve from each enabled tier:
   web search  [W#]      (privacy-routed; query-only egress)
   your library [B#]     (local vector store over YOUR documents)
   scholarly   [S#]      (OpenAlex / Semantic Scholar / arXiv)
   |
   v
rerank (LLM-as-judge, retrieve-many-then-rerank, diversity-capped)
   |
   v
synthesize a cited answer (local model, tools off, sources fenced as data)
   |
   v
verify / disagreement engine: per-claim support vs. contradiction across sources
   |
   v
cited, reranked, contradiction-checked answer
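
The "diversity-capped" step in the diagram is not specified further here. A minimal sketch of one common reading, assuming each candidate already has a relevance score from the reranker and a grouping key such as a domain: keep the highest-scoring candidates, but no more than a fixed number per source. The Candidate type, the per_source limit, and the function name are illustrative assumptions, not Gainsay's API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    source: str   # e.g. a domain or document path (assumed grouping key)
    score: float  # relevance score from the reranker; higher is better


def diversity_cap(candidates: list[Candidate], k: int, per_source: int) -> list[Candidate]:
    """Pick up to k candidates by score, allowing at most per_source from any one source."""
    taken: Counter = Counter()
    picked: list[Candidate] = []
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if len(picked) == k:
            break
        if taken[cand.source] < per_source:
            picked.append(cand)
            taken[cand.source] += 1
    return picked
```

The cap trades a little raw relevance for breadth: without it, one prolific source could fill every slot and leave nothing for the disagreement engine to compare against.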

The tiers

Tag	Tier	Trust	Notes
`[W#]`	Live web search	Untrusted	Only the search query leaves your machine.
`[B#]`	Your own library	Trusted	A local vector store over documents you index.
`[S#]`	Scholarly sources	Untrusted	OpenAlex / Semantic Scholar / arXiv connectors.

Gainsay ships with an empty library index. The [B#] tier is a capability, not a bundled corpus — you decide what goes in it (see "Index your own documents" below).

How the guarantees are kept

The promises above are enforced structurally, not by good intentions:

Cited by construction (Guarantee 1). Every claim carries a tag back to the exact source it came from. A structural check rejects answers that cite sources that were never retrieved — so a citation cannot be fabricated.
The disagreement engine (Guarantees 2–4). After the answer is written, Gainsay extracts its load-bearing claims and, for each one, shows which retrieved sources support it, which contradict it, and where they conflict. A single hosted model structurally cannot do this, because it has no persistent multi-source corpus to cross-check against. The analysis is computed live and deliberately not persisted as a confidence score: a model's confidence guess, once written to disk, just calcifies into a "fact".
Tools-off synthesis (protects every guarantee). Retrieved web text is third-party content and a prime indirect-prompt-injection vector (OWASP LLM01). The structural guarantee — which holds even though this is open source — is that all retrieved text is fenced inside an explicit data boundary and synthesized with tools disabled: the worst a poisoned page can do is skew an answer, never trigger an action. Layered on top are best-effort heuristics (homoglyph folding, defanging, an injection tripwire) — treat those as a tripwire to extend for your own threat model, not a guarantee.
Local and private (Guarantee 5 + privacy). Synthesis, embeddings, and reranking all run against a local Ollama server. The only thing that leaves the machine is the keyword search query — and you can disable even that with --no-web.
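
The "cited by construction" check can be pictured as a set comparison: collect every tag that appears in the answer and reject the answer if any tag was not among the retrieved sources. The sketch below assumes tags look exactly like [W1], [B2], [S3], following the tiers table. The function names and the error type are hypothetical, and Gainsay's own check may differ.

```python
import re

# Matches tags such as [W1], [B12], [S3]; the tag shape is taken from the tiers table.
TAG_RE = re.compile(r"\[([WBS])(\d+)\]")


class UnknownCitationError(ValueError):
    """Raised when an answer cites a tag that was never retrieved."""


def cited_tags(answer: str) -> set[str]:
    """Return the set of tags cited in the answer text, e.g. {"W1", "B2"}."""
    return {f"{tier}{num}" for tier, num in TAG_RE.findall(answer)}


def check_citations(answer: str, retrieved: set[str]) -> set[str]:
    """Return the cited tags, or raise if any of them is not in `retrieved`."""
    cited = cited_tags(answer)
    unknown = cited - retrieved
    if unknown:
        raise UnknownCitationError(f"cited but never retrieved: {sorted(unknown)}")
    return cited
```

A check of this kind only shows that a cited tag points at a source that was really retrieved. Whether that source supports the claim is the disagreement engine's job.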

Requirements

Python 3.10+
A local Ollama install, with:
- a chat model for synthesis (default gpt-oss:20b)
- an embedding model for the library tier (e.g. nomic-embed-text)

pip install gainsay pulls in the helpful extras automatically (numpy for fast search, beautifulsoup4 for HTML cleaning, charset-normalizer for encoding detection). The core loop runs on the standard library alone; offline translation is the one opt-in extra (pip install "gainsay[translate]").

# 1. install Gainsay
pip install gainsay

# 2. install Ollama from https://ollama.com, then pull the models:
ollama pull gpt-oss:20b
ollama pull nomic-embed-text

Prefer running from a clone without installing? Use python -m gainsay "your question" in place of the gainsay command shown below.

Usage

# ask a question (web + your library, reranked, cited)
gainsay "what is retrieval-augmented generation?"

# control how much evidence to pull
gainsay --web 6 --books 4 "explain RRF rank fusion"

# library only (fully offline; nothing leaves your machine)
gainsay --no-web "what does my style guide say about headings?"

# web only (skip the library tier)
gainsay --no-books "latest stable release of sqlite"

# deep mode: decompose the question into sub-questions first
gainsay --deep "compare two approaches to vector search"

# turn on the disagreement engine explicitly
gainsay --verify "is X true?"

# add the scholarly tier
gainsay --scholar "evidence for diffusion model guidance scaling"

# machine-readable output (for scripting / integration)
gainsay --json "your question"
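
For scripting, the --json flag can be driven from Python with the standard library. The sketch below uses only flags listed in this document and assumes that the JSON document is the only thing written to standard output. The output schema is not described here, so the parsed result is returned as-is rather than indexed by guessed key names. build_command and ask are hypothetical helper names.

```python
import json
import subprocess
from typing import Optional


def build_command(question: str, *, web: Optional[int] = None,
                  books: Optional[int] = None, no_web: bool = False,
                  verify: bool = False) -> list[str]:
    """Build a `gainsay --json ...` argument list from flags documented here."""
    cmd = ["gainsay", "--json"]
    if no_web:
        cmd.append("--no-web")  # offline run; a web count would be meaningless
    elif web is not None:
        cmd += ["--web", str(web)]
    if books is not None:
        cmd += ["--books", str(books)]
    if verify:
        cmd.append("--verify")
    cmd.append(question)
    return cmd


def ask(question: str, **options):
    """Run Gainsay and return its parsed JSON output (schema not assumed)."""
    result = subprocess.run(build_command(question, **options),
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# Example (requires Gainsay and Ollama to be installed):
#   data = ask("what does my style guide say about headings?", no_web=True)
#   print(json.dumps(data, indent=2))
```

Passing the arguments as a list rather than a shell string means the question is never interpreted by a shell, so quotes in a question need no escaping.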

Common flags:

Flag	Effect
`--web N`	Number of web results to retrieve (default 5).
`--books N`	Number of library passages to pull (default 4).
`--fetch-top N`	How many top web results to fully fetch (default 3).
`--no-web`	Library only (fully offline).
`--no-books`	Web only.
`--deep`	Agentic query decomposition.
`--verify`	Run the disagreement / contradiction engine.
`--scholar`	Enable the scholarly tier.
`--no-rerank`	Disable LLM-as-judge reranking.
`--model NAME`	Override the synthesis model.
`--json`	Emit machine-readable JSON.

There is also a streaming web UI:

gainsay-web
# then open the printed local URL in your browser

Index your own documents

Gainsay ships with an empty index. The library ([B#]) tier becomes useful once you point it at documents you own. Indexing reads your files, splits them into chunks, embeds each chunk with your local Ollama embedding model, and stores the vectors in a local sqlite database — nothing is uploaded.

from gainsay import rag

# index a folder (or a single file) of YOUR documents
rag.index_path(r"/path/to/your/documents")

# sanity-check what retrieval returns
for chunk in rag.search("a question about your documents", k=5):
    print(chunk["path"], "->", chunk["text"][:120])

Re-run index_path whenever your documents change; indexing is incremental. Supported inputs include plain text, Markdown, and HTML (HTML is stripped to text before embedding).
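
To make the indexing steps concrete, here is a minimal sketch of the same chunk, embed, and store shape using only the standard library. It is not Gainsay's implementation: the chunk size, the overlap, the table layout, and the toy_embed function (a deterministic stand-in for a call to a local embedding model) are all assumptions chosen so the example runs without Ollama.

```python
import hashlib
import json
import sqlite3


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of `size` that overlap by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def toy_embed(chunk: str, dim: int = 8) -> list[float]:
    """Stand-in for a real embedding call: maps text to `dim` floats in [0, 1]."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dim]]


def index_text(db: sqlite3.Connection, path: str, text: str, embed=toy_embed) -> int:
    """Chunk `text`, embed each chunk, and store the rows in a local sqlite table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS chunks "
        "(path TEXT, position INTEGER, text TEXT, vector TEXT, "
        "PRIMARY KEY (path, position))"
    )
    # Drop old rows for this path so re-indexing a changed file leaves no stale chunks.
    db.execute("DELETE FROM chunks WHERE path = ?", (path,))
    rows = [(path, i, chunk, json.dumps(embed(chunk)))
            for i, chunk in enumerate(chunk_text(text))]
    db.executemany("INSERT INTO chunks VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return len(rows)
```

In a real setup, embed would be a function that asks the local embedding model for a vector. The hash-based stand-in carries no meaning and is only there to keep the sketch runnable.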

Bring your own corpus. Gainsay does not bundle any copyrighted material. Point it at public-domain texts, your own notes, or documents you are licensed to use.

Configuration

Gainsay reads a few optional environment variables:

Variable	Purpose	Default
`GAINSAY_MODEL`	Ollama model used for synthesis.	`gpt-oss:20b`
`GAINSAY_RERANK_MODEL`	Ollama model for reranking (falls back to `GAINSAY_MODEL`).	`GAINSAY_MODEL`
`GAINSAY_RERANK`	Set `0` to disable LLM reranking.	on
`GAINSAY_HYBRID`	Set `0` to disable hybrid (BM25 + embedding) retrieval.	on
`TRANSLATE_MODEL`	Ollama model for translating foreign-language sources.	`qwen3:14b`

The synthesis model defaults to gpt-oss:20b. On a smaller machine you can set GAINSAY_MODEL to a lighter model like llama3.1:8b. The library index path defaults to a per-user application-data directory and can be left as-is for a single-user install.
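
The fallback rules in the table can be read as a small lookup. The sketch below mirrors the documented defaults. The Settings class and load_settings function are hypothetical names, and how Gainsay treats values other than 0 for the on/off variables is an assumption here (anything except "0" counts as on).

```python
import os
from dataclasses import dataclass


def _flag(name: str, environ) -> bool:
    """Treat a variable as on unless it is set to "0" (assumed reading of the table)."""
    return environ.get(name, "1") != "0"


@dataclass(frozen=True)
class Settings:
    model: str
    rerank_model: str
    rerank: bool
    hybrid: bool
    translate_model: str


def load_settings(environ=os.environ) -> Settings:
    """Resolve the documented variables, applying the documented defaults."""
    model = environ.get("GAINSAY_MODEL", "gpt-oss:20b")
    return Settings(
        model=model,
        # The rerank model falls back to whatever the synthesis model resolved to.
        rerank_model=environ.get("GAINSAY_RERANK_MODEL", model),
        rerank=_flag("GAINSAY_RERANK", environ),
        hybrid=_flag("GAINSAY_HYBRID", environ),
        translate_model=environ.get("TRANSLATE_MODEL", "qwen3:14b"),
    )
```

Note the knock-on effect of the fallback: setting only GAINSAY_MODEL to a lighter model also changes the model used for reranking.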

Privacy model

The only outbound network traffic from the core loop is the keyword search query sent to a web search backend. You can disable even that with --no-web for fully-offline, library-only answers.
Your full question, the retrieved passages, the synthesized answer, and the disagreement analysis are all produced by your local model and never uploaded.
Web and scholarly sources are treated as untrusted data. Their text is defanged and fenced, and an injection tripwire flags suspicious passages so you can judge them with extra suspicion.
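
As an illustration of what "fenced" and "tripwire" can mean in practice, the sketch below wraps untrusted text in explicit delimiters and flags a couple of well-known injection phrases. The delimiter strings, the pattern list, and the function names are assumptions made for this example. Gainsay's actual boundary format and heuristics are not shown in this document.

```python
import re
import unicodedata

# Hypothetical delimiters marking the data boundary.
FENCE_OPEN = "<<<SOURCE_DATA"
FENCE_CLOSE = "SOURCE_DATA>>>"

# A deliberately small, illustrative pattern list; a real tripwire needs far more.
TRIPWIRE_RE = re.compile(
    r"ignore (all |any )?(previous|prior|above) instructions|disregard the system prompt",
    re.IGNORECASE,
)


def fence(tag: str, text: str) -> str:
    """Wrap untrusted source text in explicit delimiters, labeled with its tag."""
    # NFKC folds compatibility forms (e.g. fullwidth letters). It does not fold
    # cross-script lookalikes such as Cyrillic letters that resemble Latin ones.
    cleaned = unicodedata.normalize("NFKC", text)
    # Strip delimiters until none remain, so removing one occurrence cannot
    # splice a new delimiter together from the surrounding text.
    while FENCE_OPEN in cleaned or FENCE_CLOSE in cleaned:
        cleaned = cleaned.replace(FENCE_OPEN, "").replace(FENCE_CLOSE, "")
    return f"{FENCE_OPEN} [{tag}]\n{cleaned}\n{FENCE_CLOSE}"


def looks_suspicious(text: str) -> bool:
    """Return True if the text matches one of the illustrative injection phrases."""
    return bool(TRIPWIRE_RE.search(unicodedata.normalize("NFKC", text)))
```

The fence is the structural part: whatever the text says, it arrives labeled as data. The tripwire is best-effort and easy to evade, which is why it should only raise suspicion and never be treated as proof that a passage is safe.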

Acknowledgments

Gainsay was built with the help of others, and shaped by one teacher's example.

Tina Huang — data scientist and educator, whose teaching on verifying AI output rather than taking it at face value was the encouragement behind this project. Gainsay turns that lesson into structure: it cites every claim and cross-checks the answer against its own sources. Homage only — she has not reviewed or endorsed Gainsay. (youtube.com/@TinaHuang1)
Claude (Anthropic) — assisted with the engineering, the design discussions, and the drafting of this project's Constitution and documentation.
ChatGPT (OpenAI) — a second perspective for reviewing code and pressure-testing decisions along the way.

These AI tools were collaborators in the work, not its authority. Every decision about what the project promises — and what it refuses to promise — was the author's.

License

MIT — see LICENSE.

Release history

1.0.2 (Jun 26, 2026)
1.0.1 (Jun 26, 2026), this version
1.0.0 (Jun 26, 2026)

Download files

Source Distribution
gainsay-1.0.1.tar.gz (72.5 kB), uploaded Jun 26, 2026

Built Distribution
gainsay-1.0.1-py3-none-any.whl (74.3 kB), uploaded Jun 26, 2026, Python 3

File details

Details for the file gainsay-1.0.1.tar.gz.

File metadata

Download URL: gainsay-1.0.1.tar.gz
Upload date: Jun 26, 2026
Size: 72.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for gainsay-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`1478a090711d1ffce8822a978aa860fd3a5848ddb531eab8fe61927a4cc622de`
MD5	`204ef1c63a2b2dd9f96fe122402b6b6f`
BLAKE2b-256	`0ea61580e64c02f91b80e0389c091bf282bf86c1de580c2294332b7d0379a3de`

File details

Details for the file gainsay-1.0.1-py3-none-any.whl.

File metadata

Download URL: gainsay-1.0.1-py3-none-any.whl
Upload date: Jun 26, 2026
Size: 74.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for gainsay-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d4164b3635e66a3ef209a400cc7fe74c48774a0fbb44bc4dddf7bfb123eafc3a`
MD5	`0808a67527352f33cd7e6accc4a134fa`
BLAKE2b-256	`7374b35fac5e70767b6dd0b51216ba846fd7a1c9e6af6f528dc469ef1c5b1a47`
