CiteVahti — citation-integrity and provenance for research synthesis: a blinded human->AI->adjudication dual-rating workflow with decision-gated, undoable Zotero write-back (single-user, local, PubMed-only).
CiteVahti
A product of Vahtian. (Formerly developed as ZotSynth; fully renamed: the distribution, the importable package, and the CLI are all `citevahti` now; see ADR-0006.)
Status: v0.8.0 — manuscript surfaces. The ADR-0001 evidence-decision ledger is complete end to end (claim → candidate → blinded support rating → final decision → decision-gated, undoable Zotero write → de-identified warehouse), hash-chain audited, with 502 offline tests. The VS Code inline review loop highlights claim spans by state and shows an evidence card with PICO fit-checks, a citation-fit score, and the supporting excerpt; a "Change reference" flow searches PubMed and links new candidates; plus an editor-mode Citation-Integrity Report, an agent-proposes / human-accepts revision diff, and a constrained agent/MCP surface. Local-first, single-user, PubMed-only.
Positioning. CiteVahti is free and local-first for researchers, and Vahtian sells paid infrastructure to organizations that need auditable citation integrity at publication scale. The open Apache-2.0 core never paywalls a researcher's ability to verify their own manuscript; the hosted layer (ADR-0003) serves organizations with citation-risk exposure — publishers, guideline groups, institutions, and medical-communications teams.
A citation-integrity and provenance system for research synthesis. CiteVahti's core value is not autonomous reviewing; it is a documented human → AI → adjudication workflow that can be reported transparently in a methods section. Single-user, local, PubMed-only.
The human or panel is always the decider. The AI is a blinded, advisory second rater only. AI values are advisory, never decisive, and never silently propagated.
▶ New here? docs/QUICKSTART.md — zero to your first
verified citation in ~10 minutes (install → connect Zotero → verify → write).
See docs/ for the architecture, methods, safety invariants, CLI
reference, and the reviewer checklist.
Direction: the citation-integrity ledger (ADR-0001)
As of 0.4.0 the product spine is citation integrity — verify the claim before you cite it. The claim is the first-class object, and the ledger is:
manuscript claim → candidate papers → blinded claim-support rating
→ human-owned final decision → decision-gated, undoable Zotero write → audit
A validated Zotero write happens only as the terminal step of that chain
(one claim · one paper · one final accept decision · provenance · transaction ·
audit · undo) — never silently, never for a paper that doesn't support the claim.
See docs/adr/0001-citation-integrity-architecture.md
for the decision and the local-first build sequence (steps 1–6 complete), and
docs/adr/0002-ui-delivery-and-review-layer.md
for the inline [oo/o/r/d] review-layer UI direction.
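The terminal-step rule above (one claim, one paper, one final accept decision) can be sketched as a small gate. This is an illustrative sketch only, not CiteVahti's actual code: `FinalDecision`, its fields, and `write_allowed` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FinalDecision:
    """Hypothetical record of a human/panel final decision for one claim-paper pair."""
    claim_id: str
    paper_id: str
    verdict: str      # "accept", "caution", "review", or "reject"
    decided_by: str   # "human" or "panel"; an AI rating is never a decision


def write_allowed(decision: Optional[FinalDecision], claim_id: str, paper_id: str) -> bool:
    """Allow a Zotero write only for a matching, human-owned accept decision."""
    if decision is None:
        return False  # no final decision recorded yet, so nothing may be written
    return (
        decision.claim_id == claim_id
        and decision.paper_id == paper_id
        and decision.verdict == "accept"
        and decision.decided_by in {"human", "panel"}
    )


accepted = FinalDecision("c1", "p1", "accept", "human")
rejected = FinalDecision("c1", "p2", "reject", "human")

gate_open = write_allowed(accepted, "c1", "p1")
gate_closed_reject = write_allowed(rejected, "c1", "p2")
gate_closed_other_paper = write_allowed(accepted, "c1", "p2")
gate_closed_missing = write_allowed(None, "c1", "p1")
```

Keeping the gate a pure function of the recorded decision means a write can be refused before any transaction is opened.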
What CiteVahti guarantees (read first)
- Zotero local API is read-only / GET-only. CiteVahti never writes to Zotero through `/api/`; all reads go through it and nothing is mutated.
- Better BibTeX is the citation engine. Citekey resolution and export run through BBT's JSON-RPC; CiteVahti never invents citekeys.
- `.citevahti/` is the durable state layer. Config, frames, the evidence map, ratings, intake, snapshots, PRISMA ledgers, exports, and a hash-chained audit log all live there, independent of Zotero.
- PubMed (NCBI E-utilities) is the only literature-search provider, behind a pluggable interface; it is search-only and never decides inclusion.
- The AI is a blinded, advisory second rater only. It never sees the human value, never decides, and never sets the recorded value.
- The human/panel is always the decider.
- AI values never become `final_value` automatically. A discordance is resolved only by a human/panel adjudication with a rationale.
- Write-back is optional, dry-run-first, token-confirmed, and never silently falls back from the local add-on to the Web API.
- All state mutations are audit-logged in a tamper-evident, hash-chained `audit_log.jsonl`.
- Unit tests use fake seams and pass fully offline: no live Zotero, BBT, PubMed, or network writes are required to run the suite.
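To make "tamper-evident, hash-chained" concrete, here is a minimal sketch of how a hash chain detects edits. It assumes SHA-256 over the previous hash plus a canonical JSON payload; the entry layout, the `GENESIS` value, and the function names are assumptions for illustration and may differ from the real `audit_log.jsonl` format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed "previous hash" for the first entry (hypothetical)


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical payload encoding."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, payload: dict) -> dict:
    """Append a payload to an in-memory log, linking it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "payload": payload, "hash": entry_hash(prev_hash, payload)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Return True only if every entry links to its predecessor and its hash recomputes."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"event": "rating_committed", "claim_id": "c1"})
append_entry(log, {"event": "decision_recorded", "claim_id": "c1"})
intact = verify_chain(log)

# Editing an earlier payload without recomputing hashes is detected.
tampered = [dict(e) for e in log]
tampered[0] = {**tampered[0], "payload": {"event": "rating_committed", "claim_id": "c2"}}
tampered_ok = verify_chain(tampered)
```

Because each hash covers the previous one, changing any entry invalidates that entry and every later link, which is the property a command like `citevahti verify-audit` would check.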
Probe, not proof
The expected runtime (Zotero 9.x local API on macOS, Better BibTeX) is not assumed. On startup CiteVahti probes and caches each capability with a remediation string, and reports a capability available only after a successful probe. The three version types are kept strictly distinct and never confused:
- Zotero app version: from the `x-zotero-version` header (e.g. `9.0.4`).
- Zotero local-API schema version: `zotero-schema-version` (e.g. `42`); never surfaced as the app version.
- Better BibTeX add-on version: from BBT's `api.ready` response (e.g. `9.0.27`); read live, never hardcoded, never taken from the app-version header.
localhost is used uniformly (the /api/ path checks Host: localhost:23119).
If a backend is absent, the relevant tools degrade honestly with a remediation
string rather than failing silently or fabricating data.
citevahti init # create the .citevahti/ state layer
citevahti probe # probe Zotero /api/, BBT api.ready, CAYW probe=1
citevahti verify-audit # check the hash-chained audit log
# (the legacy `citevahti` command still works as an alias)
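The probe-then-report idea can be sketched as follows. This is a hedged illustration rather than the project's implementation: the `Capability` record, `probe_zotero`, the injected header fetcher, and the remediation wording are all hypothetical; only the two header names and the `localhost:23119` `/api/` endpoint come from the text above.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass(frozen=True)
class Capability:
    """Hypothetical cached probe result; versions are kept in separate fields."""
    name: str
    available: bool
    app_version: Optional[str] = None     # from x-zotero-version
    schema_version: Optional[str] = None  # from zotero-schema-version
    remediation: Optional[str] = None


# The fetcher returns response headers or raises OSError. Injecting it (a "fake seam")
# keeps this sketch runnable offline.
HeaderFetcher = Callable[[str], Mapping[str, str]]


def probe_zotero(fetch_headers: HeaderFetcher,
                 url: str = "http://localhost:23119/api/") -> Capability:
    """Report the capability as available only after a successful probe."""
    try:
        headers = {k.lower(): v for k, v in fetch_headers(url).items()}
    except OSError:
        return Capability("zotero_local_api", False,
                          remediation="Start Zotero and enable the local API, then probe again.")
    app = headers.get("x-zotero-version")
    schema = headers.get("zotero-schema-version")
    if app is None:
        # Never substitute the schema version for the app version.
        return Capability("zotero_local_api", False, schema_version=schema,
                          remediation="Response lacked x-zotero-version; check the endpoint.")
    return Capability("zotero_local_api", True, app_version=app, schema_version=schema)


def fake_running(url: str) -> Mapping[str, str]:
    return {"X-Zotero-Version": "9.0.4", "Zotero-Schema-Version": "42"}


def fake_offline(url: str) -> Mapping[str, str]:
    raise ConnectionRefusedError("nothing listening on 23119")


up = probe_zotero(fake_running)
down = probe_zotero(fake_offline)
```

A failed probe carries a remediation string instead of a guessed version, so downstream tools can degrade honestly.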
Architecture (three stores + PubMed)
- Zotero local API (read-only) — items / attachments / collections / full text / annotations.
- Better BibTeX (JSON-RPC + CAYW) — stable citekeys, citation insertion, export.
- PubMed via NCBI E-utilities — the only online search provider; search-only.
- `.citevahti/` local state — durable provenance layer with a hash-chained audit log.
Details: docs/ARCHITECTURE.md.
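As a rough illustration of the search-only PubMed provider, the sketch below builds an NCBI E-utilities `esearch` request URL without sending it. The endpoint and the `db`, `term`, `retmode`, `retmax`, `email`, and `api_key` parameters are standard E-utilities parameters; the function name `build_esearch_url` and its defaults are hypothetical and not taken from CiteVahti.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(query: str, email: str,
                      api_key: Optional[str] = None, retmax: int = 20) -> str:
    """Build (but do not send) a PubMed esearch request; a search returns IDs only."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmode": "json",
        "retmax": str(retmax),
        "email": email,
    }
    if api_key:
        params["api_key"] = api_key  # optional; omitted entirely when not configured
    return f"{ESEARCH}?{urlencode(params)}"


url = build_esearch_url(
    "low-dose CT lung cancer screening mortality randomized",
    email="you@uni.edu",
)
parsed = parse_qs(urlsplit(url).query)
```

Building the URL separately from sending it keeps the request inspectable and easy to test offline.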
The blinded dual-rating method
Human commits blind → AI rates blind to the human value (may abstain) →
the system compares (concordant→accepted / discordant→needs_adjudication
/ ai_abstained / human_only) → a human/panel adjudicates every
discordance → the recorded final_value is always human/panel-sourced.
This maps onto transparent AI-in-evidence-synthesis reporting (PRISMA 2020 /
PRISMA-trAIce; RAISE; the Cochrane/Campbell/JBI/CEE position on human oversight).
CiteVahti records and reports what was done; it does not claim compliance
with, or endorsement by, any guideline. Full description:
docs/METHODS.md.
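The comparison step described at the start of this section can be sketched as below. The four outcome labels are the ones named above; the function names, signatures, and the `ai_ran` flag are assumptions made for illustration, not the project's API.

```python
from enum import Enum
from typing import Optional


class Outcome(str, Enum):
    ACCEPTED = "accepted"                      # concordant
    NEEDS_ADJUDICATION = "needs_adjudication"  # discordant
    AI_ABSTAINED = "ai_abstained"
    HUMAN_ONLY = "human_only"


def compare(human_value: str, ai_value: Optional[str], ai_ran: bool = True) -> Outcome:
    """Compare a committed human rating with the AI's blind second rating."""
    if not ai_ran:
        return Outcome.HUMAN_ONLY
    if ai_value is None:
        return Outcome.AI_ABSTAINED
    if ai_value == human_value:
        return Outcome.ACCEPTED
    return Outcome.NEEDS_ADJUDICATION


def final_value(human_value: str, outcome: Outcome,
                adjudicated_value: Optional[str] = None,
                rationale: Optional[str] = None) -> str:
    """Return the recorded value; it is always human/panel-sourced, never the AI's."""
    if outcome is Outcome.NEEDS_ADJUDICATION:
        if adjudicated_value is None or not rationale:
            raise ValueError("a discordance needs a human/panel adjudication with a rationale")
        return adjudicated_value
    return human_value


concordant = compare("Moderate", "Moderate")
discordant = compare("Moderate", "Low")
abstained = compare("Moderate", None)
no_ai = compare("Moderate", None, ai_ran=False)
resolved = final_value("Moderate", discordant, adjudicated_value="Moderate",
                       rationale="Panel judged imprecision not serious.")
```

Note that `final_value` never takes the AI rating as an input, which is one simple way to guarantee it cannot be propagated silently.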
Schemes (recorded, not computed)
- Primary: GRADE certainty at the outcome / body-of-evidence level (High | Moderate | Low | Very Low).
- Secondary: RoB 2 / ROBINS-I at the study (or study × outcome) level. ROBINS-I "No information" is missing-like, not an ordinal point.
CiteVahti records human-chosen values and the AI's blind second rating; it never computes GRADE and never runs RoB signalling questions.
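One way to honour "missing-like, not an ordinal point" when values are later compared (for example in agreement reporting) is to map ratings to ranks and let "No information" map to no rank at all. The sketch below is an assumption about how that could look; the constant and function names are hypothetical. It only ranks recorded values and does not compute GRADE or run any signalling questions.

```python
from typing import Optional

# Recorded GRADE certainty levels, ordered from lowest to highest.
GRADE_LEVELS = ("Very Low", "Low", "Moderate", "High")

# ROBINS-I risk-of-bias judgements that form an ordinal scale.
ROBINS_I_LEVELS = ("Low", "Moderate", "Serious", "Critical")
ROBINS_I_MISSING = "No information"


def grade_rank(value: str) -> int:
    """Ordinal position of a recorded GRADE level; raises ValueError if unknown."""
    return GRADE_LEVELS.index(value)


def robins_i_rank(value: str) -> Optional[int]:
    """Ordinal position of a ROBINS-I judgement, or None for the missing-like value."""
    if value == ROBINS_I_MISSING:
        return None  # treated as missing, never as a fifth point on the scale
    return ROBINS_I_LEVELS.index(value)


high = grade_rank("High")
serious = robins_i_rank("Serious")
missing = robins_i_rank("No information")
```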
Scope boundary
Owns: citation integrity, citekey/export, annotation provenance, PubMed staging, assistive extraction, claim support, human-chosen quality/GRADE recording, blinded AI second-rating + adjudication records, multi-rater agreement reporting, evidence-map exports, snapshots, corpus diffs, retraction staleness, PRISMA tallying, agreement/provenance reporting, audit, guarded write-back.
Does not: design search strategies, decide inclusions, replace screening platforms, run RoB / ROBINS-I signalling questions, compute GRADE, perform meta-analysis, generate recommendations, or author the review.
Setup
# with uv
uv venv && uv pip install -e ".[dev]"
# or pipx for the CLI
pipx install .
pytest # 502 tests, fully offline
bash scripts/final_smoke.sh # pytest + probe + verify-audit (no writes)
# build + install the VS Code inline review extension
cd vscode-extension && npm install && npm run package
code --install-extension citevahti-0.8.0.vsix
Config via environment (NCBI_EMAIL, NCBI_API_KEY) + .citevahti/config.json.
CLI reference: docs/CLI.md. Full walk-through (zero → first
verified citation): docs/QUICKSTART.md.
Try it (what to do)
A five-minute path through the inline review layer. Full version with copy-paste
commands: docs/QUICKSTART.md.
- Install + build the extension: the two blocks under Setup (`pip install -e`, then `npm run package` + `code --install-extension`). In VS Code, set `citevahti.cliPath` to your `citevahti` binary (e.g. `.venv/bin/citevahti`).
- Create a project + connect Zotero (optional, for write-back):
    citevahti init
    citevahti onboard --ncbi-email you@uni.edu --no-zotero-key --skip-validate
    citevahti connect-zotero # one-paste key flow; stored in your OS keychain
- Add a claim from your manuscript, find evidence, link it:
    citevahti claim-add --text "Low-dose CT screening reduces lung-cancer mortality in high-risk populations." --type effectiveness
    citevahti literature-search --query "low-dose CT lung cancer screening mortality randomized" --question-id q1
    citevahti claim-link-candidates --claim-id <CLAIM_ID> --intake-batch-id <BATCH_ID>
- Review in VS Code: open the manuscript, run Command Palette → “CiteVahti: Verify claims.” Claims are highlighted by state. Expand one, focus a candidate, and:
  - read the evidence card: supporting excerpt, PICO fit-checks (Population / Intervention / Outcome / Claim), and the citation-fit score (`n/8`, Strong / Moderate / Weak);
  - press the verdict: `o o` accept, `o` caution, `r` review, `d` reject. (The AI's rating stays hidden until you rate; blinding is real.)
  - on a weak claim, click “⇄ Change reference…” to search PubMed and add a better-fitting paper as a new candidate;
  - on an accepted candidate, click “✓ Add to Zotero” → preview → confirm → done, with Undo.
Nothing is written to Zotero, and no claim text is edited, without an explicit confirm — every write is previewed, audited, and undoable.
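The preview → confirm → undo sequence can be pictured with a small in-memory sketch. It is an illustration under stated assumptions, not the real write path: the library is a plain dict standing in for Zotero, and `preview`, `commit`, `undo`, and the token scheme (a short SHA-256 digest of the planned change) are hypothetical.

```python
import hashlib
import json


def preview(change: dict) -> dict:
    """Dry run: describe the planned write and derive a confirmation token from it."""
    canonical = json.dumps(change, sort_keys=True).encode("utf-8")
    return {"change": change, "token": hashlib.sha256(canonical).hexdigest()[:12]}


def commit(library: dict, plan: dict, token: str) -> dict:
    """Apply the previewed change only when the caller echoes the plan's token."""
    if token != plan["token"]:
        raise PermissionError("write refused: confirmation token does not match the preview")
    key = plan["change"]["item_key"]
    library[key] = plan["change"]["fields"]
    return {"undo_key": key}  # receipt that makes the write reversible


def undo(library: dict, receipt: dict) -> None:
    """Remove the item that a committed write added."""
    library.pop(receipt["undo_key"], None)


library = {}  # in-memory stand-in for a Zotero library
plan = preview({"item_key": "ITEM1", "fields": {"title": "Example trial report"}})

try:
    commit(library, plan, token="wrong-token")
    refused = False
except PermissionError:
    refused = True

receipt = commit(library, plan, token=plan["token"])
written = "ITEM1" in library
undo(library, receipt)
removed = "ITEM1" not in library
```

Deriving the token from the previewed change means a confirmation is only valid for exactly the write the user saw.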
What to test
To verify a checkout behaves as documented:
pytest # 502 offline tests — no Zotero/BBT/PubMed/network needed
bash scripts/final_smoke.sh # pytest + probe + verify-audit, no writes
cd vscode-extension && npm install && npm run compile && npm run package # extension builds → .vsix
Then a manual acceptance pass in VS Code (after CiteVahti: Verify claims):
- Highlighting: each claim is decorated by its state (`oo / o / r / d`), and the overview ruler shows the same colors.
- Blinding: before you rate, the card shows the AI support as hidden; it appears only after you commit your own rating.
- Evidence card: a rated candidate shows the excerpt, the four PICO fit-checks, and a citation-fit score (`n/8`).
- Keyboard verdicts: `o o` records accept, `o` caution, `r` review, `d` reject; each prompts for an audit reason.
- Change reference: “⇄ Change reference…” runs a PubMed search, lets you pick results, and the new candidates appear on the claim after refresh.
- Write-back: “✓ Add to Zotero” previews the change and asks to confirm; after committing, the Undo action removes it again.
Safety invariants are also asserted by the suite —
docs/SAFETY_INVARIANTS.md and
docs/REVIEW_CHECKLIST.md.
Build status
Built in nine reviewed steps; see CHANGELOG.md. Every step is a
separate branch with its own commit. Safety invariants are enforced in code and
asserted by the test suite — see docs/SAFETY_INVARIANTS.md
and docs/REVIEW_CHECKLIST.md.
License
Apache License 2.0 — see LICENSE and NOTICE. The library, CLI, MCP agent surface, and VS Code extension are all Apache-2.0.
File details
Details for the file citevahti-0.8.0.tar.gz.
File metadata
- Download URL: citevahti-0.8.0.tar.gz
- Upload date:
- Size: 160.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c5f4d8223e8daa107d4cd26f08b7ab92e7ba9b7d14d1a26becde0e6ef64d1c56 |
| MD5 | 47a5a25672cd660b13b82f8cc055cd37 |
| BLAKE2b-256 | f9799e5b68ec3ca6adeb925beeecf4cd7cfbb773345237de38ec007b54f13c5e |
File details
Details for the file citevahti-0.8.0-py3-none-any.whl.
File metadata
- Download URL: citevahti-0.8.0-py3-none-any.whl
- Upload date:
- Size: 203.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 53ba1ac4161c0a044db7dfeba8adbaee9ca714684b8b6a6b14dab65e03e8ad1a |
| MD5 | c51f5ccd22134a9e3c4698b4c782722c |
| BLAKE2b-256 | 29a2304358a6674309b55292135531ceaca041fd0c43be94b07c3025ab35ce78 |