Per-author voice calibration: score AI tells and rewrite drafts to your natural voice.
Status: alpha (v0.1.0). Scoring, the iterative rewriter, and incremental corpus capture are working. PyPI release pending. Use it as a Python library or via the callus CLI.
Why this exists
I write in English as a second language. I tested four commercial AI detectors on my published blog posts. Each of them returned over 90% AI on prose I had written, edited, and corrected myself. A peer-reviewed paper from Stanford in 2023 explained why: perplexity-based detectors flag non-native English writing as AI-generated at a false positive rate of about 61%, because the same features that mark "AI-like text" (limited vocabulary, common collocations) also describe how most non-native speakers write.
So the score number was useless to me as a metric for iteration. If I rewrote a paragraph to remove the actual AI tells — the aphorisms, the hinge phrases, the triplet negations — the score barely moved. The detector was measuring my passport, not my prose.
callus does the other thing. It compares your draft against your own raw writing — extracted from your Claude Code sessions or any other source where you typed unedited — and against a small library of AI tells. The score is "how far is this draft from your voice + how dense are the tells", not "what is the probability this came from an LLM". That distinction is the whole point.
What it does
Three operations, all calibrated against you, not against a generic native-English baseline:
- callus score <file>: Returns a 0-100 score with a per-axis breakdown (voice_distance, tells_density, structural_ai_patterns), the language detected (EN/ES), and concrete tells cited verbatim from the draft with suggested fixes.
- callus rewrite <file> --target 25: Iteratively rewrites the draft using your voice corpus as few-shot context. Stops when it hits the target or starts degrading (early-stop on degrade). Preserves claims, numbers, and links; allows paragraph restructuring. Typical run: 2-3 iterations, $0.01-0.02 USD on Haiku.
- callus build-corpus --source <dir>: Extracts your raw user-typed prompts from Claude Code session logs and applies thirteen calibrated filters (drops pastes, command dumps, Codex reviews, dashboard copy, emoji-heavy reviewer output) so what ends up in the corpus is actually your voice, not your assistant's.
A fourth piece, callus approve <pending.md>, merges new candidates from incremental capture (see hook setup) after you mark each one OK / NO / MEH.
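The rewrite loop's stopping rule (stop at the target, or stop as soon as a rewrite scores worse) can be sketched in a few lines. This is a minimal illustration, not the code in callus.rewrite: the function name `rewrite_until_target`, its parameters, and the stand-in scorer and rewriter are all hypothetical, and it assumes that a lower score is better.

```python
from typing import Callable, List, Tuple


def rewrite_until_target(
    draft: str,
    score: Callable[[str], float],
    rewrite: Callable[[str], str],
    target: float = 25.0,
    max_iters: int = 5,
) -> Tuple[str, float, List[float]]:
    """Rewrite until the score reaches `target` or stops improving.

    Assumes lower scores are better. Returns the best draft seen,
    its score, and the full score history.
    """
    best_draft, best_score = draft, score(draft)
    history = [best_score]
    for _ in range(max_iters):
        if best_score <= target:
            break  # target reached
        candidate = rewrite(best_draft)
        candidate_score = score(candidate)
        history.append(candidate_score)
        if candidate_score >= best_score:
            break  # early stop: the rewrite did not improve the draft
        best_draft, best_score = candidate, candidate_score
    return best_draft, best_score, history


# Stand-in scorer and rewriter so the sketch runs without an LLM call.
fake_scores = {"v0": 60.0, "v1": 40.0, "v2": 22.0}
fake_next = {"v0": "v1", "v1": "v2"}

final, final_score, history = rewrite_until_target(
    "v0", fake_scores.__getitem__, fake_next.__getitem__
)
print(final, final_score, history)  # v2 22.0 [60.0, 40.0, 22.0]
```

Keeping the best draft seen, rather than the latest one, is what makes the early stop safe: a degrading iteration is discarded instead of returned.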
Quick start
pip install -e . # PyPI pending; for now install editable
callus --version
# Extract your raw voice from Claude Code sessions
callus build-corpus --source ~/.claude/projects/your-project
# Score a draft
callus score path/to/draft.md
# Rewrite a draft toward your voice
callus rewrite path/to/draft.md --target 25 --out path/to/draft.rewritten.md
You need a working claude CLI on your PATH (the package shells out to claude -p --model haiku for scoring and rewriting).
Why not just use GPTZero / Originality / Humalingo?
I ran the same blog post through Humalingo. It scored 91% AI. I ran the LessWrong submission of the same content, which has measurably more AI tells (hinge phrases, triplet negations, defensive clarifications), through Humalingo as well. It scored 92%.
A one-point delta between a draft with four BLOCK-severity tells and a draft with one. The classifier cannot see the difference between "clean voice" and "voice plus tells" within the cluster of AI-assisted writing. The custom judge in callus scored the same two drafts at 20 and 33 — a thirteen-point delta that maps onto what a human moderator would actually read for.
The Stanford 2023 result on non-native English bias (arXiv:2304.02819) explains why. Classifier-based detectors lean on perplexity, which is largely a function of vocabulary and collocation distribution. Native English essayists tend to produce higher-perplexity text. Non-native writers and AI both produce lower-perplexity text, so a perplexity threshold puts them in the same bucket. The detector cannot tell them apart structurally.
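To make the perplexity argument concrete, here is a toy sketch with a smoothed unigram model. It is not how any commercial detector works (those use large language models), and the reference text and both sample sentences are invented for illustration. It only shows the mechanism: a sentence built from common words gets a lower perplexity than one built from rare words, regardless of who wrote it.

```python
import math
from collections import Counter
from typing import Dict, List


def unigram_model(tokens: List[str]) -> Dict[str, float]:
    """Estimate add-one-smoothed unigram probabilities from a reference text."""
    counts = Counter(tokens)
    vocab_size = len(counts) + 1  # one extra slot for unseen words
    total = sum(counts.values()) + vocab_size
    probs = {word: (n + 1) / total for word, n in counts.items()}
    probs["<unk>"] = 1 / total
    return probs


def perplexity(tokens: List[str], probs: Dict[str, float]) -> float:
    """exp of the average negative log-probability per token."""
    log_sum = sum(math.log(probs.get(t, probs["<unk>"])) for t in tokens)
    return math.exp(-log_sum / len(tokens))


reference = "the model is good and the result is good and the test is good".split()
probs = unigram_model(reference)

common = "the result is good".split()                # frequent words only
rare = "the outcome proved serendipitous".split()    # mostly unseen words

print(perplexity(common, probs) < perplexity(rare, probs))  # True
```

A writer with a smaller working vocabulary lands on the "common" side of this comparison, which is the same side a language model's output lands on.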
callus does not try to. It measures something else: distance from a specific writer's voice, defined by that writer's own raw text. There is no claim of universality; there is a claim of usefulness to the operator.
If you want to score against the generic native-English baseline, use Humalingo. If you want to iterate on a draft so it reads more like the way you actually write, use this.
More detail in docs/why_not_classifier.md.
Setting up your voice
callus ships without a corpus on purpose. The whole architecture only works if the corpus is yours.
- Build the corpus from your Claude Code sessions: callus build-corpus --source <path>.
- Sample-review the first 16 entries by hand. The filters drop most contamination, but you should know what is in your corpus.
- Write a voice profile by copying cookbook/profile_template.md and editing the rules to match how you actually write. The default tells_ai.md is generic; the profile is yours.
- Score and iterate.
Full walkthrough: docs/setup_your_voice.md.
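For the sample-review step, a short script can print the first 16 entries. This sketch assumes the corpus is a JSONL file with one JSON object per line and a "text" field per record. The field name and the helper `first_entries` are assumptions for illustration, so check the actual schema of your corpus file before relying on it. The demo writes a throwaway file so the block runs on its own.

```python
import json
import tempfile
from itertools import islice
from pathlib import Path
from typing import Iterator


def first_entries(corpus_path: Path, limit: int = 16) -> Iterator[dict]:
    """Yield the first `limit` non-empty JSONL records from the corpus file."""
    with corpus_path.open(encoding="utf-8") as handle:
        records = (json.loads(line) for line in handle if line.strip())
        yield from islice(records, limit)


# Demo on a temporary file with 20 fake entries ("text" is an assumed field).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "voice_corpus.jsonl"
    demo.write_text(
        "\n".join(json.dumps({"text": f"entry {i}"}) for i in range(20)),
        encoding="utf-8",
    )
    sample = list(first_entries(demo))

for number, entry in enumerate(sample, start=1):
    print(number, entry.get("text", "")[:80])
```

Point `first_entries` at your real corpus path to review your own entries.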
Incremental capture (optional)
If you want the corpus to grow automatically every time you close a session in Claude Code, wire a hook:
"UserPromptSubmit": [
  {
    "hooks": [
      {
        "type": "command",
        "command": "python /path/to/callus/callus/hook_close.py 2>/dev/null || true"
      }
    ]
  }
]
The hook watches for closing phrases ("cerramos", "guardar memoria", "listo por hoy", "session close") in your prompts. When it sees one, it extracts the session's user messages, applies the same thirteen filters as build-corpus, deduplicates against your existing corpus, and writes a pending review file. Nothing gets merged without you running callus approve.
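The detection and deduplication steps described above could look roughly like this. It is a sketch under assumptions, not the contents of hook_close.py: the substring match, the normalization used for deduplication, and the names `is_closing_prompt`, `fingerprint`, and `new_candidates` are all hypothetical. The thirteen filters are left out.

```python
import hashlib
import re
from typing import Iterable, List, Set

# Closing phrases listed in this README.
CLOSING_PHRASES = ("cerramos", "guardar memoria", "listo por hoy", "session close")


def is_closing_prompt(prompt: str) -> bool:
    """True when the prompt contains a closing phrase (case-insensitive)."""
    lowered = prompt.lower()
    return any(phrase in lowered for phrase in CLOSING_PHRASES)


def fingerprint(text: str) -> str:
    """Hash of the text with case and whitespace normalized, used for dedup."""
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def new_candidates(messages: Iterable[str], existing: Iterable[str]) -> List[str]:
    """Keep messages not already in the corpus and drop repeats within the batch."""
    seen: Set[str] = {fingerprint(text) for text in existing}
    fresh = []
    for message in messages:
        key = fingerprint(message)
        if key not in seen:
            seen.add(key)
            fresh.append(message)
    return fresh


corpus = ["hello world"]
session = ["Hello   world", "a new thought", "a new thought", "ok, listo por hoy"]

if is_closing_prompt(session[-1]):
    pending = new_candidates(session[:-1], corpus)
    print(pending)  # ['a new thought']
```

Hashing a normalized form keeps the dedup check cheap and ignores trivial differences in spacing or capitalization.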
How it is built
                ┌──────────────────────────────────────┐
                │             callus.score             │
                │   LLM-as-judge, multi-axis, EN+ES    │
                └───────────────┬──────────────────────┘
                                │
        ┌───────────────────────┴─────────────────────────┐
        │                                                 │
┌───────▼────────┐                              ┌─────────▼──────────┐
│ callus.rewrite │    ←── few-shot voice ──→    │ callus.build_corpus│
│   iterative    │      voice_corpus.jsonl      │   F1-F13 filters   │
│      loop      │                              │    (calibrated)    │
└────────────────┘                              └────────────────────┘
The judge prompt sees four things on every call: your voice profile, a generic tells_ai library, six rotating raw-voice samples from your corpus, and the draft. It returns strict JSON with axis scores and verbatim citations. The rewriter feeds those citations back into a follow-up call that asks the LLM to produce a voice-translated draft while preserving every quantitative claim and link.
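Because the judge returns strict JSON with verbatim citations, the caller can validate the response before trusting it. The sketch below assumes a response shape that this README does not specify: the three axis names come from the score breakdown above, while the "tells" list and its "quote" and "fix" fields are hypothetical. It rejects responses with missing axes and drops any citation that does not appear word for word in the draft.

```python
import json
from typing import Any, Dict, List

# Axis names taken from the score breakdown in this README.
AXES = ("voice_distance", "tells_density", "structural_ai_patterns")


def parse_judge_response(raw: str, draft: str) -> Dict[str, Any]:
    """Parse the judge's JSON and keep only citations found verbatim in the draft."""
    data = json.loads(raw)  # raises json.JSONDecodeError on non-JSON output
    missing = [axis for axis in AXES if axis not in data]
    if missing:
        raise ValueError(f"judge response is missing axes: {missing}")
    # "tells", "quote" and "fix" are assumed field names, not the real schema.
    tells: List[Dict[str, str]] = data.get("tells", [])
    data["tells"] = [t for t in tells if t.get("quote") and t["quote"] in draft]
    return data


draft = "At the end of the day, it works."
raw = json.dumps({
    "voice_distance": 20,
    "tells_density": 30,
    "structural_ai_patterns": 10,
    "tells": [
        {"quote": "At the end of the day", "fix": "cut the aphorism"},
        {"quote": "a phrase the draft never used", "fix": "n/a"},
    ],
})

result = parse_judge_response(raw, draft)
print(len(result["tells"]))  # 1
```

Dropping unverifiable citations matters for the rewriter, since a hallucinated quote fed back into the follow-up call would ask the model to fix text that is not in the draft.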
The bias correction for non-native EN is built into the prompt instructions, not as a post-hoc adjustment.
When NOT to use callus
- You do not have a corpus of your own writing. The tool calibrates against you; without a corpus, you are scoring against nothing.
- You want a generic "is this AI" detector for a third party's writing. Use a commercial classifier; that is what they are calibrated for.
- The draft is shorter than a hundred words. The signal-to-noise ratio is too low; iterate by hand.
Roadmap
- PyPI release of v0.1.0
- Hooks for closing-session detection across editors beyond Claude Code
- Embeddings-based similarity layer as an optional add-on for stronger personal calibration
- Multilingual corpus mixing rules (current default is single-language per corpus)
Contributing
Issues and pull requests welcome. The interesting work right now is on calibrating the F-filters for other languages and on writing more eval sets so the rewriter's convergence behavior can be measured across more domains.
git clone https://github.com/VDP89/callus
cd callus
pip install -e ".[dev]"
pytest -q
License
Apache 2.0 — see LICENSE.