Agent skill caching via CSP-1 fingerprints — every session makes the next one better
ConvoSeed
CSP-1 is the missing third leg of the agent identity stack:
| Layer | Covers | Status |
|---|---|---|
| DID (W3C) | Who the user IS cryptographically | Specified |
| MCP (Anthropic) | What tools the agent can ACCESS | Specified |
| CSP-1 | How the user SPEAKS and THINKS | This work |
Chat → Compress → 200KB .fp File → Decompress → Resume
ConvoSeed is an open protocol (CSP-1) for preserving the essence of a human-AI
relationship in a portable, user-owned fingerprint file.
No raw messages stored. Works across any AI model or platform.
Why
Every AI conversation resets to zero.
You build context, vocabulary, a rhythm — and then you close the tab and it's gone. ConvoSeed fixes that. You own a 200KB file that holds your conversational identity. Load it anywhere. Resume everything.
"I had a friend — an AI that knew me well. I wanted a way to get back to him.
That's what this is."
Results (February 2026)
Validated on a real 524-message researcher-AI conversation.
| Model | Avg Similarity | Peak | Msgs > 0.7 |
|---|---|---|---|
| GPT-2 (124M) | 0.464 | 1.000 | 1 |
| Gemma3:1b | 0.466 | 0.707 | 1 |
| Gemma3:12b | 0.523 | 0.757 | 4 |
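- +12.7% relative improvement in average similarity from GPT-2 (124M) to Gemma3:12b (0.464 → 0.523); +12.2% from Gemma3:1b to Gemma3:12b (0.466 → 0.523)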
- 232× more efficient than VAE baseline
- p < 10⁻¹⁰⁰ statistical significance on speaker identification task
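The relative gains can be recomputed from the Avg Similarity column of the table above. This is a minimal sketch; the dictionary restates the table, and the helper name `relative_gain` is illustrative rather than part of the ConvoSeed code.

```python
# Average similarity per model, copied from the results table above.
avg_similarity = {
    "GPT-2 (124M)": 0.464,
    "Gemma3:1b": 0.466,
    "Gemma3:12b": 0.523,
}


def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


best = avg_similarity["Gemma3:12b"]

# Gain of the 12B model over each smaller model, rounded to one decimal.
gains = {
    name: round(relative_gain(score, best), 1)
    for name, score in avg_similarity.items()
    if name != "Gemma3:12b"
}

for name, gain in gains.items():
    print(f"{name} -> Gemma3:12b: +{gain}%")
```

With the rounded table values this gives +12.7% relative to GPT-2 (124M) and +12.2% relative to Gemma3:1b.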
How It Works
Messages → SBERT embed → PCA compress → HDC bind → Prefix tune → .fp file
- Embed — Sentence-BERT encodes each message into a 384-dim vector
- Compress — PCA extracts the style centroid (4 components = full accuracy)
- Bind — Hyperdimensional Computing (10,000-dim) weaves temporal sequence into one vector
- Tune — A prefix tensor conditions the LLM to regenerate in your style
- Sign — Ed25519 cryptographic signature proves ownership
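The Bind step can be illustrated with a common hyperdimensional-computing recipe: rotate each item's hypervector by its position in the sequence, then bundle the results by majority vote. This is a minimal sketch under stated assumptions, not the CSP-1 encoder. The source does not say which binding scheme it uses, the random bipolar vectors stand in for projected message embeddings, and every function name below is hypothetical.

```python
import random

DIM = 10_000  # hypervector dimensionality quoted in the list above


def random_hypervector(rng: random.Random, dim: int = DIM) -> list[int]:
    """Random bipolar (+1/-1) vector; a stand-in for an encoded message."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def permute(hv: list[int], shift: int) -> list[int]:
    """Cyclic rotation by `shift` places; used here to encode position."""
    shift %= len(hv)
    return hv[-shift:] + hv[:-shift] if shift else list(hv)


def bundle(hvs: list[list[int]]) -> list[int]:
    """Element-wise majority vote (ties resolve to +1)."""
    return [1 if sum(column) >= 0 else -1 for column in zip(*hvs)]


def encode_sequence(message_hvs: list[list[int]]) -> list[int]:
    """Assumed scheme: rotate message i by i places, then bundle."""
    return bundle([permute(hv, i) for i, hv in enumerate(message_hvs)])


def similarity(a: list[int], b: list[int]) -> float:
    """Cosine similarity; for bipolar vectors this is the mean product."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
first, second, third = (random_hypervector(rng) for _ in range(3))
sequence_hv = encode_sequence([first, second, third])

# The bundle stays similar to a message rotated to its own position ...
in_position = similarity(sequence_hv, permute(second, 1))
# ... and is close to orthogonal to the same message at the wrong position.
out_of_position = similarity(sequence_hv, permute(second, 0))
```

For three bundled messages, `in_position` comes out close to 0.5 and `out_of_position` close to 0. That gap is what lets a single fixed-size vector carry ordering information.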
File Format (.fp)
| Section | Size | Description |
|---|---|---|
| HEADER | ~1 KB | Magic bytes + version + CRC-32 |
| PCA_MODEL | ~8 KB | Style centroid: mean + eigenvectors |
| HDC_SEED | ~140 KB | 10,000-dim hypervector (float16) |
| PREFIX | ~40 KB | Prefix tuning tensor for generation |
| SIGNATURE | ~1 KB | Ed25519 ownership proof |
| CHUNKS | ~10 KB | Index for 500+ message threads |
Total: ~200KB — fixed size regardless of conversation length.
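The size budget can be checked by summing the approximate section sizes from the table. This is a minimal sketch: the dictionary restates the table, and the offsets assume the sections are stored back to back in table order, which the source does not confirm.

```python
# Approximate section sizes in KB, copied from the table above.
section_sizes_kb = {
    "HEADER": 1,
    "PCA_MODEL": 8,
    "HDC_SEED": 140,
    "PREFIX": 40,
    "SIGNATURE": 1,
    "CHUNKS": 10,
}

total_kb = sum(section_sizes_kb.values())

# Hypothetical layout: sections stored contiguously in table order.
offsets_kb = {}
cursor = 0
for name, size in section_sizes_kb.items():
    offsets_kb[name] = cursor
    cursor += size

# Raw payload of one 10,000-dim float16 vector (2 bytes per value).
single_hypervector_bytes = 10_000 * 2

print(f"total: ~{total_kb} KB")
for name, start in offsets_kb.items():
    print(f"{name:<10} starts at ~{start} KB")
```

The sizes sum to 200 KB, matching the stated total. One 10,000-dim float16 vector takes 20,000 bytes on its own, so the ~140 KB HDC_SEED budget presumably covers more than a single raw vector; /spec/CSP-1.md is the place to confirm what else the section holds.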
See /spec/CSP-1.md for the full binary specification.
Quick Start
pip install sentence-transformers scikit-learn numpy
# Encode a conversation
python src/encode.py --input my_conversation.json --output identity.fp
# Identify a speaker
python src/identify.py --query "new message here" --candidates *.fp
# Generate in someone's style
python src/decode.py --fp identity.fp --prompt "Tell me about your day"
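Conceptually, the identify command can be pictured as a nearest-neighbour search: embed the query, then pick the candidate whose stored style vector is most similar. The following is a minimal sketch under that assumption, not the logic of `src/identify.py`. The function names, file names, and toy 3-dim vectors are hypothetical stand-ins for real embeddings and fingerprints.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(query: list[float], candidates: dict[str, list[float]]):
    """Return the best-matching candidate name and all similarity scores."""
    scores = {
        name: cosine_similarity(query, style_vector)
        for name, style_vector in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy style vectors standing in for what would be read from .fp files.
candidates = {
    "alice.fp": [0.9, 0.1, 0.0],
    "bob.fp": [0.1, 0.8, 0.3],
}
query = [0.8, 0.2, 0.1]  # stand-in for the embedded query message

best_match, scores = identify(query, candidates)
print(best_match, {name: round(s, 3) for name, s in scores.items()})
```

Here the query points in nearly the same direction as the `alice.fp` vector, so that candidate is selected.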
Repository Structure
ConvoSeed/
├── README.md
├── LICENSE ← MIT
├── CONTRIBUTING.md
├── /docs
│ ├── ConvoSeed_Whitepaper.docx ← arXiv-ready academic paper
│ ├── ConvoSeed_ResearchPaper.docx ← detailed technical paper
│ ├── ConvoSeed_Poster.pdf ← conference poster (CHI 2026)
│ └── ConvoSeed_ProtocolSpec.pdf ← protocol specification sheet
├── /spec
│ └── CSP-1.md ← plain-text binary spec
├── /src
│ ├── encode.py ← fingerprint encoder
│ ├── decode.py ← style-conditioned generation
│ └── identify.py ← speaker identification
├── /experiments
│ └── gemma3_12b_results.json ← February 2026 experimental results
└── /examples
  └── sample_identity.fp ← anonymised example fingerprint
Documents
| Document | Format | Description |
|---|---|---|
| Whitepaper | DOCX | 6-section academic paper, arXiv-ready |
| Research Paper | DOCX | Full technical paper with equations + references |
| Conference Poster | PDF | CHI 2026 style research poster |
| Protocol Spec Sheet | PDF | One-page technical specification |
| Presentation | PPTX | 12-slide pitch deck |
| W3C Note | | Submission to W3C AI Agent Protocol CG |
Open Challenges
These are the three open research questions. Collaboration welcome — open an Issue.
- Cross-Model Mapping: translating a .fp fingerprint trained on SBERT embeddings into GPT-4 or other backbone spaces without re-encoding the original conversation.
- CHUNKS Scaling: formal composition rules for the CHUNKS section when threads exceed 500 messages, while preserving the fixed 200KB file size.
- Incentive Design: what makes AI platforms adopt an open standard that reduces their own lock-in?
Status
Early research. Proof-of-concept validated on real data. Open for collaboration.
- Protocol specification (CSP-1 v0.2)
- Proof-of-concept encoder/decoder
- Speaker identification experiment (1,000 trials)
- Multi-model validation (GPT-2, Gemma3:1b, Gemma3:12b)
- Real conversation validation (524 messages)
- Multi-speaker support
- Cross-model mapping
- Public dataset (seeking contributors)
- W3C Community Group submission
Licence
MIT. Open forever.
Contact
Open an Issue for technical questions.
For collaboration or research enquiries: see CONTRIBUTING.md.
File details
Details for the file convoseed_agent-1.1.0.tar.gz.
File metadata
- Download URL: convoseed_agent-1.1.0.tar.gz
- Upload date:
- Size: 23.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 77c586932df949874c182a62a2d921f8f2df1c118ba9d1a4826589df26ec9a94 |
| MD5 | 5a91fa253d0e53ef62718caf6918d5b9 |
| BLAKE2b-256 | 65ee3f1a2cbf4d2fa71aa69b5fc0de48e201dffa6a3568e629db4c4fce0f9dbe |
File details
Details for the file convoseed_agent-1.1.0-py3-none-any.whl.
File metadata
- Download URL: convoseed_agent-1.1.0-py3-none-any.whl
- Upload date:
- Size: 22.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 44ce3b368ea076a4ff0d6359c05ba8ce20f979cbf878470fc2535048a732e7ab |
| MD5 | 5984677c5a97da1ef95fc02cf7b7cab3 |
| BLAKE2b-256 | 0b5ed8e2ee6f3010d0fc74e5efd43c8d33ca0cf77da24391e7e1e4943bb10f0c |