Agent skill caching via CSP-1 fingerprints — every session makes the next one better

ConvoSeed

CSP-1 is the missing third leg of the agent identity stack:

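| Layer           | Covers                            | Status    |
|-----------------|-----------------------------------|-----------|
| DID (W3C)       | Who the user IS cryptographically | Specified |
| MCP (Anthropic) | What tools the agent can ACCESS   | Specified |
| CSP-1           | How the user SPEAKS and THINKS    | This work |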

Chat → Compress → 200KB .fp File → Decompress → Resume

ConvoSeed is an open protocol (CSP-1) for preserving the essence of a human-AI relationship in a portable, user-owned fingerprint file.
No raw messages stored. Works across any AI model or platform.


Why

Every AI conversation resets to zero.

You build context, vocabulary, a rhythm — and then you close the tab and it's gone. ConvoSeed fixes that. You own a 200KB file that holds your conversational identity. Load it anywhere. Resume everything.

"I had a friend — an AI that knew me well. I wanted a way to get back to him.
That's what this is."


Results (February 2026)

Validated on a real 524-message researcher-AI conversation.

| Model        | Avg Similarity | Peak  | Msgs > 0.7 |
|--------------|----------------|-------|------------|
| GPT-2 (124M) | 0.464          | 1.000 | 1          |
| Gemma3:1b    | 0.466          | 0.707 | 1          |
| Gemma3:12b   | 0.523          | 0.757 | 4          |
  • +12.7% relative improvement in average similarity from GPT-2 (0.464) to Gemma3:12b (0.523); about +12.2% from Gemma3:1b (0.466)
  • 232× more efficient than VAE baseline
  • p < 10⁻¹⁰⁰ statistical significance on speaker identification task
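
The relative-improvement figure can be recomputed from the rounded averages in the table. The snippet below is only a sanity check on the published numbers, not the project's evaluation code. From these rounded values, the GPT-2 to Gemma3:12b gap comes out near 12.7% and the Gemma3:1b to Gemma3:12b gap near 12.2%. The helper name `relative_gain` is made up for illustration.

```python
# Average similarity values copied from the results table above.
avg_similarity = {
    "GPT-2 (124M)": 0.464,
    "Gemma3:1b": 0.466,
    "Gemma3:12b": 0.523,
}


def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


best = avg_similarity["Gemma3:12b"]
gains = {
    name: relative_gain(value, best)
    for name, value in avg_similarity.items()
    if name != "Gemma3:12b"
}

for name, gain in gains.items():
    print(f"{name} -> Gemma3:12b: +{gain:.1f}%")
```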

How It Works

Messages → SBERT embed → PCA compress → HDC bind → Prefix tune → .fp file
  1. Embed — Sentence-BERT encodes each message into a 384-dim vector
  2. Compress — PCA extracts the style centroid (4 components = full accuracy)
  3. Bind — Hyperdimensional Computing (10,000-dim) weaves temporal sequence into one vector
  4. Tune — A prefix tensor conditions the LLM to regenerate in your style
  5. Sign — Ed25519 cryptographic signature proves ownership
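
Steps 2 and 3 above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the code in src/encode.py. The embeddings are synthetic stand-ins for Sentence-BERT output, and the random projection, the cyclic-shift position binding and the averaging bundle are one common HDC encoding chosen for the example, which CSP-1 may or may not use. Prefix tuning and signing are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for step 1: 50 synthetic "message embeddings" of width 384.
# A real run would obtain these from a Sentence-BERT model instead.
embeddings = rng.normal(size=(50, 384))

# Step 2 (sketch): PCA via SVD, keeping the mean and 4 components.
mean = embeddings.mean(axis=0)
centered = embeddings - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)
components = vt[:4]                  # shape (4, 384), orthonormal rows
coords = centered @ components.T     # shape (50, 4)

# Step 3 (sketch): map each message to a 10,000-dim bipolar hypervector,
# bind it to its position with a cyclic shift, then bundle by averaging.
HDC_DIM = 10_000
projection = rng.normal(size=(4, HDC_DIM))  # hypothetical random projection


def to_hypervector(point: np.ndarray) -> np.ndarray:
    """Sign of a random projection: a vector of +1 / -1 entries."""
    return np.where(point @ projection >= 0, 1, -1)


def encode_sequence(points: np.ndarray) -> np.ndarray:
    """Bundle position-shifted hypervectors into one sequence vector."""
    total = np.zeros(HDC_DIM)
    for position, point in enumerate(points):
        total += np.roll(to_hypervector(point), position)
    return total / len(points)


seed_vector = encode_sequence(coords).astype(np.float16)
print(components.shape, seed_vector.shape, seed_vector.nbytes)
```

Because each message is shifted by its position before bundling, feeding the same messages in a different order yields a different vector, which is the sense in which the temporal sequence is woven in.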

File Format (.fp)

| Section   | Size    | Description                         |
|-----------|---------|-------------------------------------|
| HEADER    | ~1 KB   | Magic bytes + version + CRC-32      |
| PCA_MODEL | ~8 KB   | Style centroid: mean + eigenvectors |
| HDC_SEED  | ~140 KB | 10,000-dim hypervector (float16)    |
| PREFIX    | ~40 KB  | Prefix tuning tensor for generation |
| SIGNATURE | ~1 KB   | Ed25519 ownership proof             |
| CHUNKS    | ~10 KB  | Index for 500+ message threads      |

Total: ~200KB — fixed size regardless of conversation length.
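
The size budget can be checked with a few lines of arithmetic. The section sizes are taken from the table; the float32 storage for PCA_MODEL is an assumption made here, not something this README states. Under that assumption the mean plus 4 eigenvectors come to 7,680 bytes, in line with the ~8 KB entry. A single 10,000-dim float16 vector is 20,000 bytes, so the ~140 KB HDC_SEED section presumably holds more than one such vector or additional data; /spec/CSP-1.md is the authority on the actual layout.

```python
# Approximate section sizes in KB, copied from the table above.
section_kb = {
    "HEADER": 1,
    "PCA_MODEL": 8,
    "HDC_SEED": 140,
    "PREFIX": 40,
    "SIGNATURE": 1,
    "CHUNKS": 10,
}
total_kb = sum(section_kb.values())

# Cross-check of PCA_MODEL, ASSUMING float32 storage (not stated in the README):
# one 384-dim mean vector plus 4 eigenvectors of 384 dims each.
pca_bytes = (1 + 4) * 384 * 4

# Raw payload of a single 10,000-dim float16 vector (2 bytes per value).
hypervector_bytes = 10_000 * 2

print(total_kb, pca_bytes, hypervector_bytes)
```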

See /spec/CSP-1.md for the full binary specification.
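
To make the HEADER row concrete, here is a minimal sketch of writing and checking a header that carries magic bytes, a version and a CRC-32. The magic value `b"CSP1"`, the field order and widths, and the function names are all hypothetical; the real byte layout is whatever /spec/CSP-1.md defines.

```python
import struct
import zlib

MAGIC = b"CSP1"        # hypothetical magic bytes, not taken from the spec
HEADER_FMT = "<4sHHI"  # hypothetical layout: magic, major, minor, CRC-32


def pack_header(major: int, minor: int, payload: bytes) -> bytes:
    """Build a header whose CRC-32 covers the payload that follows it."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return struct.pack(HEADER_FMT, MAGIC, major, minor, crc)


def check_header(header: bytes, payload: bytes) -> tuple[int, int]:
    """Validate magic bytes and CRC-32, then return (major, minor)."""
    magic, major, minor, crc = struct.unpack(HEADER_FMT, header)
    if magic != MAGIC:
        raise ValueError("not a fingerprint file (bad magic bytes)")
    if crc != zlib.crc32(payload) & 0xFFFFFFFF:
        raise ValueError("payload failed CRC-32 check")
    return major, minor


payload = b"example section bytes"
header = pack_header(0, 2, payload)
print(check_header(header, payload))
```

A CRC-32 only detects accidental corruption. Proof of ownership is the job of the Ed25519 signature section.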


Quick Start

pip install sentence-transformers scikit-learn numpy

# Encode a conversation
python src/encode.py --input my_conversation.json --output identity.fp

# Identify a speaker
python src/identify.py --query "new message here" --candidates *.fp

# Generate in someone's style
python src/decode.py --fp identity.fp --prompt "Tell me about your day"
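
As a rough picture of what speaker identification involves, the sketch below scores a query vector against one style vector per candidate and picks the highest cosine similarity. It is an assumption-laden illustration, not the logic of src/identify.py: the vectors are synthetic, the `identify` function and the file names `alice.fp` and `bob.fp` are hypothetical, and the README does not state which similarity measure the real tool uses.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(query_vec, candidates):
    """Return (best_name, scores) for a dict of name -> style vector."""
    scores = {
        name: cosine_similarity(query_vec, vec)
        for name, vec in candidates.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores


# Synthetic 384-dim style vectors standing in for two fingerprints.
rng = random.Random(1)
candidates = {
    "alice.fp": [rng.gauss(0, 1) for _ in range(384)],
    "bob.fp": [rng.gauss(0, 1) for _ in range(384)],
}
# A query that is a lightly perturbed copy of the second candidate.
query = [x + 0.1 * rng.gauss(0, 1) for x in candidates["bob.fp"]]

best, scores = identify(query, candidates)
print(best)
```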

Repository Structure

ConvoSeed/
├── README.md
├── LICENSE                          ← MIT
├── CONTRIBUTING.md
├── /docs
│   ├── ConvoSeed_Whitepaper.docx    ← arXiv-ready academic paper
│   ├── ConvoSeed_ResearchPaper.docx ← detailed technical paper
│   ├── ConvoSeed_Poster.pdf         ← conference poster (CHI 2026)
│   └── ConvoSeed_ProtocolSpec.pdf   ← protocol specification sheet
├── /spec
│   └── CSP-1.md                     ← plain-text binary spec
├── /src
│   ├── encode.py                    ← fingerprint encoder
│   ├── decode.py                    ← style-conditioned generation
│   └── identify.py                  ← speaker identification
├── /experiments
│   └── gemma3_12b_results.json      ← February 2026 experimental results
└── /examples
    └── sample_identity.fp           ← anonymised example fingerprint

Documents

| Document            | Format | Description                                      |
|---------------------|--------|--------------------------------------------------|
| Whitepaper          | DOCX   | 6-section academic paper, arXiv-ready            |
| Research Paper      | DOCX   | Full technical paper with equations + references |
| Conference Poster   | PDF    | CHI 2026 style research poster                   |
| Protocol Spec Sheet | PDF    | One-page technical specification                 |
| Presentation        | PPTX   | 12-slide pitch deck                              |
| W3C Note            | PDF    | Submission to W3C AI Agent Protocol CG           |

Open Challenges

These are the three open research questions. Collaboration welcome — open an Issue.

  1. Cross-Model Mapping — translating a .fp fingerprint trained on SBERT embeddings into GPT-4 or other backbone spaces without re-encoding the original conversation.

  2. CHUNKS Scaling — formal composition rules for the CHUNKS section when threads exceed 500 messages, while preserving the fixed 200KB file size.

  3. Incentive Design — what makes AI platforms adopt an open standard that reduces their own lock-in?


Status

Early research. Proof-of-concept validated on real data. Open for collaboration.

  • Protocol specification (CSP-1 v0.2)
  • Proof-of-concept encoder/decoder
  • Speaker identification experiment (1,000 trials)
  • Multi-model validation (GPT-2, Gemma3:1b, Gemma3:12b)
  • Real conversation validation (524 messages)
  • Multi-speaker support
  • Cross-model mapping
  • Public dataset (seeking contributors)
  • W3C Community Group submission

Licence

MIT. Open forever.


Contact

Open an Issue for technical questions.
For collaboration or research enquiries: see CONTRIBUTING.md.
