Generate AI-agent VOICE.md files from website copy and CTAs.

site2voice

Generate VOICE.md from any website.

site2voice reads website copy and writes a small Markdown brief that tells an AI coding agent how the site sounds: headings, CTAs, navigation labels, sentence shape, repeated vocabulary, and claim boundaries.

pipx install site2voice

site2voice https://example.com --out VOICE.md

From a repo clone, try the included example fixtures:

site2voice examples/saas-home.html --format json
site2voice bench examples/editorial-home.html examples/before-copy.md examples/after-copy.md

Why

DESIGN.md helps agents stop guessing visual style. VOICE.md helps them stop guessing copy style.

Drop the generated file into a project and tell the agent:

Use @VOICE.md for landing-page copy, headings, CTAs, and UI microcopy.

Output

# VOICE.md

## Voice Summary

- Overall tone: explanatory, action-oriented, trust-forward.
- Sentence shape: about 20.4 words per sentence.
- Main vocabulary: `teams`, `security`, `pricing`, `launch`.
- Common CTAs: `Start free`, `Book a demo`, `See pricing`.

## Agent Rules

- Start with a concrete user outcome before describing implementation details.
- Prefer short active sentences and visible verbs from the CTA list.
- Do not invent compliance, security, customer, or performance claims.

The real output also includes a small style fingerprint for heading length, paragraph rhythm, CTA shape, CTA verbs, and lexical variety.
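As a rough illustration of what such a fingerprint could contain, here is a minimal sketch using only the standard library. The function name `style_fingerprint`, the dictionary keys, and the choice of a type-token ratio for lexical variety are assumptions made for this example, not site2voice's actual implementation or output schema.

```python
import re
from statistics import mean


def words(text: str) -> list[str]:
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def style_fingerprint(headings: list[str], paragraphs: list[str], ctas: list[str]) -> dict:
    """Summarize heading length, paragraph rhythm, CTA shape, CTA verbs, and lexical variety.

    Hypothetical helper: key names and formulas are illustrative only.
    """
    heading_lengths = [len(words(h)) for h in headings]
    paragraph_lengths = [len(words(p)) for p in paragraphs]
    cta_lengths = [len(words(c)) for c in ctas]
    all_words = [w for p in paragraphs for w in words(p)]
    # Treat the first word of each CTA as its verb (an assumption).
    cta_verbs = sorted({words(c)[0] for c in ctas if words(c)})
    return {
        "avg_heading_words": round(mean(heading_lengths), 1) if heading_lengths else 0.0,
        "avg_paragraph_words": round(mean(paragraph_lengths), 1) if paragraph_lengths else 0.0,
        "avg_cta_words": round(mean(cta_lengths), 1) if cta_lengths else 0.0,
        "cta_verbs": cta_verbs,
        # Type-token ratio: unique words divided by total words.
        "lexical_variety": round(len(set(all_words)) / len(all_words), 2) if all_words else 0.0,
    }


if __name__ == "__main__":
    print(style_fingerprint(
        headings=["Ship faster with less risk", "Pricing"],
        paragraphs=["Teams launch faster.", "Teams stay secure."],
        ctas=["Start free", "Book a demo"],
    ))
```

A type-token ratio is sensitive to text length, so a real tool would probably normalize it or compute it over a fixed-size window.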

What It Does

- Reads a URL or local HTML file.
- Extracts title, meta description, headings, links, buttons, and paragraphs.
- Finds CTA candidates from short action-led links/buttons.
- Measures average sentence length.
- Extracts a compact style fingerprint: heading shape, paragraph rhythm, CTA shape, CTA verbs, and lexical variety.
- Builds a repeated-vocabulary lexicon.
- Writes Markdown or JSON.
- Benchmarks candidate copy against a source voice profile.
- Gates against unsupported claims and copied spans.
- Uses only the Python standard library.
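The extraction steps above can be sketched with `html.parser`, `re`, and `collections.Counter` from the standard library. This is a minimal illustration under stated assumptions, not the package's code: the class `CopyExtractor`, the helper names, the `ACTION_VERBS` and `STOPWORDS` lists, and the thresholds are all hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3"}
TRACKED = HEADINGS | {"p", "a", "button"}
# Hypothetical word lists; the real tool may use different ones.
ACTION_VERBS = {"start", "book", "see", "get", "try", "join", "download"}
STOPWORDS = {"with", "that", "this", "your", "from", "have"}


class CopyExtractor(HTMLParser):
    """Collect text from headings, paragraphs, links, and buttons.

    Simplification: a tracked tag nested inside another tracked tag
    (a link inside a paragraph) is folded into the outer element.
    """

    def __init__(self) -> None:
        super().__init__()
        self.headings: list[str] = []
        self.paragraphs: list[str] = []
        self.actions: list[str] = []
        self._tag: str | None = None
        self._buf: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in TRACKED and self._tag is None:
            self._tag, self._buf = tag, []

    def handle_data(self, data):
        if self._tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        text = " ".join("".join(self._buf).split())  # collapse whitespace
        self._tag = None
        if not text:
            return
        if tag in HEADINGS:
            self.headings.append(text)
        elif tag == "p":
            self.paragraphs.append(text)
        else:
            self.actions.append(text)


def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9']+", text.lower())


def cta_candidates(actions: list[str], max_words: int = 4) -> list[str]:
    """Keep short link/button labels that start with an action verb."""
    seen, found = set(), []
    for label in actions:
        parts = tokens(label)
        if parts and len(parts) <= max_words and parts[0] in ACTION_VERBS and label not in seen:
            seen.add(label)
            found.append(label)
    return found


def average_sentence_length(paragraphs: list[str]) -> float:
    """Mean words per sentence, splitting on ., !, or ? followed by whitespace."""
    lengths = []
    for paragraph in paragraphs:
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            count = len(tokens(sentence))
            if count:
                lengths.append(count)
    return round(sum(lengths) / len(lengths), 1) if lengths else 0.0


def repeated_vocabulary(paragraphs: list[str], min_count: int = 2, top: int = 10) -> list[str]:
    """Words longer than three letters that repeat at least min_count times."""
    counts = Counter(
        w for p in paragraphs for w in tokens(p) if len(w) > 3 and w not in STOPWORDS
    )
    return [w for w, c in counts.most_common(top) if c >= min_count]


if __name__ == "__main__":
    sample = """
    <h1>Launch with confidence</h1>
    <p>Teams launch faster with clear pricing. Teams keep security simple.</p>
    <a href="/signup">Start free</a> <a href="/about">About</a>
    <button>Book a demo</button>
    """
    extractor = CopyExtractor()
    extractor.feed(sample)
    print(cta_candidates(extractor.actions))
    print(average_sentence_length(extractor.paragraphs))
    print(repeated_vocabulary(extractor.paragraphs))
```

The verb list is what keeps a plain navigation label such as `About` out of the CTA candidates. A production version would also need to handle nested links, title and meta tags, and fetching from a URL.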

Benchmark

site2voice bench compares candidate copy against measurable source signals: sentence length, vocabulary overlap, CTA shape, tone labels, heading shape, claim boundaries, and copy safety.

site2voice bench examples/editorial-home.html \
  examples/before-copy.md \
  examples/after-copy.md \
  --out examples/editorial-benchmark.md

Candidate	Result	Overall	Lexicon	Copy safety
`after-copy`	PASS	83.8	70.0	93.2
`before-copy`	FAIL	36.6	0.0	100.0

The benchmark rewards measurable voice alignment without rewarding verbatim copying.
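To make that trade-off concrete, here is a minimal sketch of two such signals: a lexicon score that counts how many source terms a candidate reuses, and a copy-safety score that penalizes shared word n-grams. The function names, the 5-word span size, and the 0 to 100 scaling are assumptions for illustration. The formulas site2voice actually uses are not documented here and may differ.

```python
import re


def words(text: str) -> list[str]:
    return re.findall(r"[a-z0-9']+", text.lower())


def lexicon_score(source_lexicon: list[str], candidate: str) -> float:
    """Percentage of source lexicon terms that appear in the candidate."""
    if not source_lexicon:
        return 0.0
    candidate_words = set(words(candidate))
    hits = sum(1 for term in source_lexicon if term in candidate_words)
    return round(100.0 * hits / len(source_lexicon), 1)


def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    """All contiguous n-word spans; empty when the text is shorter than n."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def copy_safety(source: str, candidate: str, n: int = 5) -> float:
    """100.0 when no n-word span of the candidate appears verbatim in the source."""
    candidate_spans = ngrams(words(candidate), n)
    if not candidate_spans:
        return 100.0
    copied = candidate_spans & ngrams(words(source), n)
    return round(100.0 * (1 - len(copied) / len(candidate_spans)), 1)


if __name__ == "__main__":
    source = "Teams launch faster with clear pricing and simple security."
    lexicon = ["teams", "security", "pricing", "launch"]
    rewrite = "Launch your next release with pricing your teams can read."
    for label, text in (("rewrite", rewrite), ("verbatim", source)):
        print(label, lexicon_score(lexicon, text), copy_safety(source, text))
```

In this sketch a verbatim copy gets the best lexicon score and the worst copy-safety score, so an overall score that combines the two cannot be maximized by copying the source.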

What It Is Not

- Not an official brand guideline.
- Not a DESIGN.md visual-token extractor.
- Not a crawler for private pages or authenticated apps.
- Not an LLM prompt that copies a site's prose.

Develop

python3 -m pip install -e .
make test
make bench
site2voice examples/saas-home.html --out examples/saas-VOICE.md

License

MIT

Project details

Release history

0.5.6

May 21, 2026

0.5.5

May 21, 2026

0.5.4

May 20, 2026

0.5.3

May 20, 2026

0.5.2

May 20, 2026

0.5.1

May 20, 2026

0.5.0

May 20, 2026

0.4.0

May 20, 2026

0.3.0

May 20, 2026

This version

0.2.1

May 20, 2026

Download files

Source Distribution

site2voice-0.2.1.tar.gz (20.5 kB)

Uploaded May 20, 2026 Source

Built Distribution

site2voice-0.2.1-py3-none-any.whl (12.6 kB)

Uploaded May 20, 2026 Python 3

File details

Details for the file site2voice-0.2.1.tar.gz.

File metadata

Download URL: site2voice-0.2.1.tar.gz
Upload date: May 20, 2026
Size: 20.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for site2voice-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`37391d98e545e97e74b13aff6fe958f9420a6adda7704cb6b8f8c5f6426f236a`
MD5	`afdd5beb658e8219c8e9452e5c7563c6`
BLAKE2b-256	`1a6f2da776b5e2646ad37f662083af19330c955b27f7478c47cd33ac9148ed39`

Provenance

The following attestation bundles were made for site2voice-0.2.1.tar.gz:

Publisher: publish.yml on SihyeonJeon/site2voice

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: site2voice-0.2.1.tar.gz
- Subject digest: 37391d98e545e97e74b13aff6fe958f9420a6adda7704cb6b8f8c5f6426f236a
- Sigstore transparency entry: 1579251884
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: SihyeonJeon/site2voice@32ddcac39942b27091ccfd86eb663b8dec2b659d
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/SihyeonJeon
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@32ddcac39942b27091ccfd86eb663b8dec2b659d
- Trigger Event: release

File details

Details for the file site2voice-0.2.1-py3-none-any.whl.

File metadata

Download URL: site2voice-0.2.1-py3-none-any.whl
Upload date: May 20, 2026
Size: 12.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for site2voice-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`95fe10a89819e7a879f7fd0354b1b41f8e53afa3eda0ab19ae04fc5d5d060ab2`
MD5	`f363c31f340a5e9fa6114059aba830d2`
BLAKE2b-256	`aaa6befd40371f3018f8b087a4683960d2dcd336f33ea85ad8f00825f4c083fd`

Provenance

The following attestation bundles were made for site2voice-0.2.1-py3-none-any.whl:

Publisher: publish.yml on SihyeonJeon/site2voice

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: site2voice-0.2.1-py3-none-any.whl
- Subject digest: 95fe10a89819e7a879f7fd0354b1b41f8e53afa3eda0ab19ae04fc5d5d060ab2
- Sigstore transparency entry: 1579252256
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: SihyeonJeon/site2voice@32ddcac39942b27091ccfd86eb663b8dec2b659d
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/SihyeonJeon
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@32ddcac39942b27091ccfd86eb663b8dec2b659d
- Trigger Event: release
