A simple utility to generate unique, human-readable identifiers.

id-phrase

Generate human-readable, LLM-friendly identifier phrases like sharp-spirited-existence-052e.

Installation

pip install id-phrase

Or from source:

git clone https://github.com/afourney/id-phrase
cd id-phrase
uv venv --python python3.11
source .venv/bin/activate
uv sync --all-extras

Usage

from id_phrase import generate_id_phrase

print(generate_id_phrase())
# → sharp-spirited-existence-052e

# Override the template:
print(generate_id_phrase("A-N-B"))
# → sharp-existence-a3

# Or supply a list — one is chosen uniformly at random per call:
print(generate_id_phrase(["A-A-N", "A-V-N"]))
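Because the suffix is random, two calls can in principle return the same phrase. A minimal sketch of a retry wrapper follows; `unique_phrase`, `taken`, and `max_attempts` are hypothetical names for illustration and are not part of id-phrase. The generator is passed in as a callable, so `generate_id_phrase` can be supplied directly.

```python
def unique_phrase(generate, taken, max_attempts=10):
    """Call `generate` until it yields a phrase not already in `taken`.

    `generate` is any zero-argument callable returning a string
    (for example, generate_id_phrase). `taken` is a set of phrases
    already in use; the accepted phrase is added to it.
    """
    for _ in range(max_attempts):
        candidate = generate()
        if candidate not in taken:
            taken.add(candidate)
            return candidate
    raise RuntimeError(f"no unused phrase found in {max_attempts} attempts")
```

Typical use would be `unique_phrase(generate_id_phrase, existing_ids)`, where `existing_ids` is whatever set of identifiers the application already tracks.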

Why?

LLMs struggle to accurately copy high-entropy strings like UUIDs and hex identifiers. When asked to reference or reproduce an identifier like a90d0d7d-9c5a-44de-8d3c-5b0da661de7c, models make typos, truncate values, or hallucinate entirely different IDs.

The root cause is tokenization: a UUID consumes ~23 tokens because hex characters and hyphens don't align with the natural language patterns tokenizers are optimized for. Word-based identifiers, by contrast, tend to split into a small number of familiar word tokens, which models reproduce far more reliably.

This library generates identifiers in the style of sharp-spirited-existence-052e — a short random hex suffix for uniqueness, combined with natural English words that LLMs handle well.

The evidence

  • BAML Blog — Using UUIDs in prompts is bad (Oct 2025): Benchmarked Claude Haiku on aggregation tasks with 200 items across 100 class IDs. The UUID approach produced 29–68 errors per run; the integer-remapped approach produced only 5–7 errors. Opus 4 reached 100% accuracy with integer IDs but only 80% with UUIDs.
  • Nikhil Verma — LLMs as Unreliable Narrators: Dealing with UUID Hallucination (Nov 2025): Recommends using human-readable identifiers like titles instead of UUIDs in enums, noting that "UUIDs are harder for models to reproduce accurately because they're random. Titles are memorable patterns." Also available on DEV Community.
  • Floris Fok — The Hidden Cost of UUIDs in AI Prompts (Prosus AI Tech Blog): Shows that a single UUID consumes ~23 tokens due to tokenizer inefficiency with hex strings and hyphens, and demonstrates up to 95.6% token reduction by replacing UUIDs with compact identifiers.

The common recommendation across all three sources is the same: if you can avoid giving the model a UUID, do so. id-phrase lets you generate identifiers that are unique enough for practical use and reliable enough for LLMs to work with accurately.
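The remapping idea from the first source can be sketched as a pair of lookup tables: swap each UUID for a short alias before building the prompt, then translate the model's answer back. This is an illustration only; `alias_ids` is a hypothetical helper, the model response is simulated, and integer aliases are used here to mirror the benchmark. Phrases from generate_id_phrase() could be used as aliases in the same way.

```python
import uuid


def alias_ids(real_ids):
    """Map each distinct real ID to a short alias and back.

    Returns (to_alias, to_real). Aliases are "1", "2", ... in
    first-seen order; duplicates in real_ids share one alias.
    """
    to_alias = {}
    to_real = {}
    for n, real in enumerate(dict.fromkeys(real_ids), start=1):
        alias = str(n)
        to_alias[real] = alias
        to_real[alias] = real
    return to_alias, to_real


records = [str(uuid.uuid4()) for _ in range(3)]
to_alias, to_real = alias_ids(records)

# Build the prompt from aliases, so the model never sees a UUID.
prompt_lines = [f"item {to_alias[r]}" for r in records]

# Simulated model answer (an assumption for this sketch); map it back.
model_answer = "2"
resolved = to_real[model_answer]
```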

Templates

generate_id_phrase() accepts a template string (or a list of templates to pick from at random). Each letter in the template expands to a random component; any other character is kept verbatim as a separator.

Letter  Expands to
A       a random adjective (e.g. corporate)
V       a random verb in present-participle form (e.g. asking)
N       a random noun (e.g. poetry)
B       a random byte as two hex digits (00 to ff)

So A-A-N-BB expands to {adjective}-{adjective}-{noun}-{hex4} — e.g. sharp-spirited-existence-052e. Use multiple Bs back-to-back to widen the hex suffix (BB = 4 hex chars / 2 bytes, BBB = 6, etc.).
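The expansion rule can be sketched in a few lines. This is not the library's implementation: the word lists below are tiny stand-ins, `expand_template` is a hypothetical name, and the use of the `secrets` module as the randomness source is an assumption.

```python
import secrets

# Tiny stand-in word lists (assumption); the real lists are presumably much larger.
WORDS = {
    "A": ["sharp", "spirited", "corporate"],
    "V": ["asking", "running"],
    "N": ["existence", "poetry"],
}


def expand_template(template: str) -> str:
    """Expand A/V/N/B letters; copy every other character verbatim."""
    parts = []
    for ch in template:
        if ch in WORDS:
            parts.append(secrets.choice(WORDS[ch]))
        elif ch == "B":
            parts.append(secrets.token_hex(1))  # one random byte, two hex digits
        else:
            parts.append(ch)  # separator, kept as-is
    return "".join(parts)


print(expand_template("A-A-N-BB"))
```

Each B is expanded independently, which is why adjacent Bs concatenate into a wider hex suffix.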

The default is ["A-A-N-BB", "A-V-N-BB"]: one of the two is chosen uniformly at random per call, which roughly doubles the phrase space at no readability cost. The hex suffix makes collisions unlikely (it is random, so it does not strictly guarantee uniqueness) while keeping the human-readable words as the primary identifier the LLM interacts with.
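To judge how wide the suffix needs to be, the phrase space and a birthday-bound collision estimate can be computed directly. The word-list sizes below are placeholders, not the library's actual counts, and `phrase_space` and `collision_probability` are hypothetical helpers. The estimate assumes every phrase is equally likely.

```python
import math


def phrase_space(template, word_counts):
    """Count distinct outputs of one template for the given word-list sizes."""
    total = 1
    for ch in template:
        if ch == "B":
            total *= 256  # one byte
        elif ch in word_counts:
            total *= word_counts[ch]
    return total


def collision_probability(n_ids, space):
    """Birthday approximation: 1 - exp(-n(n-1) / (2 * space))."""
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


# Placeholder sizes (assumption); substitute the real word-list lengths.
ASSUMED_COUNTS = {"A": 1000, "V": 1000, "N": 1000}

space = phrase_space("A-A-N-BB", ASSUMED_COUNTS)
print(f"{space:,} possible phrases for A-A-N-BB")
print(f"collision chance among 1,000,000 IDs: {collision_probability(1_000_000, space):.4%}")
```

With two templates in the default list, the combined space is at most the sum of the two per-template counts, so this per-template figure is a conservative starting point.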

License

MIT
