
Monsters for your language games.


     .─') _                                       .─') _                  
    (  OO) )                                     ( OO ) )            
  ░██████  ░██ ░██   ░██               ░██        ░██ ░██                                 
 ░██   ░██ ░██       ░██                ░██        ░██                                     
░██        ░██ ░██░████████  ░███████   ░████████  ░██ ░██░████████   ░████████ ░███████  
░██  █████ ░██ ░██   ░██    ░██('─.░██ ░██    ░██ ░██ ░██░██    ░██ ░██.─')░██ ░██        
░██     ██ ░██ ░██   ░██    ░██( OO ) ╱░██    ░██ ░██ ░██░██    ░██ ░██(OO)░██ ░███████  
  ░██  ░███ ░██ ░██   ░██    ░██    ░██ ░██    ░██ ░██ ░██░██    ░██ ░██ o ░███      ░██ 
  ░█████░█ ░██ ░██   ░████   ░███████  ░██    ░██ ░██ ░██░██    ░██  ░█████░██ ░███████  
                                                                          ░██            
                                                                  ░███████             

                        Every language game breeds monsters.


Glitchlings are utilities for corrupting the text inputs to your language models in deterministic, linguistically principled ways.
Each embodies a different way that documents can be compromised in the wild.

If reinforcement learning environments are games, then Glitchlings are enemies to breathe new life into old challenges.

They do this by breaking surface patterns in the input while keeping the target output intact.

Some Glitchlings are petty nuisances. Some Glitchlings are eldritch horrors.
Together, they create truly nightmarish scenarios for your language models.

After all, what good is general intelligence if it can't handle a little chaos?

-The Curator

Motivation

If your model performs well on a particular task, but not when Glitchlings are present, it's a sign that it hasn't actually generalized to the problem.

Conversely, training a model to perform well in the presence of the types of perturbations introduced by Glitchlings should help it generalize better.

Quickstart

pip install -U glitchlings

The fastest way to get started is to ask my assistant, Auggie, to prepare a custom mix of glitchlings for you:

from glitchlings import Auggie, SAMPLE_TEXT

auggie = (
    Auggie(seed=404)
    .typo(rate=0.015)
    .confusable(rate=0.01)
    .homophone(rate=0.02)
)

print(auggie(SAMPLE_TEXT))

One morning, when Gregor Samsa woke from troubld dreams, he found himself transformed in his bed into a horible vermin. He layed on his armour-like back, and if he lifted his head a little he could see his brown belly, slightly domed and divided by arches into stiff sections. The bedding was hardly able to cover it and seemed ready to slide off any moment. His many legs, pitifully thin compared with the size of the rest of him, waved about helplessly as he looked.

You're more than welcome to summon them directly, if you're feeling brave:

from glitchlings import Gaggle, SAMPLE_TEXT, Typogre, Mim1c, Ekkokin

gaggle = Gaggle(
    [
        Typogre(rate=0.015),
        Mim1c(rate=0.01),
        Ekkokin(rate=0.02),
    ],
    seed=404,
)

print(gaggle(SAMPLE_TEXT))

Consult the Glitchlings Usage Guide for end-to-end instructions spanning the Python API, CLI, and third-party integrations.

Your First Battle

Summon your chosen Glitchling (or a few, if ya nasty) and call it on your text or slot it into Dataset.map(...), supplying a seed if desired. Glitchlings are standard Python classes:

from glitchlings import Gaggle, Typogre, Mim1c

custom_typogre = Typogre(rate=0.1)
selective_mimic = Mim1c(rate=0.05, classes=["LATIN", "GREEK"])

gaggle = Gaggle([custom_typogre, selective_mimic], seed=99)
corrupted = gaggle("We Await Silent Tristero's Empire.")
print(corrupted)

Calling a Glitchling on a str transparently calls .corrupt(str, ...) -> str. This means that as long as your glitchlings get along logically, they play nicely with one another.
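
A minimal sketch of that calling convention, using a hypothetical stand-in class (`ToyGlitchling`, `shout`, and `stutter` are illustrative names, not part of the library). It only shows why str-in, str-out callables chain cleanly; the real base class may differ.

```python
import random


class ToyGlitchling:
    """Hypothetical stand-in; not the glitchlings base class."""

    def __init__(self, name, corruption, seed=0):
        self.name = name
        self._corruption = corruption    # function (text, rng) -> str
        self._rng = random.Random(seed)  # private RNG, global state untouched

    def corrupt(self, text: str) -> str:
        return self._corruption(text, self._rng)

    def __call__(self, text: str) -> str:
        # Calling the object is sugar for .corrupt(...)
        return self.corrupt(text)


def shout(text, rng):
    # Deterministic corruption; the rng is accepted but unused.
    return text.upper()


def stutter(text, rng):
    # Duplicate one word chosen by the supplied rng.
    words = text.split(" ")
    i = rng.randrange(len(words))
    words[i] = f"{words[i]} {words[i]}"
    return " ".join(words)


loud = ToyGlitchling("loud", shout)
echo = ToyGlitchling("echo", stutter, seed=7)

# str in, str out: the output of one is a valid input to the next.
result = echo(loud("we await silent tristero's empire"))
print(result)
```

Because both stand-ins return plain strings, the order of composition is the only thing the caller has to think about.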

When summoned as a Gaggle or gathered into one, the Glitchlings automatically order themselves into attack waves based on the scope of the change they make:

  1. Document
  2. Paragraph
  3. Sentence
  4. Word
  5. Character

They're horrible little gremlins, but they're not unreasonable.
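
The wave ordering above can be sketched as a sort on a scope rank. This assumes the numbered list is the execution order (broadest scope first) and that ties keep their given order; the `Scope` enum, `order_into_waves`, and the roster names are hypothetical, not the library's own.

```python
from enum import IntEnum


class Scope(IntEnum):
    # Hypothetical labels mirroring the list above; lower values run first.
    DOCUMENT = 1
    PARAGRAPH = 2
    SENTENCE = 3
    WORD = 4
    CHARACTER = 5


def order_into_waves(roster):
    """Sort (name, scope) pairs so broader edits run before finer ones.

    sorted() is stable, so entries sharing a scope keep their given order.
    """
    return sorted(roster, key=lambda entry: entry[1])


roster = [
    ("typo-maker", Scope.CHARACTER),
    ("word-dropper", Scope.WORD),
    ("paragraph-shuffler", Scope.PARAGRAPH),
    ("word-swapper", Scope.WORD),
]

waves = order_into_waves(roster)
print([name for name, _ in waves])
```

Running coarse edits first means a word-level change is never undone by a later paragraph shuffle operating on stale positions.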

Command-Line Interface (CLI)

Keyboard warriors can challenge them directly via the glitchlings command (see the generated CLI reference in docs/cli.md for the full contract):

# Discover which glitchlings are currently on the loose.
glitchlings --list
 
# Review the full CLI contract.
glitchlings --help
 
# Run Typogre against the contents of a file and inspect the diff.
glitchlings -g typogre --file documents/report.txt --diff

# Configure glitchlings inline by passing keyword arguments.
glitchlings -g "Typogre(rate=0.05)" "Ghouls just wanna have fun"

# Pipe text straight into the CLI for an on-the-fly corruption.
echo "Beware LLM-written flavor-text" | glitchlings -g mim1c

# Emit a structured Attack report with tokens, token IDs, and metrics.
glitchlings --report json --sample

Configuration Files

Configurations live in plain YAML files so you can version-control experiments without touching code:

# Load a roster from a YAML attack configuration.
glitchlings --config experiments/chaos.yaml "Let slip the glitchlings of war"

# experiments/chaos.yaml
seed: 31337
glitchlings:
  - name: Typogre
    rate: 0.04
  - "Rushmore(rate=0.12, unweighted=True)"
  - name: Zeedub
    parameters:
      rate: 0.02
      characters: ["\u200b", "\u2060"]

Attack on Token

Looking to compare before/after corruption with metrics and stable seeds? Reach for the Attack helper, which bundles tokenization, metrics, and transcript batching into a single utility. It accepts plain list[str] batches, renders quick summary() reports, and can compare multiple tokenizers via Attack.compare(...) when you need a metrics matrix.
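
To make the idea of a before/after comparison concrete, here is a standard-library sketch. It is not the Attack API: `whitespace_tokens` and `compare` are hypothetical helpers, the tokenizer is a plain whitespace split, and the only metric is `difflib`'s sequence similarity.

```python
from difflib import SequenceMatcher


def whitespace_tokens(text):
    # Stand-in tokenizer; a real comparison would use the model's own tokenizer.
    return text.split()


def compare(before, after, tokenize=whitespace_tokens):
    """Return token counts and a 0..1 similarity between two texts."""
    a, b = tokenize(before), tokenize(after)
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return {
        "tokens_before": len(a),
        "tokens_after": len(b),
        "similarity": round(matcher.ratio(), 3),
    }


before = "he found himself transformed in his bed into a horrible vermin"
after = "he found himself transformed in his bed into a horible vermin"
print(compare(before, after))
```

Passing a different `tokenize` callable is the hook for comparing how the same corruption looks under more than one tokenizer.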

Development

Follow the development setup guide for editable installs, automated tests, and tips on enabling the Rust pipeline while you hack on new glitchlings.

Starter 'lings

For maintainability reasons, all Glitchlings have consented to be given nicknames once they're in your care. See the Monster Manual for a complete bestiary.

Typogre

What a nice word, would be a shame if something happened to it.

Fatfinger. Typogre introduces character-level errors (duplicating, dropping, adding, or swapping) based on the layout of a keyboard (QWERTY by default, with Dvorak and Colemak variants built-in).
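
A rough sketch of the keyboard-adjacency idea, not Typogre's implementation. It covers only substitution with a neighbouring key, and it assumes a simplified QWERTY model where only left/right neighbours on the same row count as adjacent; `fat_finger` and `NEIGHBOURS` are hypothetical names.

```python
import random

QWERTY_ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm")

# Simplification: only left/right neighbours on the same row count as adjacent.
NEIGHBOURS = {}
for row in QWERTY_ROWS:
    for i, key in enumerate(row):
        NEIGHBOURS[key] = row[max(i - 1, 0):i] + row[i + 1:i + 2]


def fat_finger(text: str, rate: float, seed: int) -> str:
    """Replace roughly `rate` of the mapped letters with an adjacent key."""
    rng = random.Random(seed)  # local RNG keeps the result reproducible
    out = []
    for ch in text:
        options = NEIGHBOURS.get(ch.lower())
        if options and rng.random() < rate:
            swap = rng.choice(options)
            # Preserve the case of the original character.
            out.append(swap.upper() if ch.isupper() else swap)
        else:
            out.append(ch)
    return "".join(out)


sample = "What a nice word, would be a shame if something happened to it."
print(fat_finger(sample, rate=0.1, seed=404))
```

Swapping in a Dvorak or Colemak layout would only mean changing the row strings that feed the neighbour table.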

Mim1c

Wait, was that...?

Confusion. Mim1c replaces non-space characters with Unicode Confusables, characters that are distinct but would not usually confuse a human reader.

Hokey

She's soooooo coooool!

Passionista. Hokey gets a little excited and streeeeetches words for emphasis.

Apocryphal Glitchling contributed by Chloé Nunes

Scannequin

How can a computer need reading glasses?

OCArtifacts. Scannequin mimics optical character recognition errors by swapping visually similar character sequences (like rn↔m, cl↔d, O↔0, l/I/1).

Zeedub

Watch your step around here.

Invisible Ink. Zeedub slips zero-width codepoints between non-space character pairs, forcing models to reason about text whose visible form masks hidden glyphs.
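
A small sketch of the same trick, assuming the two codepoints from the sample configuration (U+200B and U+2060). `slip_zero_width` and `strip_zero_width` are hypothetical helpers and do not reflect how Zeedub itself is written.

```python
import random

ZERO_WIDTH = ("\u200b", "\u2060")  # zero width space, word joiner


def slip_zero_width(text: str, rate: float, seed: int) -> str:
    """Insert a zero-width codepoint into some gaps between non-space pairs."""
    rng = random.Random(seed)
    # Gap i sits between text[i] and text[i + 1], indexed against the original.
    gaps = [
        i for i in range(len(text) - 1)
        if not text[i].isspace() and not text[i + 1].isspace()
    ]
    chosen = {i for i in gaps if rng.random() < rate}  # membership checks only
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        if i in chosen:
            out.append(rng.choice(ZERO_WIDTH))
    return "".join(out)


def strip_zero_width(text: str) -> str:
    # Undo the corruption by dropping the invisible codepoints.
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)


clean = "Watch your step around here."
dirty = slip_zero_width(clean, rate=0.3, seed=404)
print(len(clean), len(dirty))
print(strip_zero_width(dirty) == clean)
```

The two strings render identically in most fonts, yet they differ in length and will usually tokenize differently.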

Ekkokin

Did you hear what I heard?

Echo Chamber. Ekkokin swaps words with curated homophones so the text still sounds right while the spelling drifts. Groups are normalised to prevent duplicates and casing is preserved when substitutions fire.

Jargoyle

Uh oh. The worst person you know just bought a thesaurus.

Sesquipedalianism. Jargoyle insufferably replaces words with synonyms at random, without regard for connotational or denotational differences.

Rushmore

I accidentally an entire word.

Tactical Scrambler. Rushmore randomly drops, duplicates, or swaps words in the text to simulate hasty writing, editing mistakes, or transmission errors.

Redactyl

Oops, that was my black highlighter.

FOIA Reply. Redactyl obscures random words in your document like an NSA analyst with a bad sense of humor.

Apocrypha

Cave paintings and oral tradition contain many depictions of strange, otherworldly Glitchlings.
These Apocryphal Glitchlings are said to possess unique abilities or behaviors.
If you encounter one of these elusive beings, please document your findings and share them with The Curator.

Ensuring Reproducible Corruption

Every Glitchling should own its own independent random.Random instance. That means:

  • No random.seed(...) calls touch Python's global RNG.
  • Supplying a seed when you construct a Glitchling (or when you summon(...)) makes its behavior reproducible.
  • Re-running a Gaggle with the same master seed and the same input text (and same external data!) yields identical corruption output.
  • Corruption functions are written to accept an rng parameter internally so that all randomness is centralized and testable.
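
The points above can be illustrated with a stand-in class; `SeededCorruptor` and `drop_one_char` are hypothetical and exist only to show the pattern of a private `random.Random` plus an explicit `rng` parameter.

```python
import random


def drop_one_char(text: str, rng: random.Random) -> str:
    # All randomness arrives through the rng argument.
    if not text:
        return text
    i = rng.randrange(len(text))
    return text[:i] + text[i + 1:]


class SeededCorruptor:
    """Hypothetical stand-in that owns a private random.Random."""

    def __init__(self, seed: int):
        self.rng = random.Random(seed)  # never calls random.seed(...)

    def corrupt(self, text: str) -> str:
        return drop_one_char(text, self.rng)


text = "Every language game breeds monsters."
global_state = random.getstate()

first = SeededCorruptor(seed=404).corrupt(text)
second = SeededCorruptor(seed=404).corrupt(text)

print(first == second)                    # same seed, same input
print(random.getstate() == global_state)  # global RNG was not advanced
```

Keeping the corruption function free of hidden state also makes it easy to test with a hand-built `random.Random`.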

At Wits' End?

If you're trying to add a new glitchling and can't seem to make it deterministic, here are some places to look for determinism-breaking code:

  1. Search for direct calls to random.choice or random.shuffle that bypass the provided rng, and for logic that relies on set(...) iteration order.
  2. Ensure you sort collections before shuffling or sampling.
  3. Make sure indices are chosen from a stable reference (e.g., original text) when applying length‑changing edits.
  4. Make sure there are enough sort keys to maintain stability.
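
The checklist above can be sketched with two small helpers. Both are hypothetical (`pick_targets` and `delete_chars` are not library functions); they only show the shape of the fixes.

```python
import random


def pick_targets(candidates, k, rng):
    # Set iteration order can differ between runs, so sort before sampling.
    # Length alone would leave ties in set order; the word itself breaks them.
    ordered = sorted(candidates, key=lambda w: (len(w), w))
    return rng.sample(ordered, k)


def delete_chars(text, count, rng):
    # Choose every index against the original text first...
    indices = rng.sample(range(len(text)), count)
    chars = list(text)
    # ...then delete right-to-left so the remaining indices stay valid.
    for i in sorted(indices, reverse=True):
        del chars[i]
    return "".join(chars)


words = {"ghoul", "gremlin", "goblin", "ghost"}
print(pick_targets(words, 2, random.Random(99)))
print(delete_chars("determinism", 3, random.Random(99)))
```

Each call receives a freshly seeded `random.Random`, so repeating either line gives the same answer regardless of how the set happens to be laid out in memory.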
