Monsters for your language games.

     .─') _                                       .─') _                  
    (  OO) )                                     ( OO ) )            
  ░██████  ░██ ░██   ░██               ░██        ░██ ░██                                 
 ░██   ░██ ░██       ░██                ░██        ░██                                     
░██        ░██ ░██░████████  ░███████   ░████████  ░██ ░██░████████   ░████████ ░███████  
░██  █████ ░██ ░██   ░██    ░██('─.░██ ░██    ░██ ░██ ░██░██    ░██ ░██.─')░██ ░██        
░██     ██ ░██ ░██   ░██    ░██( OO ) ╱░██    ░██ ░██ ░██░██    ░██ ░██(OO)░██ ░███████  
  ░██  ░███ ░██ ░██   ░██    ░██    ░██ ░██    ░██ ░██ ░██░██    ░██ ░██ o ░███      ░██ 
  ░█████░█ ░██ ░██   ░████   ░███████  ░██    ░██ ░██ ░██░██    ░██  ░█████░██ ░███████  
                                                                          ░██            
                                                                  ░███████             

                        Every language game breeds monsters.

Glitchlings are utilities for corrupting the text inputs to your language models in deterministic, linguistically principled ways.
Each embodies a different way that documents can be compromised in the wild.

If reinforcement learning environments are games, then Glitchlings are enemies to breathe new life into old challenges.

They do this by breaking surface patterns in the input while keeping the target output intact.

Some Glitchlings are petty nuisances. Some Glitchlings are eldritch horrors.
Together, they create truly nightmarish scenarios for your language models.

After all, what good is general intelligence if it can't handle a little chaos?

-The Curator

Motivation

If your model performs well on a particular task, but not when Glitchlings are present, it's a sign that it hasn't actually generalized to the problem.

Conversely, training a model to perform well in the presence of the types of perturbations introduced by Glitchlings should help it generalize better.

Quickstart

pip install -U glitchlings

The fastest way to get started is to ask my assistant, Auggie, to prepare a custom mix of glitchlings for you:

from glitchlings import Auggie, SAMPLE_TEXT

auggie = (
    Auggie(seed=404)
    .typo(rate=0.015)
    .confusable(rate=0.01)
    .homophone(rate=0.02)
)

print(auggie(SAMPLE_TEXT))

One morning, when Gregor Samsa woke from troubld dreams, he found himself transformed in his bed into a horible vermin. He layed on his armour-like back, and if he lifted his head a little he could see his brown belly, slightly domed and divided by arches into stiff sections. The bedding was hardly able to cover it and seemed ready to slide off any moment. His many legs, pitifully thin compared with the size of the rest of him, waved about helplessly as he looked.

You're more than welcome to summon them directly, if you're feeling brave:

from glitchlings import Gaggle, SAMPLE_TEXT, Typogre, Mim1c, Ekkokin

gaggle = Gaggle(
    [
        Typogre(rate=0.015),
        Mim1c(rate=0.01),
        Ekkokin(rate=0.02),
    ],
    seed=404
)

print(gaggle(SAMPLE_TEXT))

Consult the Glitchlings Usage Guide for end-to-end instructions spanning the Python API, CLI, and third-party integrations.

Your First Battle

Summon your chosen Glitchling (or a few, if ya nasty) and call it on your text or slot it into Dataset.map(...), supplying a seed if desired. Glitchlings are standard Python classes:

from glitchlings import Gaggle, Typogre, Mim1c

custom_typogre = Typogre(rate=0.1)
selective_mimic = Mim1c(rate=0.05, classes=["LATIN", "GREEK"])

gaggle = Gaggle([custom_typogre, selective_mimic], seed=99)
corrupted = gaggle("We Await Silent Tristero's Empire.")
print(corrupted)

Calling a Glitchling on a str transparently calls .corrupt(str, ...) -> str. This means that as long as your glitchlings get along logically, they play nicely with one another.
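
The calling convention above can be sketched with a hypothetical stand-in class (ToyGlitchling is not the library's base class, and its constructor arguments are invented for illustration). The only point is that `__call__` forwards to `corrupt`, so anything that maps `str -> str` can be chained:

```python
class ToyGlitchling:
    """Hypothetical stand-in; not the glitchlings base class."""

    def __init__(self, old: str, new: str) -> None:
        self.old = old
        self.new = new

    def corrupt(self, text: str) -> str:
        # Replace every occurrence of `old` with `new`.
        return text.replace(self.old, self.new)

    def __call__(self, text: str) -> str:
        # Calling the object is the same as calling .corrupt(...).
        return self.corrupt(text)


swap_e = ToyGlitchling("e", "3")
swap_o = ToyGlitchling("o", "0")

# Each call returns a plain str, so the outputs compose.
result = swap_o(swap_e("We Await Silent Tristero's Empire."))
print(result)
```

Whether two real glitchlings "get along logically" still depends on what each one does to the text; the sketch only shows why the plumbing composes.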

When summoned as, or gathered into, a Gaggle, the Glitchlings automatically order themselves into attack waves based on the scope of the change they make:

  1. Document
  2. Paragraph
  3. Sentence
  4. Word
  5. Character

They're horrible little gremlins, but they're not unreasonable.
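
One way to picture the wave ordering is a stable sort on scope. This is a minimal sketch under assumptions: the Scope enum, the ToyGlitchling record, and the tie-breaking rule (roster order within a scope) are all hypothetical and may differ from what a real Gaggle does.

```python
from dataclasses import dataclass
from enum import IntEnum


class Scope(IntEnum):
    # Lower values attack first, mirroring the list above.
    DOCUMENT = 1
    PARAGRAPH = 2
    SENTENCE = 3
    WORD = 4
    CHARACTER = 5


@dataclass(frozen=True)
class ToyGlitchling:
    name: str
    scope: Scope


def attack_order(roster: list[ToyGlitchling]) -> list[ToyGlitchling]:
    # sorted() is stable, so glitchlings sharing a scope keep roster order.
    return sorted(roster, key=lambda g: g.scope)


roster = [
    ToyGlitchling("char_a", Scope.CHARACTER),
    ToyGlitchling("word_a", Scope.WORD),
    ToyGlitchling("char_b", Scope.CHARACTER),
    ToyGlitchling("doc_a", Scope.DOCUMENT),
]
print([g.name for g in attack_order(roster)])
```

Running the broad edits first means character-level noise is applied to the text that the word-level and document-level changes have already settled.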

Command-Line Interface (CLI)

Keyboard warriors can challenge them directly via the glitchlings command (see the generated CLI reference in docs/cli.md for the full contract):

# Discover which glitchlings are currently on the loose.
glitchlings --list
 
# Review the full CLI contract.
glitchlings --help
 
# Run Typogre against the contents of a file and inspect the diff.
glitchlings -g typogre --file documents/report.txt --diff

# Configure glitchlings inline by passing keyword arguments.
glitchlings -g "Typogre(rate=0.05)" "Ghouls just wanna have fun"

# Pipe text straight into the CLI for an on-the-fly corruption.
echo "Beware LLM-written flavor-text" | glitchlings -g mim1c

# Emit a structured Attack report with tokens, token IDs, and metrics.
glitchlings --report json --sample

Configuration Files

Configurations live in plain YAML files so you can version-control experiments without touching code:

# Load a roster from a YAML attack configuration.
glitchlings --config experiments/chaos.yaml "Let slips the glitchlings of war"

# experiments/chaos.yaml
seed: 31337
glitchlings:
  - name: Typogre
    rate: 0.04
  - "Rushmore(rate=0.12, unweighted=True)"
  - name: Zeedub
    parameters:
      rate: 0.02
      characters: ["\u200b", "\u2060"]

Attack on Token

Looking to compare before/after corruption with metrics and stable seeds? Reach for the Attack helper, which bundles tokenization, metrics, and transcript batching into a single utility. It accepts plain list[str] batches, renders quick summary() reports, and can compare multiple tokenizers via Attack.compare(...) when you need a metrics matrix.
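
To make "before/after metrics" concrete without guessing at Attack's real signature, here is a small stand-alone sketch. It uses whitespace splitting as a stand-in tokenizer and the standard library's difflib for a similarity score; the function name token_similarity is hypothetical and is not part of glitchlings.

```python
from difflib import SequenceMatcher


def token_similarity(before: str, after: str) -> float:
    # Whitespace "tokenizer" as a stand-in for a real model tokenizer.
    before_tokens = before.split()
    after_tokens = after.split()
    # ratio() is 2 * matches / total tokens, so 1.0 means identical sequences.
    return SequenceMatcher(None, before_tokens, after_tokens).ratio()


clean = "the quick brown fox"
glitched = "the quikc brown fox"
print(token_similarity(clean, glitched))
```

A real tokenizer will usually split a corrupted word into more pieces than the clean one, which is exactly the kind of drift a token-level comparison is meant to expose.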

Development

Follow the development setup guide for editable installs, automated tests, and tips on enabling the Rust pipeline while you hack on new glitchlings.

Starter 'lings

For maintainability reasons, all Glitchlings have consented to be given nicknames once they're in your care. See the Monster Manual for a complete bestiary.

Typogre

What a nice word, would be a shame if something happened to it.

Fatfinger. Typogre introduces character-level errors (duplicating, dropping, adding, or swapping) based on the layout of a keyboard (QWERTY by default, with Dvorak and Colemak variants built-in).
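
As a rough illustration of the keyboard-layout idea, the sketch below covers only one of the error types named above (hitting a neighboring key). The NEIGHBORS table is a tiny hand-written slice of QWERTY and the function name fat_finger is hypothetical; neither comes from the library.

```python
import random

# Tiny, hand-written slice of a QWERTY adjacency map (an assumption for
# illustration; the library's real layout tables are not shown here).
NEIGHBORS = {
    "a": "qwsz",
    "e": "wrsd",
    "o": "ipkl",
    "s": "awedxz",
    "t": "rfgy",
}


def fat_finger(text: str, rate: float, rng: random.Random) -> str:
    out = []
    for ch in text:
        keys = NEIGHBORS.get(ch.lower())
        if keys and rng.random() < rate:
            out.append(rng.choice(keys))  # hit an adjacent key instead
        else:
            out.append(ch)
    return "".join(out)


sample = "a test sentence to smash"
print(fat_finger(sample, 0.2, random.Random(404)))
```

Because substitutions come from physically adjacent keys, the errors look like the slips a human typist makes, not uniform character noise.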

Mim1c

Wait, was that...?

Confusion. Mim1c replaces non-space characters with Unicode Confusables, characters that are distinct but would not usually confuse a human reader.

Hokey

She's soooooo coooool!

Passionista. Hokey gets a little excited and streeeeetches words for emphasis.

Apocryphal Glitchling contributed by Chloé Nunes

Scannequin

How can a computer need reading glasses?

OCArtifacts. Scannequin mimics optical character recognition errors by swapping visually similar character sequences (like rn↔m, cl↔d, O↔0, l/I/1).
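
A minimal sketch of the sequence-swapping idea, assuming a small one-directional table built from the examples above. The OCR_SWAPS table and the ocr_noise function are hypothetical; the description above implies swaps in both directions, which this toy version leaves out.

```python
import random

# A few of the confusions named above; the choice of pairs is illustrative.
OCR_SWAPS = [("rn", "m"), ("cl", "d"), ("O", "0"), ("l", "1")]


def ocr_noise(text: str, rate: float, rng: random.Random) -> str:
    out = []
    i = 0
    while i < len(text):
        for src, dst in OCR_SWAPS:
            if text.startswith(src, i) and rng.random() < rate:
                out.append(dst)
                i += len(src)
                break
        else:
            # No swap fired at this position; copy the character through.
            out.append(text[i])
            i += 1
    return "".join(out)


print(ocr_noise("modern clOck", 0.5, random.Random(404)))
```

Note that some swaps change the length of the text (rn becomes m), which is why the loop tracks its own position instead of iterating character by character.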

Zeedub

Watch your step around here.

Invisible Ink. Zeedub slips zero-width codepoints between non-space character pairs, forcing models to reason about text whose visible form masks hidden glyphs.
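
The insertion rule can be sketched in a few lines. The two codepoints are the ones used in the sample chaos.yaml; the function name slip_zero_width and its signature are hypothetical, not the library's API.

```python
import random

ZERO_WIDTH = ["\u200b", "\u2060"]  # zero-width space, word joiner


def slip_zero_width(text: str, rate: float, rng: random.Random) -> str:
    out = []
    for left, right in zip(text, text[1:]):
        out.append(left)
        # Only insert between two non-space characters.
        if not left.isspace() and not right.isspace() and rng.random() < rate:
            out.append(rng.choice(ZERO_WIDTH))
    out.append(text[-1:])  # last character (or "" for empty input)
    return "".join(out)


visible = "Watch your step"
hidden = slip_zero_width(visible, 0.5, random.Random(404))
stripped = "".join(ch for ch in hidden if ch not in ZERO_WIDTH)
print(stripped == visible, len(hidden) >= len(visible))
```

The text renders the same, but a tokenizer sees the extra codepoints, so the visible string and the token sequence no longer agree.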

Ekkokin

Did you hear what I heard?

Echo Chamber. Ekkokin swaps words with curated homophones so the text still sounds right while the spelling drifts. Groups are normalised to prevent duplicates and casing is preserved when substitutions fire.

Jargoyle

Uh oh. The worst person you know just bought a thesaurus.

Sesquipedalianism. Jargoyle insufferably replaces words with synonyms at random, without regard for connotational or denotational differences.

Rushmore

I accidentally an entire word.

Tactical Scrambler. Rushmore randomly drops, duplicates, or swaps words in the text to simulate hasty writing, editing mistakes, or transmission errors.

Redactyl

Oops, that was my black highlighter.

FOIA Reply. Redactyl obscures random words in your document like an NSA analyst with a bad sense of humor.

Apocrypha

Cave paintings and oral tradition contain many depictions of strange, otherworldly Glitchlings.
These Apocryphal Glitchlings are said to possess unique abilities or behaviors.
If you encounter one of these elusive beings, please document your findings and share them with The Curator.

Ensuring Reproducible Corruption

Every Glitchling should own its own independent random.Random instance. That means:

  • No random.seed(...) calls touch Python's global RNG.
  • Supplying a seed when you construct a Glitchling (or when you summon(...)) makes its behavior reproducible.
  • Re-running a Gaggle with the same master seed and the same input text (and same external data!) yields identical corruption output.
  • Corruption functions are written to accept an rng parameter internally so that all randomness is centralized and testable.
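
The points above can be sketched with the standard library alone. The names drop_words and derive_seed are hypothetical, and the seed-derivation scheme is an assumption for illustration, not the one a Gaggle actually uses.

```python
import random


def drop_words(text: str, rate: float, rng: random.Random) -> str:
    # All randomness flows through the rng argument, never the global module.
    words = text.split()
    kept = [w for w in words if rng.random() >= rate]
    return " ".join(kept)


def derive_seed(master_seed: int, index: int) -> int:
    # Hypothetical derivation: give each member of a roster its own seed.
    return random.Random(f"{master_seed}:{index}").getrandbits(64)


text = "One morning, when Gregor Samsa woke from troubled dreams"

global_state = random.getstate()
first = drop_words(text, 0.3, random.Random(derive_seed(404, 0)))
second = drop_words(text, 0.3, random.Random(derive_seed(404, 0)))

print(first == second)                    # same seed, same corruption
print(random.getstate() == global_state)  # global RNG untouched
```

Passing the rng in explicitly also makes each corruption function easy to test in isolation: a test can hand it a freshly seeded random.Random and compare outputs.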

At Wits' End?

If you're trying to add a new glitchling and can't seem to make it deterministic, here are some places to look for determinism-breaking code:

  1. Search for direct calls to random.choice or random.shuffle that bypass the provided rng, and for code that relies on set(...) iteration order.
  2. Ensure you sort collections before shuffling or sampling.
  3. Make sure indices are chosen from a stable reference (e.g., original text) when applying length‑changing edits.
  4. Make sure every sort uses enough keys to break ties, so the resulting order is stable.
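
A few of these checks are easier to see in code. The helpers below are hypothetical illustrations of the patterns, not functions from the library:

```python
import random


def pick_targets(words: set[str], k: int, rng: random.Random) -> list[str]:
    # Set iteration order for strings can change between runs
    # (hash randomization), so sort before sampling.
    return rng.sample(sorted(words), k)


def stable_rank(words: set[str]) -> list[str]:
    # Length alone leaves ties; the word itself breaks them.
    return sorted(words, key=lambda w: (len(w), w))


def delete_at(text: str, indices: list[int]) -> str:
    # Indices refer to the ORIGINAL text; applying them right-to-left keeps
    # earlier positions valid even though the string shrinks.
    chars = list(text)
    for i in sorted(set(indices), reverse=True):
        del chars[i]
    return "".join(chars)


vocab = {"ghoul", "gremlin", "imp", "wraith", "ghast"}
a = pick_targets(vocab, 2, random.Random(7))
b = pick_targets(set(reversed(sorted(vocab))), 2, random.Random(7))
print(a == b)
print(stable_rank(vocab))
print(delete_at("glitchling", [0, 5]))
```

The common thread is that the rng should only ever see inputs whose order is fully determined by the text and the seed.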

Download files

Download the file for your platform.

Source Distribution

glitchlings-0.9.5.tar.gz (277.3 kB)

Built Distributions

glitchlings-0.9.5-cp313-cp313-win_amd64.whl (1.4 MB): CPython 3.13, Windows x86-64
glitchlings-0.9.5-cp313-cp313-manylinux_2_28_x86_64.whl (1.7 MB): CPython 3.13, manylinux: glibc 2.28+ x86-64
glitchlings-0.9.5-cp313-cp313-macosx_11_0_universal2.whl (1.5 MB): CPython 3.13, macOS 11.0+ universal2 (ARM64, x86-64)
glitchlings-0.9.5-cp312-cp312-win_amd64.whl (1.4 MB): CPython 3.12, Windows x86-64
glitchlings-0.9.5-cp312-cp312-manylinux_2_28_x86_64.whl (1.7 MB): CPython 3.12, manylinux: glibc 2.28+ x86-64
glitchlings-0.9.5-cp312-cp312-macosx_11_0_universal2.whl (1.5 MB): CPython 3.12, macOS 11.0+ universal2 (ARM64, x86-64)
glitchlings-0.9.5-cp311-cp311-win_amd64.whl (1.4 MB): CPython 3.11, Windows x86-64
glitchlings-0.9.5-cp311-cp311-manylinux_2_28_x86_64.whl (1.7 MB): CPython 3.11, manylinux: glibc 2.28+ x86-64
glitchlings-0.9.5-cp311-cp311-macosx_11_0_universal2.whl (1.5 MB): CPython 3.11, macOS 11.0+ universal2 (ARM64, x86-64)
glitchlings-0.9.5-cp310-cp310-win_amd64.whl (1.4 MB): CPython 3.10, Windows x86-64
glitchlings-0.9.5-cp310-cp310-manylinux_2_28_x86_64.whl (1.7 MB): CPython 3.10, manylinux: glibc 2.28+ x86-64
glitchlings-0.9.5-cp310-cp310-macosx_11_0_universal2.whl (1.5 MB): CPython 3.10, macOS 11.0+ universal2 (ARM64, x86-64)

File details

Details for the file glitchlings-0.9.5.tar.gz.

File metadata

  • Download URL: glitchlings-0.9.5.tar.gz
  • Upload date:
  • Size: 277.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for glitchlings-0.9.5.tar.gz
Algorithm Hash digest
SHA256 ed70f94a897fc82d05b5c24fa2fd58a098e552e351f99b44fa9c4eb9591bd381
MD5 92e5cc33282af3470ddde1637b9a4300
BLAKE2b-256 551bed26102bcf18df720e5d09106631d955ac520e26f25ee5b60b31833a1d60

See more details on using hashes here.

Provenance

The following attestation bundles were made for glitchlings-0.9.5.tar.gz:

Publisher: publish.yml on osoleve/glitchlings

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

