Monsters for your language games.

     .─') _                                       .─') _                  
    (  OO) )                                     ( OO ) )            
  ░██████  ░██ ░██   ░██               ░██        ░██ ░██                                 
 ░██   ░██ ░██       ░██                ░██        ░██                                     
░██        ░██ ░██░████████  ░███████   ░████████  ░██ ░██░████████   ░████████ ░███████  
░██  █████ ░██ ░██   ░██    ░██('─.░██ ░██    ░██ ░██ ░██░██    ░██ ░██.─')░██ ░██        
░██     ██ ░██ ░██   ░██    ░██( OO ) ╱░██    ░██ ░██ ░██░██    ░██ ░██(OO)░██ ░███████  
  ░██  ░███ ░██ ░██   ░██    ░██    ░██ ░██    ░██ ░██ ░██░██    ░██ ░██ o ░███      ░██ 
  ░█████░█ ░██ ░██   ░████   ░███████  ░██    ░██ ░██ ░██░██    ░██  ░█████░██ ░███████  
                                                                          ░██            
                                                                  ░███████             

                        Every language game breeds monsters.

Glitchlings are utilities for corrupting the text inputs to your language models in deterministic, linguistically principled ways.
Each embodies a different way that documents can be compromised in the wild.

If reinforcement learning environments are games, then Glitchlings are enemies to breathe new life into old challenges.

They do this by breaking surface patterns in the input while keeping the target output intact.

Some Glitchlings are petty nuisances. Some Glitchlings are eldritch horrors.
Together, they create truly nightmarish scenarios for your language models.

After all, what good is general intelligence if it can't handle a little chaos?

-The Curator

Motivation

If your model performs well on a particular task, but not when Glitchlings are present, it's a sign that it hasn't actually generalized to the problem.

Conversely, training a model to perform well in the presence of the types of perturbations introduced by Glitchlings should help it generalize better.

Quickstart

pip install -U glitchlings

The fastest way to get started is to ask my assistant, Auggie, to prepare a custom mix of glitchlings for you:

from glitchlings import Auggie, SAMPLE_TEXT

auggie = (
    Auggie(seed=404)
    .typo(rate=0.015)
    .confusable(rate=0.01)
    .homophone(rate=0.02)
)

print(auggie(SAMPLE_TEXT))

One morning, when Gregor Samsa woke from troubld dreams, he found himself transformed in his bed into a horible vermin. He layed on his armour-like back, and if he lifted his head a little he could see his brown belly, slightly domed and divided by arches into stiff sections. The bedding was hardly able to cover it and seemed ready to slide off any moment. His many legs, pitifully thin compared with the size of the rest of him, waved about helplessly as he looked.

You're more than welcome to summon them directly, if you're feeling brave:

from glitchlings import Gaggle, SAMPLE_TEXT, Typogre, Mim1c, Ekkokin

gaggle = Gaggle(
    [
        Typogre(rate=0.015),
        Mim1c(rate=0.01),
        Ekkokin(rate=0.02),
    ],
    seed=404
)

Consult the Glitchlings Usage Guide for end-to-end instructions spanning the Python API, CLI, and third-party integrations.

Your First Battle

Summon your chosen Glitchling (or a few, if ya nasty) and call it on your text or slot it into Dataset.map(...), supplying a seed if desired. Glitchlings are standard Python classes:

from glitchlings import Gaggle, Typogre, Mim1c

custom_typogre = Typogre(rate=0.1)
selective_mimic = Mim1c(rate=0.05, classes=["LATIN", "GREEK"])

gaggle = Gaggle([custom_typogre, selective_mimic], seed=99)
corrupted = gaggle("We Await Silent Tristero's Empire.")
print(corrupted)

Calling a Glitchling on a str transparently calls .corrupt(str, ...) -> str. This means that as long as your glitchlings get along logically, they play nicely with one another.

When summoned as or gathered into a Gaggle, the Glitchlings will automatically order themselves into attack waves, based on the scope of the change they make:

  1. Document
  2. Paragraph
  3. Sentence
  4. Word
  5. Character

They're horrible little gremlins, but they're not unreasonable.
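
The wave ordering above can be pictured as a stable sort on a scope rank. This is a minimal sketch, assuming a hypothetical `Scope` enum and plain `(name, scope)` tuples; the library's own ordering, including how it breaks ties within a scope, may differ.

```python
from enum import IntEnum


class Scope(IntEnum):
    """Hypothetical scope ranks mirroring the list above (widest first)."""

    DOCUMENT = 1
    PARAGRAPH = 2
    SENTENCE = 3
    WORD = 4
    CHARACTER = 5


def order_into_waves(roster):
    """Sort (name, scope) pairs widest scope first.

    sorted() is stable, so entries sharing a scope keep their roster order.
    """
    return sorted(roster, key=lambda entry: entry[1])


# Illustrative roster; the names are placeholders, not library identifiers.
roster = [
    ("typo", Scope.CHARACTER),
    ("word_drop", Scope.WORD),
    ("confusable", Scope.CHARACTER),
    ("sentence_shuffle", Scope.SENTENCE),
]
waves = order_into_waves(roster)
print([name for name, _ in waves])
```

Running wide-scope edits first means character-level noise is applied to text whose sentences and words have already settled.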

Command-Line Interface (CLI)

Keyboard warriors can challenge them directly via the glitchlings command (see the generated CLI reference in docs/cli.md for the full contract):

# Discover which glitchlings are currently on the loose.
glitchlings --list
 
# Review the full CLI contract.
glitchlings --help
 
# Run Typogre against the contents of a file and inspect the diff.
glitchlings -g typogre --file documents/report.txt --diff

# Configure glitchlings inline by passing keyword arguments.
glitchlings -g "Typogre(rate=0.05)" "Ghouls just wanna have fun"

# Pipe text straight into the CLI for an on-the-fly corruption.
echo "Beware LLM-written flavor-text" | glitchlings -g mim1c

Attack Configurations

Attack configurations live in plain YAML files so you can version-control experiments without touching code:

# Load a roster from a YAML attack configuration.
glitchlings --config experiments/chaos.yaml "Let slips the glitchlings of war"

# experiments/chaos.yaml
seed: 31337
glitchlings:
  - name: Typogre
    rate: 0.04
  - "Rushmore(rate=0.12, unweighted=True)"
  - name: Zeedub
    parameters:
      rate: 0.02
      characters: ["\u200b", "\u2060"]
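
Inline entries such as `"Rushmore(rate=0.12, unweighted=True)"` look like Python calls. As a rough illustration only, a string in that shape could be split into a name and keyword arguments with the standard `ast` module and no `eval()`. The `parse_spec` helper is hypothetical, and the real CLI may parse these entries quite differently.

```python
import ast


def parse_spec(spec):
    """Split 'Name(key=value, ...)' into (name, kwargs) without eval().

    Hypothetical helper for illustration; not part of glitchlings.
    """
    node = ast.parse(spec, mode="eval").body
    if isinstance(node, ast.Name):  # bare name, e.g. "Typogre"
        return node.id, {}
    if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
        raise ValueError(f"unsupported spec: {spec!r}")
    if node.args:
        raise ValueError("this sketch accepts keyword arguments only")
    # literal_eval only accepts literals, so arbitrary expressions are rejected.
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return node.func.id, kwargs


print(parse_spec("Rushmore(rate=0.12, unweighted=True)"))
```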

Attack on Token

Looking to compare before/after corruption with metrics and stable seeds? Reach for the Attack helper, which bundles tokenization, metrics, and transcript batching into a single utility.

Development

Follow the development setup guide for editable installs, automated tests, and tips on enabling the Rust pipeline while you hack on new glitchlings.

Starter 'lings

For maintainability reasons, all Glitchlings have consented to be given nicknames once they're in your care. See the Monster Manual for a complete bestiary.

Typogre

What a nice word, would be a shame if something happened to it.

Fatfinger. Typogre introduces character-level errors (duplicating, dropping, adding, or swapping) based on the layout of a keyboard (QWERTY by default, with Dvorak and Colemak variants built-in).
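
To make the idea concrete, here is a stdlib-only sketch of the keyboard-neighbour substitution case. It is not Typogre's implementation: the `NEIGHBOURS` map is a tiny hand-written QWERTY excerpt, `fat_finger` is a hypothetical name, and dropping, duplicating, and swapping are left out.

```python
import random

# Tiny hand-written QWERTY neighbour map, for illustration only.
NEIGHBOURS = {
    "a": "qwsz", "e": "wsdr", "i": "ujko", "o": "iklp",
    "n": "bhjm", "r": "edft", "s": "awedxz", "t": "rfgy",
}


def fat_finger(text, rate, seed):
    """Replace roughly `rate` of the mapped letters with an adjacent key."""
    rng = random.Random(seed)  # private RNG; the global one is untouched
    out = []
    for ch in text:
        keys = NEIGHBOURS.get(ch.lower())
        if keys and rng.random() < rate:
            sub = rng.choice(keys)
            out.append(sub.upper() if ch.isupper() else sub)
        else:
            out.append(ch)
    return "".join(out)


print(fat_finger("What a nice word.", rate=0.3, seed=404))
```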

Mim1c

Wait, was that...?

Confusion. Mim1c replaces non-space characters with Unicode Confusables, characters that are distinct but would not usually confuse a human reader.
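
A minimal sketch of the substitution idea, assuming a five-entry Latin-to-Cyrillic table in place of the full Unicode Confusables data. The `mimic` function and `CONFUSABLES` table are illustrative names, not the library's API.

```python
import random

# A handful of Latin letters and their Cyrillic look-alikes.
CONFUSABLES = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
}


def mimic(text, rate, seed):
    """Swap roughly `rate` of the mapped characters for a look-alike."""
    rng = random.Random(seed)
    return "".join(
        CONFUSABLES[ch] if ch in CONFUSABLES and rng.random() < rate else ch
        for ch in text
    )


corrupted = mimic("ocean", rate=1.0, seed=0)
print(corrupted == "ocean")  # False: the strings differ even if they render alike
```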

Hokey

She's soooooo coooool!

Passionista. Hokey gets a little excited and streeeeetches words for emphasis.

Apocryphal Glitchling contributed by Chloé Nunes
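
Hokey's stretching can be approximated by repeating a vowel inside selected words. The sketch below assumes a deliberately simple rule (lengthen the last vowel by a fixed amount); `stretch` and its `extra` parameter are hypothetical, and the real Glitchling may choose words and stretch sites differently.

```python
import random
import re


def stretch(text, rate, seed, extra=3):
    """Lengthen the last vowel of roughly `rate` of the words by `extra` copies."""
    rng = random.Random(seed)

    def lengthen(match):
        word = match.group(0)
        if rng.random() >= rate:
            return word
        # Walk backwards to find the last vowel in the word.
        for i in range(len(word) - 1, -1, -1):
            if word[i] in "aeiouAEIOU":
                return word[:i] + word[i] * (1 + extra) + word[i + 1:]
        return word  # no vowel to stretch

    return re.sub(r"[A-Za-z]+", lengthen, text)


print(stretch("so cool", rate=1.0, seed=0))  # soooo coooool
```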

Scannequin

How can a computer need reading glasses?

OCArtifacts. Scannequin mimics optical character recognition errors by swapping visually similar character sequences (like rn↔m, cl↔d, O↔0, l/I/1).
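
The swaps listed above can be sketched with a single regular expression that tries longer sequences first, so `rn` is matched before any single character. This is an illustration under assumptions: `OCR_SWAPS` covers only a few pairs, and `ocr_noise` is a hypothetical name.

```python
import random
import re

# A few bidirectional OCR confusions, for illustration only.
OCR_SWAPS = {
    "rn": "m", "m": "rn",
    "cl": "d", "d": "cl",
    "O": "0", "0": "O",
    "l": "1", "1": "l",
}
# Longest keys first so multi-character sequences win over single characters.
PATTERN = re.compile(
    "|".join(sorted(map(re.escape, OCR_SWAPS), key=len, reverse=True))
)


def ocr_noise(text, rate, seed):
    """Swap roughly `rate` of the confusable sequences for their look-alikes."""
    rng = random.Random(seed)

    def swap(match):
        found = match.group(0)
        return OCR_SWAPS[found] if rng.random() < rate else found

    return PATTERN.sub(swap, text)


print(ocr_noise("modern", rate=1.0, seed=0))  # rnoclem
```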

Zeedub

Watch your step around here.

Invisible Ink. Zeedub slips zero-width codepoints between non-space character pairs, forcing models to reason about text whose visible form masks hidden glyphs.
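
A minimal sketch of the insertion rule, assuming the two codepoints from the YAML example earlier (`\u200b` and `\u2060`) as the default palette. `slip_zero_width` is a hypothetical name, not the library's function.

```python
import random

ZERO_WIDTH = ("\u200b", "\u2060")  # ZERO WIDTH SPACE, WORD JOINER


def slip_zero_width(text, rate, seed, characters=ZERO_WIDTH):
    """Insert a zero-width codepoint between roughly `rate` of the
    adjacent non-space character pairs."""
    rng = random.Random(seed)
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        nxt = text[i + 1:i + 2]  # empty string at the end of the text
        if nxt and not ch.isspace() and not nxt.isspace() and rng.random() < rate:
            out.append(rng.choice(characters))
    return "".join(out)


hidden = slip_zero_width("ab cd", rate=1.0, seed=0)
print(len("ab cd"), len(hidden))  # 5 7: two invisible codepoints were added
```

The visible text is unchanged, so a plain string comparison or a tokenizer is needed to notice the difference.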

Ekkokin

Did you hear what I heard?

Echo Chamber. Ekkokin swaps words with curated homophones so the text still sounds right while the spelling drifts. Groups are normalised to prevent duplicates and casing is preserved when substitutions fire.
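
The two behaviours described here, swapping within a homophone group and preserving casing, can be sketched as follows. The three groups in `HOMOPHONES` stand in for the curated list, and `echo` and `match_case` are illustrative helpers rather than library functions.

```python
import random
import re

# Stand-in homophone groups; the real curated list is much larger.
HOMOPHONES = [("their", "there"), ("to", "too", "two"), ("hear", "here")]
LOOKUP = {word: group for group in HOMOPHONES for word in group}


def match_case(template, word):
    """Copy the casing pattern of `template` onto `word`."""
    if template.isupper():
        return word.upper()
    if template[:1].isupper():
        return word.capitalize()
    return word


def echo(text, rate, seed):
    """Swap roughly `rate` of the known words for another group member."""
    rng = random.Random(seed)

    def swap(match):
        word = match.group(0)
        group = LOOKUP.get(word.lower())
        if group is None or rng.random() >= rate:
            return word
        options = [w for w in group if w != word.lower()]
        return match_case(word, rng.choice(options))

    return re.sub(r"[A-Za-z']+", swap, text)


print(echo("Hear me", rate=1.0, seed=0))  # Here me
```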

Jargoyle

Uh oh. The worst person you know just bought a thesaurus.

Sesquipedalianism. Jargoyle insufferably replaces words with synonyms at random, without regard for connotational or denotational differences.

Rushmore

I accidentally an entire word.

Tactical Scrambler. Rushmore randomly drops, duplicates, or swaps words in the text to simulate hasty writing, editing mistakes, or transmission errors.
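
A rough sketch of the three word-level operations, assuming whitespace tokenization and an equal chance of each action. `scramble_words` is a hypothetical helper; note that it collapses runs of whitespace, which the real Glitchling may not do.

```python
import random


def scramble_words(text, rate, seed):
    """Drop, duplicate, or swap roughly `rate` of the words."""
    rng = random.Random(seed)
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        if rng.random() < rate:
            action = rng.choice(["drop", "duplicate", "swap"])
            if action == "drop":
                i += 1
                continue
            if action == "duplicate":
                out.extend([words[i], words[i]])
                i += 1
                continue
            if i + 1 < len(words):  # swap with the following word
                out.extend([words[i + 1], words[i]])
                i += 2
                continue
            # A swap on the final word has no partner; keep the word as-is.
        out.append(words[i])
        i += 1
    return " ".join(out)


print(scramble_words("I accidentally dropped an entire word", rate=0.3, seed=99))
```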

Redactyl

Oops, that was my black highlighter.

FOIA Reply. Redactyl obscures random words in your document like an NSA analyst with a bad sense of humor.
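
As an illustration, redaction can be sketched as replacing each selected word with a run of block characters of the same length. The `redact` function and the choice of `\u2588` (FULL BLOCK) as the default mark are assumptions for this sketch.

```python
import random
import re


def redact(text, rate, seed, mark="\u2588"):
    """Black out roughly `rate` of the words, preserving their length."""
    rng = random.Random(seed)

    def blackout(match):
        word = match.group(0)
        return mark * len(word) if rng.random() < rate else word

    return re.sub(r"\w+", blackout, text)


print(redact("top secret", rate=1.0, seed=0))  # ███ ██████
```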

Apocrypha

Cave paintings and oral tradition contain many depictions of strange, otherworldly Glitchlings.
These Apocryphal Glitchlings are said to possess unique abilities or behaviors.
If you encounter one of these elusive beings, please document your findings and share them with The Curator.

Ensuring Reproducible Corruption

Every Glitchling should own its own independent random.Random instance. That means:

  • No random.seed(...) calls touch Python's global RNG.
  • Supplying a seed when you construct a Glitchling (or when you summon(...)) makes its behavior reproducible.
  • Re-running a Gaggle with the same master seed and the same input text (and same external data!) yields identical corruption output.
  • Corruption functions are written to accept an rng parameter internally so that all randomness is centralized and testable.
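
One way to satisfy these guarantees is to derive a private `random.Random` for each member from the master seed. The sketch below is an assumption about how that could look, not the library's derivation scheme: `derive_seed` and `make_rngs` are hypothetical, and hashing with `hashlib` is used because Python's built-in `hash()` of strings varies between processes.

```python
import hashlib
import random


def derive_seed(master_seed, name, index):
    """Hypothetical derivation: hash the master seed with name and position."""
    payload = f"{master_seed}:{name}:{index}".encode()
    digest = hashlib.blake2b(payload, digest_size=8).digest()
    return int.from_bytes(digest, "big")


def make_rngs(master_seed, names):
    """Give every member its own independent random.Random instance."""
    return {
        name: random.Random(derive_seed(master_seed, name, i))
        for i, name in enumerate(names)
    }


global_state = random.getstate()

rngs = make_rngs(404, ["typo", "confusable"])
first = [rngs["typo"].random(), rngs["confusable"].random()]

rngs = make_rngs(404, ["typo", "confusable"])
second = [rngs["typo"].random(), rngs["confusable"].random()]

print(first == second)                    # True: same master seed, same draws
print(random.getstate() == global_state)  # True: the global RNG was never touched
```

Because each member draws from its own stream, adding or removing one member does not shift the random choices made by the others.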

At Wits' End?

If you're trying to add a new glitchling and can't seem to make it deterministic, here are some places to look for determinism-breaking code:

  1. Search for any direct calls to random.choice, random.shuffle, or set(...) ordering without going through the provided rng.
  2. Ensure you sort collections before shuffling or sampling.
  3. Make sure indices are chosen from a stable reference (e.g., original text) when applying length‑changing edits.
  4. Make sure there are enough sort keys to maintain stability.
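
Two of the checks above, sorting before sampling and choosing indices against the original text, can be sketched like this. Both helpers are illustrative and use only the standard library.

```python
import random


def pick_targets(candidates, k, rng):
    """Sample k items from a set reproducibly.

    Set iteration order is not guaranteed to match between runs, so sort
    first. (random.sample also rejects sets outright since Python 3.11.)
    """
    ordered = sorted(candidates)
    return rng.sample(ordered, k)


def apply_deletions(text, indices):
    """Delete characters at positions measured on the ORIGINAL text.

    Working right to left keeps earlier indices valid as the text shrinks.
    """
    chars = list(text)
    for i in sorted(set(indices), reverse=True):
        del chars[i]
    return "".join(chars)


print(pick_targets({"b", "a", "c"}, 2, random.Random(7)))
print(apply_deletions("glitch", [0, 3]))  # lich
```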

