Monsters for your language games.

     .─') _                                       .─') _                  
    (  OO) )                                     ( OO ) )            
  ░██████  ░██ ░██   ░██               ░██        ░██ ░██                                 
 ░██   ░██ ░██       ░██                ░██        ░██                                     
░██        ░██ ░██░████████  ░███████   ░████████  ░██ ░██░████████   ░████████ ░███████  
░██  █████ ░██ ░██   ░██    ░██('─.░██ ░██    ░██ ░██ ░██░██    ░██ ░██.─')░██ ░██        
░██     ██ ░██ ░██   ░██    ░██( OO ) ╱░██    ░██ ░██ ░██░██    ░██ ░██(OO)░██ ░███████  
  ░██  ░███ ░██ ░██   ░██    ░██    ░██ ░██    ░██ ░██ ░██░██    ░██ ░██ o ░███      ░██ 
  ░█████░█ ░██ ░██   ░████   ░███████  ░██    ░██ ░██ ░██░██    ░██  ░█████░██ ░███████  
                                                                          ░██            
                                                                  ░███████             

                        Every language game breeds monsters.

Glitchlings are utilities for corrupting the text inputs to your language models in deterministic, linguistically principled ways.
Each embodies a different way that documents can be compromised in the wild.

If reinforcement learning environments are games, then Glitchlings are enemies to breathe new life into old challenges.

They do this by breaking surface patterns in the input while keeping the target output intact.

Some Glitchlings are petty nuisances. Some Glitchlings are eldritch horrors.
Together, they create truly nightmarish scenarios for your language models.

After all, what good is general intelligence if it can't handle a little chaos?

-The Curator

Motivation

If your model performs well on a particular task, but not when Glitchlings are present, it's a sign that it hasn't actually generalized to the problem.

Conversely, training a model to perform well in the presence of the types of perturbations introduced by Glitchlings should help it generalize better.

Quickstart

pip install -U glitchlings

The fastest way to get started is to ask my assistant, Auggie, to prepare a custom mix of glitchlings for you:

from glitchlings import Auggie, SAMPLE_TEXT

auggie = (
    Auggie(seed=404)
    .typo(rate=0.015)
    .confusable(rate=0.01)
    .homophone(rate=0.02)
)

print(auggie(SAMPLE_TEXT))

One morning, when Gregor Samsa woke from troubld dreams, he found himself transformed in his bed into a horible vermin. He layed on his armour-like back, and if he lifted his head a little he could see his brown belly, slightly domed and divided by arches into stiff sections. The bedding was hardly able to cover it and seemed ready to slide off any moment. His many legs, pitifully thin compared with the size of the rest of him, waved about helplessly as he looked.

You're more than welcome to summon them directly, if you're feeling brave:

from glitchlings import Gaggle, SAMPLE_TEXT, Typogre, Mim1c, Wherewolf

gaggle = Gaggle(
    [
        Typogre(rate=0.015),
        Mim1c(rate=0.01),
        Wherewolf(rate=0.02),
    ],
    seed=404
)

Consult the Glitchlings Usage Guide for end-to-end instructions spanning the Python API, CLI, and third-party integrations.

Your First Battle

Summon your chosen Glitchling (or a few, if ya nasty) and call it on your text or slot it into Dataset.map(...), supplying a seed if desired. Glitchlings are standard Python classes:

from glitchlings import Gaggle, Typogre, Mim1c

custom_typogre = Typogre(rate=0.1)
selective_mimic = Mim1c(rate=0.05, classes=["LATIN", "GREEK"])

gaggle = Gaggle([custom_typogre, selective_mimic], seed=99)
corrupted = gaggle("We Await Silent Tristero's Empire.")
print(corrupted)

Calling a Glitchling on a str transparently calls .corrupt(str, ...) -> str. This means that as long as your glitchlings get along logically, they play nicely with one another.

When summoned as or gathered into a Gaggle, the Glitchlings will automatically order themselves into attack waves, based on the scope of the change they make:

  1. Document
  2. Paragraph
  3. Sentence
  4. Word
  5. Character

They're horrible little gremlins, but they're not unreasonable.
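
The wave ordering can be pictured with a small stand-alone sketch. It does not use the library: `Scope`, `ToyGlitchling`, and `attack_waves` are hypothetical names, and keeping roster order within a scope is an assumption about tie-breaking rather than documented behaviour.

```python
from dataclasses import dataclass
from enum import IntEnum


class Scope(IntEnum):
    """Hypothetical scope levels, broadest first (mirrors the list above)."""
    DOCUMENT = 1
    PARAGRAPH = 2
    SENTENCE = 3
    WORD = 4
    CHARACTER = 5


@dataclass(frozen=True)
class ToyGlitchling:
    name: str
    scope: Scope


def attack_waves(roster):
    # sorted() is stable, so glitchlings sharing a scope keep their roster order.
    return sorted(roster, key=lambda g: g.scope)


roster = [
    ToyGlitchling("typo", Scope.CHARACTER),
    ToyGlitchling("word_drop", Scope.WORD),
    ToyGlitchling("confusable", Scope.CHARACTER),
]
print([g.name for g in attack_waves(roster)])
# → ['word_drop', 'typo', 'confusable']
```

Running broad edits first means word-level changes are settled before character-level noise is layered on top.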

Command-Line Interface (CLI)

Keyboard warriors can challenge them directly via the glitchlings command (see the generated CLI reference in docs/cli.md for the full contract):

# Discover which glitchlings are currently on the loose.
glitchlings --list
 
# Review the full CLI contract.
glitchlings --help
 
# Run Typogre against the contents of a file and inspect the diff.
glitchlings -g typogre --input-file documents/report.txt --diff

# Configure glitchlings inline by passing keyword arguments.
glitchlings -g "Typogre(rate=0.05)" "Ghouls just wanna have fun"

# Pipe text straight into the CLI for an on-the-fly corruption.
echo "Beware LLM-written flavor-text" | glitchlings -g mim1c

# Emit an Attack summary with metrics and counts.
glitchlings --attack --sample

# Emit a full Attack report with tokens, token IDs, and metrics.
glitchlings --report --sample

Configuration Files

Configurations live in plain YAML files so you can version-control experiments without touching code:

# Load a roster from a YAML attack configuration.
glitchlings --config experiments/chaos.yaml "Let slips the glitchlings of war"

# experiments/chaos.yaml
seed: 31337
glitchlings:
  - name: Typogre
    rate: 0.04
  - "Rushmore(rate=0.12, unweighted=True)"
  - name: Zeedub
    parameters:
      rate: 0.02
      characters: ["\u200b", "\u2060"]
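
The roster above mixes three entry shapes: a mapping with inline parameters, a constructor-style string, and a mapping with a nested `parameters` block. The sketch below shows one way such entries could be normalised into `(name, kwargs)` pairs. It is not the library's loader: `parse_entry` is a hypothetical helper, and the `config` dict is hand-written to stand in for what a YAML parser would plausibly return.

```python
import ast

# Hand-written stand-in for the parsed contents of experiments/chaos.yaml.
config = {
    "seed": 31337,
    "glitchlings": [
        {"name": "Typogre", "rate": 0.04},
        "Rushmore(rate=0.12, unweighted=True)",
        {"name": "Zeedub", "parameters": {"rate": 0.02, "characters": ["\u200b", "\u2060"]}},
    ],
}


def parse_entry(entry):
    """Normalise one roster entry into a (name, kwargs) pair (hypothetical helper)."""
    if isinstance(entry, str):
        node = ast.parse(entry, mode="eval").body
        if isinstance(node, ast.Name):  # bare name, e.g. "Typogre"
            return node.id, {}
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and not node.args
            and all(kw.arg is not None for kw in node.keywords)
        ):
            # literal_eval only accepts literals, so no code from the file is executed.
            return node.func.id, {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        raise ValueError(f"unsupported glitchling spec: {entry!r}")
    # Mapping form: merge a nested "parameters" block with any inline keys.
    kwargs = dict(entry.get("parameters", {}))
    kwargs.update({k: v for k, v in entry.items() if k not in ("name", "parameters")})
    return entry["name"], kwargs


for name, kwargs in map(parse_entry, config["glitchlings"]):
    print(name, kwargs)
```

Parsing the string form with `ast` rather than `eval` keeps a config file from running arbitrary code.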

Attack on Token

Looking to compare before/after corruption with metrics and stable seeds? Reach for the Attack helper, which bundles tokenization, metrics, and transcript batching into a single utility. It accepts plain list[str] batches, renders quick summary() reports, and can compare multiple tokenizers via Attack.compare(...) when you need a metrics matrix.

Development

Follow the development setup guide for editable installs, automated tests, and tips on enabling the Rust pipeline while you hack on new glitchlings.

Starter 'lings

For maintainability reasons, all Glitchlings have consented to be given nicknames once they're in your care. See the Monster Manual for a complete bestiary.

Typogre

What a nice word, would be a shame if something happened to it.

Fatfinger. Typogre introduces character-level errors (duplicating, dropping, adding, or swapping) based on the layout of a keyboard (QWERTY by default, with Dvorak and Colemak variants built-in).

Mim1c

Wait, was that...?

Confusion. Mim1c replaces non-space characters with Unicode Confusables: characters with distinct code points that look alike closely enough that a human reader would usually not notice the swap.

Hokey

She's soooooo coooool!

Passionista. Hokey gets a little excited and streeeeetches words for emphasis.

Apocryphal Glitchling contributed by Chloé Nunes

Scannequin

How can a computer need reading glasses?

OCArtifacts. Scannequin mimics optical character recognition errors by swapping visually similar character sequences (like rn↔m, cl↔d, O↔0, l/I/1).

Zeedub

Watch your step around here.

Invisible Ink. Zeedub slips zero-width codepoints between non-space character pairs, forcing models to reason about text whose visible form masks hidden glyphs.
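
To make the idea concrete, here is a toy version of the trick written from the description above, not taken from Zeedub's source. The function name `slip_zero_width` and the per-pair `rate` semantics are assumptions.

```python
import random

ZERO_WIDTH = ["\u200b", "\u2060"]  # zero-width space, word joiner


def slip_zero_width(text, rate, rng):
    """Insert a zero-width codepoint between adjacent non-space characters."""
    out = []
    for left, right in zip(text, text[1:]):
        out.append(left)
        # Only consider gaps where neither neighbour is whitespace.
        if not left.isspace() and not right.isspace() and rng.random() < rate:
            out.append(rng.choice(ZERO_WIDTH))
    out.append(text[-1:])  # final character (empty string for empty input)
    return "".join(out)


original = "Watch your step"
corrupted = slip_zero_width(original, rate=0.3, rng=random.Random(404))
print(corrupted == original, len(original), len(corrupted))
```

The corrupted string renders the same as the original in most fonts, yet it compares unequal and tokenizes differently.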

Wherewolf

Did you hear what I heard?

Echo Chamber. Wherewolf swaps words with curated homophones so the text still sounds right while the spelling drifts. Groups are normalised to prevent duplicates and casing is preserved when substitutions fire.
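
The group normalisation and case preservation described above might look roughly like the following. This is a sketch under assumptions: the homophone groups are a tiny hand-picked sample, `match_case` and `swap_homophones` are hypothetical names, and words are split on single spaces with punctuation ignored.

```python
import random

# Tiny illustrative sample; note the deliberate duplicate in the first group.
HOMOPHONE_GROUPS = [("Their", "there", "their"), ("hear", "here"), ("knight", "night")]

# Normalise each group (lowercase, de-duplicated, sorted), then map each word to its alternatives.
LOOKUP = {}
for group in HOMOPHONE_GROUPS:
    members = sorted({w.lower() for w in group})
    for word in members:
        LOOKUP[word] = [m for m in members if m != word]


def match_case(template, word):
    """Copy the casing pattern of `template` onto `word`."""
    if template.isupper():
        return word.upper()
    if template[:1].isupper():
        return word.capitalize()
    return word


def swap_homophones(text, rate, rng):
    words = text.split(" ")
    for i, w in enumerate(words):
        options = LOOKUP.get(w.lower())
        if options and rng.random() < rate:
            words[i] = match_case(w, rng.choice(options))
    return " ".join(words)


print(swap_homophones("Did you hear what I heard", 1.0, random.Random(404)))
# → Did you here what I heard
```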

Jargoyle

Uh oh. The worst person you know just bought a thesaurus.

Sesquipedalianism. Jargoyle insufferably replaces words with synonyms at random, without regard for connotational or denotational differences.

Rushmore

I accidentally an entire word.

Tactical Scrambler. Rushmore randomly drops, duplicates, or swaps words in the text to simulate hasty writing, editing mistakes, or transmission errors.

Redactyl

Oops, that was my black highlighter.

FOIA Reply. Redactyl obscures random words in your document like an NSA analyst with a bad sense of humor.

Apocrypha

Cave paintings and oral tradition contain many depictions of strange, otherworldly Glitchlings.
These Apocryphal Glitchlings are said to possess unique abilities or behaviors.
If you encounter one of these elusive beings, please document your findings and share them with The Curator.

Ensuring Reproducible Corruption

Every Glitchling should own its own independent random.Random instance. That means:

  • No random.seed(...) calls touch Python's global RNG.
  • Supplying a seed when you construct a Glitchling (or when you summon(...)) makes its behavior reproducible.
  • Re-running a Gaggle with the same master seed and the same input text (and same external data!) yields identical corruption output.
  • Corruption functions are written to accept an rng parameter internally so that all randomness is centralized and testable.
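
A minimal sketch of the pattern these bullets describe, using only the standard library. The seed-derivation scheme in `derive_seed` and the `ToyGlitchling` class are hypothetical, chosen for illustration; the library may derive per-glitchling seeds differently.

```python
import random


def derive_seed(master_seed, index):
    # Hypothetical scheme: string seeds are hashed deterministically by random.Random.
    return random.Random(f"{master_seed}:{index}").getrandbits(64)


class ToyGlitchling:
    def __init__(self, seed):
        self.rng = random.Random(seed)  # private RNG; the global one is never touched

    def corrupt(self, text):
        if not text:
            return text
        i = self.rng.randrange(len(text))
        return text[:i] + text[i] + text[i:]  # duplicate one character


def run_gaggle(text, master_seed, size=3):
    glitchlings = [ToyGlitchling(derive_seed(master_seed, i)) for i in range(size)]
    for glitchling in glitchlings:
        text = glitchling.corrupt(text)
    return text


state_before = random.getstate()
first = run_gaggle("chaos", master_seed=404)
second = run_gaggle("chaos", master_seed=404)
print(first == second)                     # same master seed, same output
print(random.getstate() == state_before)   # global RNG state is unchanged
```

Because each toy glitchling draws from its own `random.Random`, adding or removing one member does not shift the random stream seen by the others.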

At Wits' End?

If you're trying to add a new glitchling and can't seem to make it deterministic, here are some places to look for determinism-breaking code:

  1. Search for direct calls to random.choice or random.shuffle that bypass the provided rng, and for any code that relies on set(...) iteration order.
  2. Ensure you sort collections before shuffling or sampling.
  3. Make sure indices are chosen from a stable reference (e.g., original text) when applying length‑changing edits.
  4. Make sure sort keys include enough tie-breakers that the resulting order is the same on every run.
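
The checklist above can be illustrated with a few stand-alone helpers. These are sketches of the general techniques, not code from the library; `pick_targets`, `duplicate_chars`, and `stable_order` are hypothetical names.

```python
import random


def pick_targets(candidates, k, rng):
    # Iteration order of a set of strings can change between interpreter runs,
    # so sort first and sample from the sorted list using the provided rng.
    return rng.sample(sorted(candidates), k)


def duplicate_chars(text, k, rng):
    # Choose every index against the ORIGINAL text, then apply edits right to
    # left so earlier insertions cannot shift the positions still to be edited.
    positions = sorted(rng.sample(range(len(text)), k), reverse=True)
    chars = list(text)
    for i in positions:
        chars.insert(i, chars[i])
    return "".join(chars)


def stable_order(words):
    # Length alone leaves ties; the word itself acts as a tie-breaker so the
    # result does not depend on the input order.
    return sorted(words, key=lambda w: (len(w), w))


words = {"ghoul", "ogre", "wolf", "mimic"}
print(pick_targets(words, 2, random.Random(1)))
print(duplicate_chars("chaos", 2, random.Random(1)))
print(stable_order(words))
# last line prints ['ogre', 'wolf', 'ghoul', 'mimic']
```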
