Monsters for your language games.

     .─') _                                       .─') _                  
    (  OO) )                                     ( OO ) )            
  ░██████  ░██ ░██   ░██               ░██        ░██ ░██                                 
 ░██   ░██ ░██       ░██                ░██        ░██                                     
░██        ░██ ░██░████████  ░███████   ░████████  ░██ ░██░████████   ░████████ ░███████  
░██  █████ ░██ ░██   ░██    ░██('─.░██ ░██    ░██ ░██ ░██░██    ░██ ░██.─')░██ ░██        
░██     ██ ░██ ░██   ░██    ░██( OO ) ╱░██    ░██ ░██ ░██░██    ░██ ░██(OO)░██ ░███████  
  ░██  ░███ ░██ ░██   ░██    ░██    ░██ ░██    ░██ ░██ ░██░██    ░██ ░██ o ░███      ░██ 
  ░█████░█ ░██ ░██   ░████   ░███████  ░██    ░██ ░██ ░██░██    ░██  ░█████░██ ░███████  
                                                                          ░██            
                                                                  ░███████             

                        Every language game breeds monsters.

Glitchlings are utilities for corrupting the text inputs to your language models in deterministic, linguistically principled ways.
Each embodies a different way that documents can be compromised in the wild.

If reinforcement learning environments are games, then Glitchlings are enemies to breathe new life into old challenges.

They do this by breaking surface patterns in the input while keeping the target output intact.

Some Glitchlings are petty nuisances. Some Glitchlings are eldritch horrors.
Together, they create truly nightmarish scenarios for your language models.

After all, what good is general intelligence if it can't handle a little chaos?

-The Curator

Motivation

If your model performs well on a particular task, but not when Glitchlings are present, it's a sign that it hasn't actually generalized to the problem.

Conversely, training a model to perform well in the presence of the types of perturbations introduced by Glitchlings should help it generalize better.

Quickstart

pip install -U glitchlings

The fastest way to get started is to ask my assistant, Auggie, to prepare a custom mix of glitchlings for you:

from glitchlings import Auggie, SAMPLE_TEXT

auggie = (
    Auggie(seed=404)
    .typo(rate=0.015)
    .confusable(rate=0.01)
    .homophone(rate=0.02)
)

print(auggie(SAMPLE_TEXT))

One morning, when Gregor Samsa woke from troubld dreams, he found himself transformed in his bed into a horible vermin. He layed on his armour-like back, and if he lifted his head a little he could see his brown belly, slightly domed and divided by arches into stiff sections. The bedding was hardly able to cover it and seemed ready to slide off any moment. His many legs, pitifully thin compared with the size of the rest of him, waved about helplessly as he looked.

You're more than welcome to summon them directly, if you're feeling brave:

from glitchlings import Gaggle, SAMPLE_TEXT, Typogre, Mim1c, Ekkokin

gaggle = Gaggle(
    [
        Typogre(rate=0.015),
        Mim1c(rate=0.01),
        Ekkokin(rate=0.02),
    ],
    seed=404
)

Consult the Glitchlings Usage Guide for end-to-end instructions spanning the Python API, CLI, HuggingFace, PyTorch, and Prime Intellect integrations, and the compiled Rust pipeline (always enabled).

Your First Battle

Summon your chosen Glitchling (or a few, if ya nasty) and call it on your text or slot it into Dataset.map(...), supplying a seed if desired. Glitchlings are standard Python classes, so you can instantiate them with whatever parameters fit your scenario:

from glitchlings import Gaggle, Typogre, Mim1c

custom_typogre = Typogre(rate=0.1)
selective_mimic = Mim1c(rate=0.05, classes=["LATIN", "GREEK"])

gaggle = Gaggle([custom_typogre, selective_mimic], seed=99)
print(gaggle("Summoned heroes do not fear the glitch."))

Calling a Glitchling on a str transparently calls .corrupt(str, ...) -> str. This means that as long as your glitchlings get along logically, they play nicely with one another.
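To make the calling convention concrete, here is a minimal sketch of a callable that forwards to a `corrupt` method. `ToyGlitchling` is a hypothetical stand-in written for this illustration, not a class from the library, and its uppercase "corruption" is an arbitrary placeholder.

```python
import random
from typing import Optional


class ToyGlitchling:
    """Hypothetical stand-in; the real classes live in the glitchlings package."""

    def __init__(self, rate: float, seed: Optional[int] = None) -> None:
        self.rate = rate
        self.rng = random.Random(seed)  # private RNG, never the global one

    def corrupt(self, text: str) -> str:
        # Placeholder corruption: uppercase each character with probability `rate`.
        return "".join(
            ch.upper() if self.rng.random() < self.rate else ch for ch in text
        )

    def __call__(self, text: str) -> str:
        # Calling the instance forwards to corrupt().
        return self.corrupt(text)


first = ToyGlitchling(rate=0.3, seed=7)
second = ToyGlitchling(rate=0.3, seed=7)

# Both spellings produce the same result for the same seed.
print(first("hello glitch") == second.corrupt("hello glitch"))

# str -> str callables chain without any glue code.
print(second(first("hello glitch")))
```

Because every step takes a string and returns a string, stacking corruptions is ordinary function composition.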

When summoned as or gathered into a Gaggle, the Glitchlings will automatically order themselves into attack waves, based on the scope of the change they make:

  1. Document
  2. Paragraph
  3. Sentence
  4. Word
  5. Character

They're horrible little gremlins, but they're not unreasonable.
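The ordering rule can be sketched as a stable sort on a scope rank. The `Scope` enum and `Entry` tuple below are names invented for this illustration, and the library may apply further tie-breakers that are not shown here.

```python
from enum import IntEnum
from typing import List, NamedTuple


class Scope(IntEnum):
    # Lower values attack first, mirroring the list above.
    DOCUMENT = 1
    PARAGRAPH = 2
    SENTENCE = 3
    WORD = 4
    CHARACTER = 5


class Entry(NamedTuple):
    name: str
    scope: Scope


def attack_waves(entries: List[Entry]) -> List[Entry]:
    # sorted() is stable, so entries sharing a scope keep their given order.
    return sorted(entries, key=lambda entry: entry.scope)


roster = [
    Entry("typo", Scope.CHARACTER),
    Entry("word_swap", Scope.WORD),
    Entry("confusable", Scope.CHARACTER),
]
print([entry.name for entry in attack_waves(roster)])
# → ['word_swap', 'typo', 'confusable']
```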

Command-Line Interface (CLI)

Keyboard warriors can challenge them directly via the glitchlings command (see the generated CLI reference in docs/cli.md for the full contract):

# Discover which glitchlings are currently on the loose.
glitchlings --list
 
# Review the full CLI contract.
glitchlings --help
 
# Run Typogre against the contents of a file and inspect the diff.
glitchlings -g typogre --file documents/report.txt --diff

# Configure glitchlings inline by passing keyword arguments.
glitchlings -g "Typogre(rate=0.05)" "Ghouls just wanna have fun"

# Pipe text straight into the CLI for an on-the-fly corruption.
echo "Beware LLM-written flavor-text" | glitchlings -g mim1c

Attack Configurations

Attack configurations live in plain YAML files so you can version-control experiments without touching code:

# Load a roster from a YAML attack configuration.
glitchlings --config experiments/chaos.yaml "Let slips the glitchlings of war"

# experiments/chaos.yaml
seed: 31337
glitchlings:
  - name: Typogre
    rate: 0.04
  - "Rushmore(rate=0.12, unweighted=True)"
  - name: Zeedub
    parameters:
      rate: 0.02
      characters: ["\u200b", "\u2060"]
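The sample roster mixes three entry shapes: a name with inline parameters, a constructor-style string, and a name with a nested `parameters` map. The sketch below shows one way such entries could be normalised into `(name, parameters)` pairs once the YAML has been loaded into Python objects. `parse_spec` and `normalize` are hypothetical helpers written for this illustration; the library's own loader may work differently.

```python
import ast
from typing import Any, Dict, Mapping, Tuple, Union

Entry = Union[str, Mapping[str, Any]]


def parse_spec(spec: str) -> Tuple[str, Dict[str, Any]]:
    """Parse 'Name' or 'Name(key=literal, ...)' without executing any code."""
    node = ast.parse(spec.strip(), mode="eval").body
    if isinstance(node, ast.Name):
        return node.id, {}
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and not node.args:
        # literal_eval only accepts plain literals (numbers, strings, lists, ...).
        return node.func.id, {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    raise ValueError(f"unsupported spec: {spec!r}")


def normalize(entry: Entry) -> Tuple[str, Dict[str, Any]]:
    if isinstance(entry, str):
        return parse_spec(entry)
    params = dict(entry.get("parameters", {}))
    # Inline keys such as `rate: 0.04` sit next to `name`.
    params.update({k: v for k, v in entry.items() if k not in ("name", "parameters")})
    return entry["name"], params


# The sample file above, as it would look after YAML parsing.
config = {
    "seed": 31337,
    "glitchlings": [
        {"name": "Typogre", "rate": 0.04},
        "Rushmore(rate=0.12, unweighted=True)",
        {"name": "Zeedub", "parameters": {"rate": 0.02, "characters": ["\u200b", "\u2060"]}},
    ],
}

for entry in config["glitchlings"]:
    print(normalize(entry))
```

Parsing the string form with `ast` rather than `eval` keeps a config file from running arbitrary code.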

Attack on Token

Looking to compare before/after corruption with metrics and stable seeds? Reach for the Attack helper, which bundles tokenization, metrics, and transcript batching into a single utility.

Development

Follow the development setup guide for editable installs, automated tests, and tips on enabling the Rust pipeline while you hack on new glitchlings.

Starter 'lings

For maintainability reasons, all Glitchlings have consented to be given nicknames once they're in your care. See the Monster Manual for a complete bestiary.

Typogre

What a nice word, would be a shame if something happened to it.

Fatfinger. Typogre introduces character-level errors (duplicating, dropping, adding, or swapping) based on the layout of a keyboard (QWERTY by default, with Dvorak and Colemak variants built-in).
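As a rough illustration of the keyboard-layout idea, the sketch below substitutes a character with one of its physical neighbours. It covers only substitution (not duplicating, dropping, or swapping), and `NEIGHBORS` is a tiny hand-written slice of QWERTY adjacency assumed for this example, not the library's layout data.

```python
import random

# Hypothetical, partial QWERTY adjacency table for illustration only.
NEIGHBORS = {
    "a": "qwsz", "s": "awedxz", "d": "serfcx", "e": "wsdr",
    "o": "iklp", "r": "edft", "t": "rfgy", "n": "bhjm",
}


def fat_finger(text: str, rate: float, rng: random.Random) -> str:
    out = []
    for ch in text:
        keys = NEIGHBORS.get(ch.lower())
        # Only characters with known neighbours are eligible for a slip.
        if keys and rng.random() < rate:
            out.append(rng.choice(keys))
        else:
            out.append(ch)
    return "".join(out)


print(fat_finger("read the notes", rate=0.3, rng=random.Random(404)))
```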

Mim1c

Wait, was that...?

Confusion. Mim1c replaces non-space characters with Unicode Confusables, characters that are distinct code points but look similar enough that they would not usually trip up a human reader.
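A minimal sketch of the substitution, assuming a five-entry Latin-to-Cyrillic table chosen for this example. The real confusables data published by Unicode is far larger, and `mimic` is a hypothetical helper, not the library's function.

```python
import random

# Latin letters mapped to visually similar Cyrillic code points (illustrative subset).
CONFUSABLES = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
}


def mimic(text: str, rate: float, rng: random.Random) -> str:
    return "".join(
        CONFUSABLES[ch] if ch in CONFUSABLES and rng.random() < rate else ch
        for ch in text
    )


disguised = mimic("peace", rate=1.0, rng=random.Random(0))
print(disguised == "peace")  # → False, although it renders almost identically
print(len(disguised))        # → 5
```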

Hokey

She's soooooo coooool!

Passionista. Hokey gets a little excited and streeeeetches words for emphasis.

Apocryphal Glitchling contributed by Chloé Nunes

Scannequin

How can a computer need reading glasses?

OCR Artifacts. Scannequin mimics optical character recognition errors by swapping visually similar character sequences (like rn↔m, cl↔d, O↔0, l/I/1).
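The swaps listed above can be sketched with a single regular expression that tries longer sequences first. The table below holds only three of the pairs and `misread` is a hypothetical name; treat it as an illustration of the idea rather than the library's rule set.

```python
import random
import re

# Each pair is listed in both directions (rn↔m, cl↔d, O↔0).
SWAPS = {"rn": "m", "m": "rn", "cl": "d", "d": "cl", "O": "0", "0": "O"}

# Longer patterns first, so "rn" is matched before a lone "m" or "d".
PATTERN = re.compile(
    "|".join(sorted(map(re.escape, SWAPS), key=len, reverse=True))
)


def misread(text: str, rate: float, rng: random.Random) -> str:
    def swap(match: "re.Match[str]") -> str:
        found = match.group()
        return SWAPS[found] if rng.random() < rate else found

    return PATTERN.sub(swap, text)


print(misread("modern", rate=1.0, rng=random.Random(0)))
# → rnoclem
```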

Zeedub

Watch your step around here.

Invisible Ink. Zeedub slips zero-width codepoints between non-space character pairs, forcing models to reason about text whose visible form masks hidden glyphs.
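A small sketch of the insertion rule: walk adjacent character pairs and, when neither side is whitespace, occasionally slip in an invisible codepoint. `slip_zero_width` is a hypothetical helper, and its default characters are the two from the sample YAML configuration (ZERO WIDTH SPACE and WORD JOINER).

```python
import random
from typing import Sequence


def slip_zero_width(
    text: str,
    rate: float,
    rng: random.Random,
    characters: Sequence[str] = ("\u200b", "\u2060"),
) -> str:
    out = []
    for left, right in zip(text, text[1:]):
        out.append(left)
        # Only pairs of non-space characters are eligible.
        if not left.isspace() and not right.isspace() and rng.random() < rate:
            out.append(rng.choice(characters))
    out.append(text[-1:])  # last character, or "" for empty input
    return "".join(out)


inked = slip_zero_width("ab cd", rate=1.0, rng=random.Random(2))
print(len("ab cd"), len(inked))
# → 5 7
```

The visible text is unchanged, so only the length (or a codepoint dump) gives the hidden glyphs away.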

Ekkokin

Did you hear what I heard?

Echo Chamber. Ekkokin swaps words with curated homophones so the text still sounds right while the spelling drifts. Groups are normalised to prevent duplicates and casing is preserved when substitutions fire.
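The two details mentioned above, normalised groups and preserved casing, can be sketched as follows. The homophone groups, `build_lookup`, `match_case`, and `echo` are all assumptions made for this illustration; the library ships its own curated data.

```python
import random
import re
from typing import Dict, Iterable, List, Sequence


def build_lookup(groups: Iterable[Sequence[str]]) -> Dict[str, List[str]]:
    lookup: Dict[str, List[str]] = {}
    for group in groups:
        # Lowercase and de-duplicate, then sort so the order is deterministic.
        members = sorted({word.lower() for word in group})
        for word in members:
            lookup[word] = [other for other in members if other != word]
    return lookup


def match_case(source: str, replacement: str) -> str:
    if source.isupper():
        return replacement.upper()
    if source[:1].isupper():
        return replacement.capitalize()
    return replacement


def echo(text: str, rate: float, rng: random.Random, lookup: Dict[str, List[str]]) -> str:
    def swap(match: "re.Match[str]") -> str:
        word = match.group()
        options = lookup.get(word.lower())
        if options and rng.random() < rate:
            return match_case(word, rng.choice(options))
        return word

    return re.sub(r"[A-Za-z']+", swap, text)


lookup = build_lookup([("Their", "there", "THERE"), ("knight", "night")])
print(echo("Their knight rode on.", rate=1.0, rng=random.Random(0), lookup=lookup))
# → There night rode on.
```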

Jargoyle

Uh oh. The worst person you know just bought a thesaurus.

Sesquipedalianism. Jargoyle insufferably replaces words from selected parts of speech with synonyms at random, without regard for connotational or denotational differences.

Rushmore

I accidentally an entire word.

Tactical Scrambler. Rushmore randomly drops, duplicates, or swaps words in the text to simulate hasty writing, editing mistakes, or transmission errors.
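A toy version of the three word-level operations, assuming whitespace-separated words and an equal chance for each operation. `scramble` is a hypothetical helper; how the library weights or selects operations is not covered here.

```python
import random


def scramble(text: str, rate: float, rng: random.Random) -> str:
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        if rng.random() >= rate:
            out.append(words[i])  # leave this word alone
        else:
            action = rng.choice(("drop", "duplicate", "swap"))
            if action == "duplicate":
                out.extend([words[i], words[i]])
            elif action == "swap" and i + 1 < len(words):
                out.extend([words[i + 1], words[i]])
                i += 1  # the neighbour has been consumed too
            elif action == "swap":
                out.append(words[i])  # nothing to swap with at the end
            # "drop" appends nothing
        i += 1
    return " ".join(out)


print(scramble("the quick brown fox jumps over the lazy dog", 0.3, random.Random(99)))
```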

Redactyl

Oops, that was my black highlighter.

FOIA Reply. Redactyl obscures random words in your document like an NSA analyst with a bad sense of humor.

Apocrypha

Cave paintings and oral tradition contain many depictions of strange, otherworldly Glitchlings.
These Apocryphal Glitchlings are said to possess unique abilities or behaviors.
If you encounter one of these elusive beings, please document your findings and share them with The Curator.

Ensuring Reproducible Corruption

Every Glitchling should hold its own independent random.Random instance. That means:

  • No random.seed(...) calls touch Python's global RNG.
  • Supplying a seed when you construct a Glitchling (or when you summon(...)) makes its behavior reproducible.
  • Re-running a Gaggle with the same master seed and the same input text (and same external data!) yields identical corruption output.
  • Corruption functions are written to accept an rng parameter internally so that all randomness is centralized and testable.
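One way to picture these guarantees is a master seed that fans out into one private `random.Random` per glitchling. The hash-based derivation in `derive_seed` is an assumption made for this sketch; it is not taken from the library, which may derive child seeds differently.

```python
import hashlib
import random
from typing import List, Sequence


def derive_seed(master_seed: int, name: str, index: int) -> int:
    # Hash the master seed with the member's name and position so each
    # member gets a distinct but reproducible seed.
    payload = f"{master_seed}:{name}:{index}".encode()
    return int.from_bytes(hashlib.blake2b(payload, digest_size=8).digest(), "big")


def make_rngs(master_seed: int, names: Sequence[str]) -> List[random.Random]:
    return [
        random.Random(derive_seed(master_seed, name, index))
        for index, name in enumerate(names)
    ]


global_state = random.getstate()

first_run = [rng.random() for rng in make_rngs(404, ["typo", "confusable"])]
second_run = [rng.random() for rng in make_rngs(404, ["typo", "confusable"])]

print(first_run == second_run)               # → True: same master seed, same draws
print(random.getstate() == global_state)     # → True: the global RNG was never touched
```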

At Wits' End?

If you're trying to add a new glitchling and can't seem to make it deterministic, here are some places to look for determinism-breaking code:

  1. Search for direct calls to random.choice or random.shuffle that bypass the provided rng, and for any code that relies on set(...) iteration order.
  2. Ensure you sort collections before shuffling or sampling.
  3. Make sure indices are chosen from a stable reference (e.g., original text) when applying length‑changing edits.
  4. Make sure sorts use enough keys to break ties, so the resulting order is stable.
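The checklist above can be sketched in a few lines. The helper names (`pick_targets`, `rank_words`, `apply_edits`) are hypothetical and exist only to illustrate each point.

```python
import random
from typing import Iterable, List, Sequence, Tuple


def pick_targets(candidates: Iterable[str], k: int, rng: random.Random) -> List[str]:
    # Set iteration order for strings can differ between processes,
    # so sort before sampling and always draw from the provided rng.
    return rng.sample(sorted(candidates), k)


def rank_words(words: Iterable[str]) -> List[str]:
    # Length alone leaves ties; the word itself is the tie-breaking key.
    return sorted(words, key=lambda word: (len(word), word))


def apply_edits(text: str, edits: Sequence[Tuple[int, str]]) -> str:
    # Indices refer to the ORIGINAL text. Applying them right to left keeps
    # earlier indices valid even when an edit changes the length.
    chars = list(text)
    for index, replacement in sorted(edits, key=lambda edit: edit[0], reverse=True):
        chars[index:index + 1] = replacement
    return "".join(chars)


from_set = pick_targets({"b", "a", "c"}, 2, random.Random(3))
from_list = pick_targets(["c", "a", "b"], 2, random.Random(3))
print(from_set == from_list)                      # → True
print(rank_words({"bb", "aa", "c"}))              # → ['c', 'aa', 'bb']
print(apply_edits("cat", [(0, "kk"), (2, "")]))   # → kka
```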


