Monsters for your language games.

     .─') _                                       .─') _                  
    (  OO) )                                     ( OO ) )            
  ░██████  ░██ ░██   ░██               ░██        ░██ ░██                                 
 ░██   ░██ ░██       ░██                ░██        ░██                                     
░██        ░██ ░██░████████  ░███████   ░████████  ░██ ░██░████████   ░████████ ░███████  
░██  █████ ░██ ░██   ░██    ░██('─.░██ ░██    ░██ ░██ ░██░██    ░██ ░██.─')░██ ░██        
░██     ██ ░██ ░██   ░██    ░██( OO ) ╱░██    ░██ ░██ ░██░██    ░██ ░██(OO)░██ ░███████  
  ░██  ░███ ░██ ░██   ░██    ░██    ░██ ░██    ░██ ░██ ░██░██    ░██ ░██ o ░███      ░██ 
  ░█████░█ ░██ ░██   ░████   ░███████  ░██    ░██ ░██ ░██░██    ░██  ░█████░██ ░███████  
                                                                          ░██            
                                                                  ░███████             

                        Every language game breeds monsters.

Glitchlings are utilities for corrupting the text inputs to your language models in deterministic, linguistically principled ways.
Each embodies a different way that documents can be compromised in the wild.

If reinforcement learning environments are games, then Glitchlings are enemies to breathe new life into old challenges.

They do this by breaking surface patterns in the input while keeping the target output intact.

Some Glitchlings are petty nuisances. Some Glitchlings are eldritch horrors.
Together, they create truly nightmarish scenarios for your language models.

After all, what good is general intelligence if it can't handle a little chaos?

-The Curator

Motivation

If your model performs well on a particular task, but not when Glitchlings are present, it's a sign that it hasn't actually generalized to the problem.

Conversely, training a model to perform well in the presence of the types of perturbations introduced by Glitchlings should help it generalize better.

Quickstart

pip install -U glitchlings

The fastest way to get started is to ask my assistant, Auggie, to prepare a custom mix of glitchlings for you:

from glitchlings import Auggie, SAMPLE_TEXT

auggie = (
    Auggie(seed=404)
    .typo(rate=0.015)
    .confusable(rate=0.01)
    .homophone(rate=0.02)
)

print(auggie(SAMPLE_TEXT))

One morning, when Gregor Samsa woke from troubld dreams, he found himself transformed in his bed into a horible vermin. He layed on his armour-like back, and if he lifted his head a little he could see his brown belly, slightly domed and divided by arches into stiff sections. The bedding was hardly able to cover it and seemed ready to slide off any moment. His many legs, pitifully thin compared with the size of the rest of him, waved about helplessly as he looked.

You're more than welcome to summon them directly, if you're feeling brave:

from glitchlings import Gaggle, SAMPLE_TEXT, Typogre, Mim1c, Ekkokin

gaggle = Gaggle(
    [
        Typogre(rate=0.015),
        Mim1c(rate=0.01),
        Ekkokin(rate=0.02),
    ],
    seed=404,
)

print(gaggle(SAMPLE_TEXT))

Consult the Glitchlings Usage Guide for end-to-end instructions spanning the Python API, CLI, HuggingFace, PyTorch, and Prime Intellect integrations, and the compiled Rust pipeline (always enabled).

Your First Battle

Summon your chosen Glitchling (or a few, if ya nasty) and call it on your text or slot it into Dataset.map(...), supplying a seed if desired. Glitchlings are standard Python classes, so you can instantiate them with whatever parameters fit your scenario:

from glitchlings import Gaggle, Typogre, Mim1c

custom_typogre = Typogre(rate=0.1)
selective_mimic = Mim1c(rate=0.05, classes=["LATIN", "GREEK"])

gaggle = Gaggle([custom_typogre, selective_mimic], seed=99)
print(gaggle("Summoned heroes do not fear the glitch."))

Calling a Glitchling on a str transparently calls .corrupt(str, ...) -> str. This means that as long as your glitchlings get along logically, they play nicely with one another.

When summoned as, or gathered into, a Gaggle, Glitchlings automatically order themselves into attack waves based on the scope of the changes they make:

  1. Document
  2. Paragraph
  3. Sentence
  4. Word
  5. Character

They're horrible little gremlins, but they're not unreasonable.
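
The wave ordering above can be pictured as a stable sort on scope. This is a minimal sketch, not the library's scheduler: the Scope enum, the order_into_waves helper, and the placeholder roster names are all hypothetical, and the real Gaggle may use extra tie-breakers.

```python
from enum import IntEnum


class Scope(IntEnum):
    """Hypothetical scope levels, widest first, mirroring the list above."""
    DOCUMENT = 1
    PARAGRAPH = 2
    SENTENCE = 3
    WORD = 4
    CHARACTER = 5


def order_into_waves(roster):
    """Sort (name, scope) pairs so wider-scoped edits run first.

    sorted() is stable, so entries sharing a scope keep their given order.
    """
    return sorted(roster, key=lambda entry: entry[1])


# Placeholder names only; these are not real glitchlings.
roster = [
    ("char_level", Scope.CHARACTER),
    ("word_level", Scope.WORD),
    ("doc_level", Scope.DOCUMENT),
    ("other_word_level", Scope.WORD),
]
waves = [name for name, _ in order_into_waves(roster)]
```

Running wide edits first means a character-level change never has to guess where a word it depended on went.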

Command-Line Interface (CLI)

Keyboard warriors can challenge them directly via the glitchlings command (see the generated CLI reference in docs/cli.md for the full contract):

# Discover which glitchlings are currently on the loose.
glitchlings --list
 
# Review the full CLI contract.
glitchlings --help
 
# Run Typogre against the contents of a file and inspect the diff.
glitchlings -g typogre --file documents/report.txt --diff

# Configure glitchlings inline by passing keyword arguments.
glitchlings -g "Typogre(rate=0.05)" "Ghouls just wanna have fun"

# Pipe text straight into the CLI for an on-the-fly corruption.
echo "Beware LLM-written flavor-text" | glitchlings -g mim1c

Attack Configurations

Attack configurations live in plain YAML files so you can version-control experiments without touching code:

# Load a roster from a YAML attack configuration.
glitchlings --config experiments/chaos.yaml "Let slips the glitchlings of war"

# experiments/chaos.yaml
seed: 31337
glitchlings:
  - name: Typogre
    rate: 0.04
  - "Rushmore(rate=0.12, unweighted=True)"
  - name: Zeedub
    parameters:
      rate: 0.02
      characters: ["\u200b", "\u2060"]
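
The roster above mixes three entry shapes: a mapping with inline parameters, a call-style string, and a mapping with a nested parameters block. Here is a rough sketch of how such entries could be normalised into (name, kwargs) pairs. It is not the library's loader: parse_spec and normalise_entry are hypothetical helpers, and the dict stands in for whatever a YAML parser would return.

```python
import ast


def parse_spec(spec):
    """Split 'Name(key=value, ...)' into (name, kwargs) without eval()."""
    node = ast.parse(spec, mode="eval").body
    if isinstance(node, ast.Name):  # bare name, e.g. "typogre"
        return node.id, {}
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and not node.args:
        # literal_eval only accepts plain literals, so arbitrary code is rejected.
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        return node.func.id, kwargs
    raise ValueError(f"unsupported glitchling spec: {spec!r}")


def normalise_entry(entry):
    """Accept a spec string, or a mapping with inline or nested parameters."""
    if isinstance(entry, str):
        return parse_spec(entry)
    params = dict(entry.get("parameters", {}))
    params.update({k: v for k, v in entry.items() if k not in ("name", "parameters")})
    return entry["name"], params


# Stand-in for the parsed contents of experiments/chaos.yaml.
config = {
    "seed": 31337,
    "glitchlings": [
        {"name": "Typogre", "rate": 0.04},
        "Rushmore(rate=0.12, unweighted=True)",
        {"name": "Zeedub", "parameters": {"rate": 0.02, "characters": ["\u200b", "\u2060"]}},
    ],
}
roster = [normalise_entry(entry) for entry in config["glitchlings"]]
```

The same parse_spec idea would also cover inline CLI arguments such as "Typogre(rate=0.05)".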

Attack on Token

Looking to compare before/after corruption with metrics and stable seeds? Reach for the Attack helper, which bundles tokenization, metrics, and transcript batching into a single utility.

Development

Follow the development setup guide for editable installs, automated tests, and tips on enabling the Rust pipeline while you hack on new glitchlings.

Starter 'lings

For maintainability reasons, all Glitchlings have consented to be given nicknames once they're in your care. See the Monster Manual for a complete bestiary.

Typogre

What a nice word, would be a shame if something happened to it.

Fatfinger. Typogre introduces character-level errors (duplicating, dropping, adding, or swapping) based on the layout of a keyboard (QWERTY by default, with Dvorak and Colemak variants built-in).
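
To make the keyboard-layout idea concrete, here is a toy sketch that covers only one of the four error types: substituting a neighbouring key. The QWERTY_NEIGHBOURS table and fat_finger function are illustrative assumptions, not Typogre's actual data or algorithm.

```python
import random

# Tiny hand-written slice of QWERTY adjacency; a full layout map covers every key.
QWERTY_NEIGHBOURS = {
    "a": "qwsz", "e": "wsdr", "i": "ujko", "o": "iklp",
    "n": "bhjm", "r": "edft", "s": "awedxz", "t": "rfgy",
}


def fat_finger(text, rate, rng):
    """Replace roughly `rate` of the mapped letters with a neighbouring key."""
    out = []
    for ch in text:
        neighbours = QWERTY_NEIGHBOURS.get(ch.lower())
        if neighbours and rng.random() < rate:
            slip = rng.choice(neighbours)
            out.append(slip.upper() if ch.isupper() else slip)
        else:
            out.append(ch)
    return "".join(out)


sample = "What a nice word, would be a shame if something happened to it."
once = fat_finger(sample, rate=0.2, rng=random.Random(404))
again = fat_finger(sample, rate=0.2, rng=random.Random(404))
```

Because the generator is passed in rather than shared, the same seed reproduces the same slips.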

Mim1c

Wait, was that...?

Confusion. Mim1c replaces non-space characters with Unicode Confusables, characters that are distinct but would not usually confuse a human reader.
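
A hedged illustration of the idea, using four well-known lookalike pairs rather than the full Unicode confusables data. The LOOKALIKES table and the mimic function are assumptions made for this sketch, not Mim1c's implementation.

```python
import random

# A few familiar lookalikes; Unicode's confusables data lists many more.
LOOKALIKES = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u03bf",  # GREEK SMALL LETTER OMICRON
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
}


def mimic(text, rate, rng):
    """Swap roughly `rate` of the mapped characters for a lookalike codepoint."""
    return "".join(
        LOOKALIKES[ch] if ch in LOOKALIKES and rng.random() < rate else ch
        for ch in text
    )


plain = "Wait, was that...?"
disguised = mimic(plain, rate=1.0, rng=random.Random(7))
# `disguised` renders much like `plain`, but its three a's are now Cyrillic.
```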

Hokey

She's soooooo coooool!

Passionista. Hokey gets a little excited and streeeeetches words for emphasis.

Apocryphal Glitchling contributed by Chloé Nunes

Scannequin

How can a computer need reading glasses?

OCR Artifacts. Scannequin mimics optical character recognition errors by swapping visually similar character sequences (like rn↔m, cl↔d, O↔0, l/I/1).
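
As a rough sketch of sequence-level swapping, the snippet below uses three of the confusion pairs named above (rn↔m, cl↔d, O↔0) and leaves out the three-way l/I/1 group. OCR_SWAPS and misread are hypothetical names; Scannequin's real tables and selection logic may differ.

```python
import random
import re

# Confusion pairs from the description above, usable in either direction.
OCR_SWAPS = {"rn": "m", "m": "rn", "cl": "d", "d": "cl", "O": "0", "0": "O"}
# Longer patterns first, so "rn" and "cl" win over single characters.
PATTERN = re.compile("|".join(sorted(map(re.escape, OCR_SWAPS), key=len, reverse=True)))


def misread(text, rate, rng):
    """Swap roughly `rate` of the matched sequences for their lookalike."""
    def swap(match):
        found = match.group(0)
        return OCR_SWAPS[found] if rng.random() < rate else found
    return PATTERN.sub(swap, text)


clean = "The modern clock read 10:00."
garbled = misread(clean, rate=1.0, rng=random.Random(3))
# With rate=1.0 every match flips: "The rnoclem dock reacl 1O:OO."
```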

Zeedub

Watch your step around here.

Invisible Ink. Zeedub slips zero-width codepoints between non-space character pairs, forcing models to reason about text whose visible form masks hidden glyphs.
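
A small sketch of the insertion rule as described: only gaps between two non-space characters are eligible. The slip_zero_width function is an assumption for illustration; the two default codepoints are the ones used in the attack configuration example.

```python
import random

ZERO_WIDTH = ["\u200b", "\u2060"]  # ZERO WIDTH SPACE, WORD JOINER


def slip_zero_width(text, rate, rng, characters=ZERO_WIDTH):
    """Insert a zero-width codepoint between adjacent non-space characters."""
    out = []
    for left, right in zip(text, text[1:]):
        out.append(left)
        if not left.isspace() and not right.isspace() and rng.random() < rate:
            out.append(rng.choice(characters))
    out.append(text[-1:])  # final character (empty string for empty input)
    return "".join(out)


visible = "Watch your step around here."
laced = slip_zero_width(visible, rate=0.5, rng=random.Random(11))
# Stripping the hidden codepoints recovers the original text exactly.
stripped = "".join(ch for ch in laced if ch not in ZERO_WIDTH)
```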

Ekkokin

Did you hear what I heard?

Echo Chamber. Ekkokin swaps words with curated homophones so the text still sounds right while the spelling drifts. Groups are normalised to prevent duplicates and casing is preserved when substitutions fire.
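
One way to picture homophone swapping with casing preserved. The GROUPS list, match_case, and echo are all hypothetical; Ekkokin's curated groups and casing rules are not reproduced here.

```python
import random
import re

# Hypothetical homophone groups; tuples keep candidate order stable.
GROUPS = [("hear", "here"), ("their", "there"), ("knight", "night")]
LOOKUP = {word: group for group in GROUPS for word in group}


def match_case(template, word):
    """Copy simple casing (UPPER or Capitalised) from the original word."""
    if template.isupper():
        return word.upper()
    if template[0].isupper():
        return word.capitalize()
    return word


def echo(text, rate, rng):
    """Swap roughly `rate` of the known words for another member of their group."""
    def swap(match):
        original = match.group(0)
        group = LOOKUP.get(original.lower())
        if group is None or rng.random() >= rate:
            return original
        options = [w for w in group if w != original.lower()]
        return match_case(original, rng.choice(options))
    return re.sub(r"[A-Za-z]+", swap, text)


echoed = echo("Did you HEAR what I heard? Here, at night?", rate=1.0, rng=random.Random(5))
# With rate=1.0: "Did you HERE what I heard? Hear, at knight?"
```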

Jargoyle

Uh oh. The worst person you know just bought a thesaurus.

Sesquipedalianism. Jargoyle insufferably replaces words from selected parts of speech with synonyms at random, without regard for connotational or denotational differences.

Rushmore

I accidentally an entire word.

Tactical Scrambler. Rushmore randomly drops, duplicates, or swaps words in the text to simulate hasty writing, editing mistakes, or transmission errors.
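
A toy version of the three word-level operations, assuming a single shared rate and an even choice between actions. Rushmore's real parameters (such as the unweighted flag seen in the attack configuration) are not modelled, and this sketch normalises whitespace to single spaces.

```python
import random


def rush(text, rate, rng):
    """Apply one of drop / duplicate / swap to roughly `rate` of the words."""
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        if rng.random() < rate:
            action = rng.choice(["drop", "duplicate", "swap"])
            if action == "duplicate":
                out.extend([words[i], words[i]])
            elif action == "swap" and i + 1 < len(words):
                out.extend([words[i + 1], words[i]])
                i += 1  # the neighbour was consumed by the swap
            elif action == "swap":
                out.append(words[i])  # last word has nothing to swap with
            # "drop" appends nothing
        else:
            out.append(words[i])
        i += 1
    return " ".join(out)


sample = "I accidentally wrote an entire sentence in a hurry"
hurried = rush(sample, rate=0.3, rng=random.Random(42))
```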

Redactyl

Oops, that was my black highlighter.

FOIA Reply. Redactyl obscures random words in your document like an NSA analyst with a bad sense of humor.

Apocrypha

Cave paintings and oral tradition contain many depictions of strange, otherworldly Glitchlings.
These Apocryphal Glitchlings are said to possess unique abilities or behaviors.
If you encounter one of these elusive beings, please document your findings and share them with The Curator.

Ensuring Reproducible Corruption

Every Glitchling should hold its own independent random.Random instance. That means:

  • No random.seed(...) calls touch Python's global RNG.
  • Supplying a seed when you construct a Glitchling (or when you summon(...)) makes its behavior reproducible.
  • Re-running a Gaggle with the same master seed and the same input text (and same external data!) yields identical corruption output.
  • Corruption functions are written to accept an rng parameter internally so that all randomness is centralized and testable.
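
The bullets above boil down to a simple pattern, sketched here with stand-in names. Gremlin and derive_seed are hypothetical and do not reflect how Gaggle actually derives per-member seeds; the point is only that private generators leave Python's global RNG alone.

```python
import random


class Gremlin:
    """Hypothetical stand-in for a glitchling that owns a private generator."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)  # never the module-level random functions

    def corrupt(self, text):
        # Upper-case roughly 30% of characters, drawing only from self.rng.
        return "".join(ch.upper() if self.rng.random() < 0.3 else ch for ch in text)


def derive_seed(master_seed, index):
    """One simple way to hand each gaggle member its own stream."""
    return random.Random(f"{master_seed}:{index}").getrandbits(64)


state_before = random.getstate()
first = Gremlin(seed=derive_seed(404, 0)).corrupt("independent streams")
second = Gremlin(seed=derive_seed(404, 0)).corrupt("independent streams")
# Private generators leave the global RNG state exactly as it was.
global_untouched = random.getstate() == state_before
```

Same master seed, same index, same text: same corruption, with no side effects on anyone else's randomness.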

At Wits' End?

If you're trying to add a new glitchling and can't seem to make it deterministic, here are some places to look for determinism-breaking code:

  1. Search for direct calls to random.choice or random.shuffle that bypass the provided rng, and for any reliance on set(...) iteration order.
  2. Ensure you sort collections before shuffling or sampling.
  3. Make sure indices are chosen from a stable reference (e.g., original text) when applying length‑changing edits.
  4. Make sure sort keys include enough tie-breakers that equally ranked items always land in the same order.
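
The checklist above maps onto a few small habits, sketched below. The helper names are made up for illustration and are not part of the glitchlings codebase.

```python
import random


def pick_targets(candidates, k, rng):
    """Sample k items reproducibly from an unordered collection."""
    # Set iteration order for strings can vary between interpreter runs
    # (hash randomisation), so sort before handing the pool to the rng.
    pool = sorted(candidates)
    return rng.sample(pool, k)


def drop_characters(text, k, rng):
    """Delete k characters using indices chosen against the original text."""
    doomed = set(rng.sample(range(len(text)), k))
    # Building the result in one pass avoids index drift from earlier deletions.
    return "".join(ch for i, ch in enumerate(text) if i not in doomed)


def rank_by_length(words):
    """Length alone leaves ties; adding the word itself makes the order total."""
    return sorted(set(words), key=lambda w: (-len(w), w))


monsters = {"ghoul", "imp", "wraith", "ogre", "troll"}
picked = pick_targets(monsters, 2, random.Random(9))
ranked = rank_by_length(["troll", "ghoul", "imp", "ogre"])
shortened = drop_characters("glitchling", 3, random.Random(9))
```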
