Read, annotate, train and decrypt text captchas with a CRNN+CTC model.

Maintainers

jtrecenti

txtcaptcha

Read, annotate, train and decrypt text captchas in images with a modern CRNN + CTC pipeline in PyTorch.

txtcaptcha ships:

- a CRNN architecture that handles arbitrary input sizes and variable-length labels,
- the full alphanumeric vocabulary 0-9a-zA-Z (62 classes + CTC blank),
- decode-time masking so a single trained model can be restricted per site (e.g. mask="[0-9]"),
- fixed-length decoding via length=N for sites with a known length,
- a pretrained unified model hosted on the Hugging Face Hub with ~89% captcha-level accuracy across ten Brazilian court captcha datasets.

Installation

pip install txtcaptcha

Or from source with uv:

git clone https://github.com/jtrecenti/txtcaptcha
cd txtcaptcha
uv sync --extra dev

Quick start

The first decrypt call downloads the pretrained model from the Hugging Face Hub into ~/.cache/huggingface/hub; subsequent calls reuse the cached copy and need no download.

from txtcaptcha import read_captcha, decrypt

cap = read_captcha("path/to/captcha.png")
print(decrypt(cap))                          # greedy, variable length
print(decrypt(cap, mask="[0-9]"))            # digits only
print(decrypt(cap, length=5))                # force exactly 5 chars
print(decrypt(cap, mask=list("abcdef0123"))) # explicit allowed set

Pin a specific release or load a different Hub repo explicitly:

from txtcaptcha import decrypt, from_pretrained

# `cap` is the Captcha object loaded with read_captcha() in the previous snippet.
model = from_pretrained("jtrecenti/txtcaptcha-crnn", revision="v0.1.0")
print(decrypt(cap, model=model))

Training your own model

from txtcaptcha import fit_model, save_model, download_dataset

data_dir = download_dataset("tjmg", "data")
model, history = fit_model(
    data_dir,
    epochs=30,
    batch_size=64,
    case_sensitive=False,
)
save_model(model, "tjmg.pt")
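The Public API table describes training folders as collections of `<id>_<label>.<ext>` files. A minimal sketch of how a label could be recovered from such a filename is shown here; `label_from_filename` and the example path are hypothetical and are not part of txtcaptcha.

```python
from pathlib import Path


def label_from_filename(path: str) -> str:
    """Return the label part of an `<id>_<label>.<ext>` filename (hypothetical helper)."""
    stem = Path(path).stem                    # drop the directory and the extension
    _id, sep, label = stem.rpartition("_")    # split on the LAST underscore
    if not sep or not label:
        raise ValueError(f"no label found in {path!r}")
    return label


# Example path is made up for illustration.
print(label_from_filename("data/tjmg/00042_x7kq2.png"))
```

Splitting on the last underscore keeps the sketch working even if the id part itself contains underscores; labels are assumed to be purely alphanumeric.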

Publishing your own model to the Hub

from txtcaptcha import push_to_hub

# `model` is the trained CRNN returned by fit_model() in the previous section.
push_to_hub(
    model,
    repo_id="your-username/your-captcha-model",
    model_card="# My captcha model\n\nTrained on ...",
    tag="v0.1.0",
)

Public API

| Function | Purpose |
| --- | --- |
| `read_captcha(files, lab_in_path=False)` | Load image(s) into a `Captcha` object. |
| `Captcha` | Container with `images`, `labels`, `paths`, `plot()`. |
| `annotate(files, labels=None, ...)` | Interactive/batch labeling (filename convention). |
| `CaptchaDataset(root, vocab, height, case_sensitive)` | PyTorch dataset over a folder of `<id>_<label>.<ext>` files. |
| `transform_image(files, height=32)` | Load + resize + width-pad for batching. |
| `encode_label`, `decode_indices` | Vocab ↔ tensor (CTC blank index 0). |
| `pad_collate` | DataLoader collate fn for variable-width batching. |
| `CRNN(vocab, ...)` | CNN + BiLSTM + linear head. |
| `fit_model(dir, ...)` | Training loop with CTC loss + early stopping. |
| `decrypt(files, model=None, mask=None, case_sensitive=True, length=None)` | Predict labels; auto-downloads the pretrained model when `model=None`. |
| `save_model`, `load_model` | Local checkpoint persistence. |
| `from_pretrained`, `save_pretrained`, `push_to_hub` | Hugging Face Hub integration. |
| `download_dataset`, `available_datasets` | Fetch labeled training datasets. |
| `download_captchas` (CLI) | Download live, unlabeled captchas from 10 Brazilian sources. |
| `sequence_accuracy(preds, targets)` | Exact-match accuracy metric. |

Full API reference: https://jtrecenti.github.io/txtcaptcha/.
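To make the "Vocab ↔ tensor (CTC blank index 0)" entry concrete, here is a small sketch of a label codec over the 62-character vocabulary. It uses plain lists instead of tensors, and the character ordering (digits, lowercase, uppercase) is an assumption; `encode_text` and `decode_text` are hypothetical stand-ins, not the library's `encode_label` and `decode_indices`.

```python
import string

# Assumed ordering: the README only states the vocabulary is 0-9a-zA-Z.
VOCAB = string.digits + string.ascii_lowercase + string.ascii_uppercase
BLANK = 0  # index 0 is reserved for the CTC blank

# Characters are shifted by one so that they never collide with the blank.
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(VOCAB)}


def encode_text(label: str) -> list[int]:
    """Map each character to its 1-based vocabulary index."""
    return [CHAR_TO_INDEX[ch] for ch in label]


def decode_text(indices: list[int]) -> str:
    """Map indices back to characters, skipping blanks."""
    return "".join(VOCAB[i - 1] for i in indices if i != BLANK)


print(len(VOCAB) + 1)       # number of output classes including the blank
print(encode_text("a1B"))
```

With this layout the model's output dimension is `len(VOCAB) + 1`, which matches the "62 classes + CTC blank" count given above.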

Architecture

CRNN is a Convolutional Recurrent Neural Network:

- CNN backbone: ResNet-style basic blocks (64 → 128 → 256 → 256 channels) with strided pooling. It down-samples height by 8 but width only by 4, preserving width resolution for the sequence dimension.
- Adaptive pool: collapses the remaining height to 1, producing a width-indexed sequence of feature vectors.
- BiLSTM: 2-layer bidirectional LSTM (hidden 256).
- Linear head: projects to len(vocab) + 1 logits per timestep (the extra slot is the CTC blank).
- CTC loss: handles variable-length targets, with no per-position softmax.

Variable image dimensions are handled by resizing every image to height 32 at load time, scaling the width to preserve the aspect ratio, and padding widths within each batch via pad_collate. CRNN+CTC is the de facto baseline for short-text scene-text recognition: it is lighter than transformer OCR (e.g. TrOCR) and consistently strong on short captcha images.
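The resize-and-pad arithmetic can be illustrated without any image library. The sketch below computes, for a batch of (width, height) sizes, the resized width, the padding needed to reach the widest image, and the number of timesteps that cover the unpadded image. The rounding rules, right-side padding, and the `batch_plan` helper are assumptions made for illustration, not a description of `transform_image` or `pad_collate`.

```python
import math

TARGET_HEIGHT = 32  # stated in the README
WIDTH_STRIDE = 4    # the backbone down-samples width by 4


def resized_width(width: int, height: int) -> int:
    """Width after scaling to TARGET_HEIGHT while keeping the aspect ratio."""
    return max(1, round(width * TARGET_HEIGHT / height))


def batch_plan(sizes: list[tuple[int, int]]) -> list[dict]:
    """Per-image resized width, padding to the batch width, and valid timesteps."""
    widths = [resized_width(w, h) for w, h in sizes]
    batch_width = max(widths)
    return [
        {
            "width": w,
            "pad_right": batch_width - w,                     # assumed: pad on the right
            "valid_timesteps": math.ceil(w / WIDTH_STRIDE),   # assumed: ceil rounding
        }
        for w in widths
    ]


for row in batch_plan([(180, 50), (120, 40), (200, 64)]):
    print(row)
```

The point of the exercise is that the sequence length T grows with image width, which is why wider captchas can carry longer labels without any architectural change.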

Variable-length labels

CRNN + CTC handles variable label lengths natively. The network emits T logit vectors per image, one per timestep; CTC collapsing (merge consecutive repeats, then remove blanks) turns any path into a string of length between 0 and T. Training can therefore mix 4-char and 5-char labels in the same batch, with no length head and no padding tokens.
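The collapsing rule is short enough to write out. The following sketch applies it to a path of class indices and builds a greedy decoder on top of it by taking the argmax at every timestep. It works on plain Python lists and is only an illustration of the rule, not the code used by `decrypt`.

```python
from itertools import groupby

BLANK = 0  # CTC blank index


def ctc_collapse(path: list[int]) -> list[int]:
    """Merge consecutive repeats, then drop blanks."""
    merged = [idx for idx, _ in groupby(path)]
    return [idx for idx in merged if idx != BLANK]


def greedy_decode(logits: list[list[float]]) -> list[int]:
    """Pick the highest-scoring class per timestep, then collapse the path."""
    path = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    return ctc_collapse(path)


# A blank between two identical indices keeps both of them.
print(ctc_collapse([0, 3, 3, 0, 3, 5, 5, 0]))
```

Note that a repeated character in the label can only be produced if a blank separates its two occurrences in the path.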

The downside of greedy CTC is that a confident wrong timestep can yield a prediction of the wrong length. When you know the expected length, pass length= to switch to an exact dynamic-programming search over CTC paths that collapse to exactly that many characters:

decrypt(cap)                       # greedy
decrypt(cap, length=5)             # force 5 chars
decrypt(cap, length=4, mask="[0-9]")  # combine with masking

The DP runs in O(T · L · |vocab|) per image: it tracks the best path for every (collapsed_count, last_index) state and reconstructs the argmax by backtracking. When the true length is known it is at least as accurate as greedy decoding, and it never emits a prediction of the wrong length.
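A length-constrained search of this kind can be sketched in pure Python. The version below keeps the same (collapsed_count, last_index) state but loops over every previous symbol, so it costs O(T · L · |vocab|²) rather than the bound quoted above; tracking the best and runner-up score per count would remove the extra factor. It is an illustrative sketch under those assumptions, with a hypothetical function name, and not the library's implementation.

```python
import math

BLANK = 0
NEG_INF = -math.inf


def decode_fixed_length(log_probs: list[list[float]], length: int) -> list[int]:
    """Best CTC path whose collapsed output has exactly `length` symbols."""
    n_sym = len(log_probs[0])
    # score[c][k]: best log-score of a path prefix that ends in symbol k
    # at the current frame and has emitted c characters so far.
    score = [[NEG_INF] * n_sym for _ in range(length + 1)]
    for k in range(n_sym):
        c = 0 if k == BLANK else 1
        if c <= length:
            score[c][k] = log_probs[0][k]
    back = []  # one table of (previous count, previous symbol) per later frame

    for frame in log_probs[1:]:
        new = [[NEG_INF] * n_sym for _ in range(length + 1)]
        ptr = [[None] * n_sym for _ in range(length + 1)]
        for c in range(length + 1):
            for j in range(n_sym):
                prev = score[c][j]
                if prev == NEG_INF:
                    continue
                for k in range(n_sym):
                    # A blank, or a repeat of the previous symbol, adds no character.
                    c2 = c if (k == BLANK or k == j) else c + 1
                    if c2 > length:
                        continue
                    cand = prev + frame[k]
                    if cand > new[c2][k]:
                        new[c2][k] = cand
                        ptr[c2][k] = (c, j)
        score = new
        back.append(ptr)

    k = max(range(n_sym), key=lambda s: score[length][s])
    if score[length][k] == NEG_INF:
        raise ValueError("no path collapses to the requested length")

    # Walk the back-pointers to recover the full frame-level path.
    c, path = length, [k]
    for ptr in reversed(back):
        c, k = ptr[c][k]
        path.append(k)
    path.reverse()

    # Collapse: keep a symbol when it differs from the previous frame and is not blank.
    out, prev_sym = [], None
    for s in path:
        if s != prev_sym and s != BLANK:
            out.append(s)
        prev_sym = s
    return out


probs = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.5, 0.1, 0.4]]
log_probs = [[math.log(p) for p in frame] for frame in probs]
print(decode_fixed_length(log_probs, 2))
```

In this toy input the per-frame argmax path is symbol 1 followed by two blanks, so greedy decoding yields a single character, while the constrained search is forced to find the best two-character reading.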

Decode-time masking

decrypt(..., mask=...) suppresses the logits of forbidden vocabulary entries before CTC decoding, so the same trained model can be specialized per site:

decrypt(cap, mask=["a", "b", "c", "1", "2", "3"])  # explicit list
decrypt(cap, mask="[0-9a-z]")                       # regex char-class
decrypt(cap, mask="[A-Z]", case_sensitive=True)     # uppercase only
decrypt(cap, mask="[a-z]", case_sensitive=False)    # output lowercased
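One way such a mask could work is sketched below: the mask (a regex character class or an explicit list) is turned into a set of allowed class indices, and every other non-blank logit is set to negative infinity so it can never win an argmax. The vocabulary ordering, the use of negative infinity, and the helper names `allowed_indices` and `apply_mask` are assumptions for illustration; the README does not specify how the library implements masking.

```python
import math
import re
import string

# Assumed ordering of the 62-character vocabulary; index 0 is the CTC blank.
VOCAB = string.digits + string.ascii_lowercase + string.ascii_uppercase
BLANK = 0


def allowed_indices(mask) -> set[int]:
    """Class indices (1-based, blank excluded) of the characters `mask` permits."""
    if isinstance(mask, str):
        pattern = re.compile(mask)  # e.g. "[0-9]" is matched against single characters
        chars = {ch for ch in VOCAB if pattern.fullmatch(ch)}
    else:
        chars = set(mask)           # explicit list of allowed characters
    return {i + 1 for i, ch in enumerate(VOCAB) if ch in chars}


def apply_mask(logits: list[list[float]], mask) -> list[list[float]]:
    """Return a copy of `logits` with forbidden classes set to -inf."""
    keep = allowed_indices(mask) | {BLANK}  # the blank must always stay available
    return [
        [x if k in keep else -math.inf for k, x in enumerate(frame)]
        for frame in logits
    ]


print(sorted(allowed_indices("[0-9]")))
```

Keeping the blank unmasked matters: without it the decoder could not separate repeated characters or leave empty timesteps.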

Notebooks

- notebooks/train_unified_model.ipynb: downloads every dataset, merges them and trains the unified CRNN. Designed for a cloud GPU machine.
- notebooks/eval_per_dataset.ipynb: per-dataset accuracy on a held-out split.
- notebooks/eval_per_dataset_live.ipynb: predictions on freshly downloaded, unlabeled captchas (overfit check).
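Per-dataset, captcha-level accuracy is an exact-match metric: a prediction counts only if the whole string equals the label. A minimal sketch of that computation follows; `exact_match_accuracy` and `accuracy_by_dataset` are hypothetical helpers written for this example, and the sample rows are made up rather than taken from the notebooks.

```python
from collections import defaultdict


def exact_match_accuracy(preds: list[str], targets: list[str]) -> float:
    """Fraction of predictions that equal their target string exactly."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not targets:
        return 0.0
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)


def accuracy_by_dataset(rows) -> dict[str, float]:
    """rows: iterable of (dataset_name, prediction, target) triples."""
    grouped = defaultdict(lambda: ([], []))
    for dataset, pred, target in rows:
        grouped[dataset][0].append(pred)
        grouped[dataset][1].append(target)
    return {name: exact_match_accuracy(p, t) for name, (p, t) in grouped.items()}


# Illustrative rows only; not real evaluation results.
rows = [("tjmg", "ab12", "ab12"), ("tjmg", "ab13", "ab12"), ("demo", "x9", "x9")]
print(accuracy_by_dataset(rows))
```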

Tests

uv run pytest

License

MIT © Julio Trecenti

Project details

Release history

0.1.0 (this version), released Apr 11, 2026

Download files

Source Distribution

txtcaptcha-0.1.0.tar.gz (33.1 kB)

Uploaded Apr 11, 2026 Source

Built Distribution

txtcaptcha-0.1.0-py3-none-any.whl (31.7 kB)

Uploaded Apr 11, 2026 Python 3

File details

Details for the file txtcaptcha-0.1.0.tar.gz.

File metadata

Download URL: txtcaptcha-0.1.0.tar.gz
Upload date: Apr 11, 2026
Size: 33.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for txtcaptcha-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`78c410b89fe4bedf665378a8cbd7274609d085cd1422ec1731f4e7c619e63a1c`
MD5	`46f0353c4e485e6d6bba866fc1aa71b5`
BLAKE2b-256	`96e8a1436a0a8ff37ba63aa08f42058f2fa20abd42f837634315a6b1ae554951`
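A downloaded archive can be checked against the SHA256 digest listed above with the standard library alone. The sketch below is a generic hashlib pattern; `sha256_of_file` and `verify_download` are hypothetical helper names, and the commented usage assumes the archive has been saved in the current directory.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 16) -> str:
    """Hex SHA256 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_hex: str) -> bool:
    """True when the file's SHA256 matches the published digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the sdist was downloaded next to this script):
# ok = verify_download(
#     "txtcaptcha-0.1.0.tar.gz",
#     "78c410b89fe4bedf665378a8cbd7274609d085cd1422ec1731f4e7c619e63a1c",
# )
```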

Provenance

The following attestation bundles were made for txtcaptcha-0.1.0.tar.gz:

Publisher: publish.yml on jtrecenti/txtcaptcha

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: txtcaptcha-0.1.0.tar.gz
- Subject digest: 78c410b89fe4bedf665378a8cbd7274609d085cd1422ec1731f4e7c619e63a1c
- Sigstore transparency entry: 1280177386
- Sigstore integration time: Apr 11, 2026
Source repository:
- Permalink: jtrecenti/txtcaptcha@a00e96f09c8f6f524ad220fa8673f59d9cf436e0
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/jtrecenti
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@a00e96f09c8f6f524ad220fa8673f59d9cf436e0
- Trigger Event: release

File details

Details for the file txtcaptcha-0.1.0-py3-none-any.whl.

File metadata

Download URL: txtcaptcha-0.1.0-py3-none-any.whl
Upload date: Apr 11, 2026
Size: 31.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for txtcaptcha-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`16f523e849a20c8fb1d7d8f0a6597a768d8dfc343a24cc26fdb30b1c2103d27a`
MD5	`0e110cf279637ce9c9e1e0f15bcf9780`
BLAKE2b-256	`4cbc357303f28361c240ff1558087e1003446684db62144a4c5f62b0fbe97c83`

Provenance

The following attestation bundles were made for txtcaptcha-0.1.0-py3-none-any.whl:

Publisher: publish.yml on jtrecenti/txtcaptcha

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: txtcaptcha-0.1.0-py3-none-any.whl
- Subject digest: 16f523e849a20c8fb1d7d8f0a6597a768d8dfc343a24cc26fdb30b1c2103d27a
- Sigstore transparency entry: 1280177390
- Sigstore integration time: Apr 11, 2026
Source repository:
- Permalink: jtrecenti/txtcaptcha@a00e96f09c8f6f524ad220fa8673f59d9cf436e0
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/jtrecenti
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@a00e96f09c8f6f524ad220fa8673f59d9cf436e0
- Trigger Event: release
