Validate and clean Two-Line Element (TLE) satellite-tracking files

Maintainers

elfensky

lintle

A validator and cleaner for Two-Line Element (TLE) corpus files exported from space-track.org.

It audits a TLE file against the standardized TLE specification, repairs the systematic export defects, and emits a uniform, de-defected corpus that any SGP4 / orbital-mechanics library can ingest directly. Records it cannot safely repair are quarantined — never silently mangled — into a per-file sidecar detailed enough to file a defect report with space-track.

What problem it solves

A TLE record is two fixed-width lines, each exactly 69 ASCII columns, with a mod-10 checksum in column 69. Bulk historical exports from space-track carry two systematic, era-specific defects:

Trailing \ artifact — almost every Line 1 has an extra \ byte appended before the newline.
Missing checksum digit — many records were exported without their column-69 checksum, leaving 68-column lines.

These appear independently and in combination, and a small fraction of records are genuinely corrupt (garbled columns, orphaned lines, wrong lengths). lintle distinguishes the safely-repairable from the genuinely-corrupt and treats each correctly.
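For reference, the column-69 checksum follows the standard TLE rule: every digit in columns 1 to 68 contributes its value, every minus sign contributes 1, and all other characters contribute 0; the sum is taken modulo 10. The sketch below illustrates that rule only. The function name `tle_checksum` is hypothetical and is not taken from lintle's source.

```python
def tle_checksum(body: str) -> int:
    """Mod-10 checksum over the first 68 columns of a TLE line."""
    total = 0
    for ch in body[:68]:
        if "0" <= ch <= "9":
            total += int(ch)      # digits count as their value
        elif ch == "-":
            total += 1            # a minus sign counts as 1
        # letters, spaces, '.', '+' all count as 0
    return total % 10


# A widely published ISS element set, used here purely as sample input.
LINE1 = "1 25544U 98067A   08264.51782528 -.00002182  00000-0 -11606-4 0  2927"
LINE2 = "2 25544  51.6416 247.4627 0006703 130.5360 325.0288 15.72125391563537"

for line in (LINE1, LINE2):
    stored = int(line[68])        # column 69 holds the stored checksum digit
    print(len(line), stored, tle_checksum(line))
```

A line whose stored digit differs from the computed one falls into the corrupt category; a 68-column line simply has no stored digit to compare against.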

How it works

One validator, used two ways. A single module (tle.py) defines what a "perfect" TLE record is — column layout, semantic ranges, and the mod-10 checksum. The validate command reports defects against that definition; the clean command reuses the exact same validator and only emits records that pass it.

The validated-transformation principle. The cleaner never applies a fix and hopes. It applies a candidate fix, then re-runs full validation on the result, and commits the fix only if it now passes. Consequently the cleaner cannot turn a bad record into a wrong-but-valid-looking one, and every line in the output is valid by construction.
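The principle can be sketched in a few lines. This is an illustration under stated assumptions, not lintle's actual code: `is_valid` is a deliberately tiny stand-in for the real validator (it checks only length, line-number prefix, and checksum), and the names `repair`, `Fix`, and `strip_trailing_backslash` are hypothetical.

```python
from typing import Callable, Iterable, Optional

Fix = Callable[[bytes], bytes]


def checksum(body: bytes) -> int:
    """Mod-10 TLE checksum: digits count as their value, '-' counts as 1."""
    total = 0
    for b in body:
        if 48 <= b <= 57:        # ASCII '0'..'9'
            total += b - 48
        elif b == 45:            # ASCII '-'
            total += 1
    return total % 10


def is_valid(line: bytes) -> bool:
    """Stand-in validator: length, line-number prefix and checksum only."""
    return (
        len(line) == 69
        and line[:2] in (b"1 ", b"2 ")
        and 48 <= line[68] <= 57
        and line[68] - 48 == checksum(line[:68])
    )


def strip_trailing_backslash(line: bytes) -> bytes:
    return line[:-1] if line.endswith(b"\\") else line


def repair(line: bytes, fixes: Iterable[Fix]) -> Optional[bytes]:
    """Return a valid line, or None if no candidate fix re-validates."""
    if is_valid(line):
        return line
    for fix in fixes:
        candidate = fix(line)
        # Commit the fix only if the result passes full validation again.
        if candidate != line and is_valid(candidate):
            return candidate
    return None                  # caller quarantines the original bytes


GOOD = b"1 25544U 98067A   08264.51782528 -.00002182  00000-0 -11606-4 0  2927"
DIRTY = GOOD + b"\\"                                     # export artifact only
GARBLED = GOOD.replace(b"25544", b"25545") + b"\\"       # artifact plus damage

print(repair(DIRTY, [strip_trailing_backslash]) == GOOD)
print(repair(GARBLED, [strip_trailing_backslash]))
```

Stripping the backslash from `GARBLED` yields a 69-column line, but its checksum no longer matches, so the fix is not committed and the record would be quarantined rather than emitted.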

Five fix classes, in decreasing order of safety:

Class	Examples	Action
Content-preserving	trailing `\`, CRLF, trailing whitespace	auto-fix (checksum survives as an independent check)
Reconstructed-checksum	a record exported without its column-69 digit	recompute the checksum from intact columns 1–68
Content-shifting	leading whitespace / BOM	trim, then re-validate; quarantine if it fails
Structural	blank / whitespace-only lines	drop, resynchronise pairing
Corrupt	bad checksum, wrong length, orphan line, garbled columns	quarantine
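One way to read the table is as an ordered cascade: try the safest class first and fall through to quarantine. The sketch below follows that reading for a single line. It is an assumption about ordering, not a copy of lintle's `repair.py`; `clean_line` and the returned labels are hypothetical, and `is_valid` is a minimal stand-in that checks far less than the real validator (so a reconstructed checksum passes it trivially here, whereas the real tool also re-checks column layout and semantic ranges).

```python
from typing import Optional

BOM = b"\xef\xbb\xbf"            # UTF-8 byte-order mark


def checksum(body: bytes) -> int:
    total = 0
    for b in body:
        if 48 <= b <= 57:        # ASCII digit
            total += b - 48
        elif b == 45:            # ASCII '-'
            total += 1
    return total % 10


def is_valid(line: bytes) -> bool:
    """Stand-in validator: length, prefix and checksum only."""
    return (
        len(line) == 69
        and line[:2] in (b"1 ", b"2 ")
        and 48 <= line[68] <= 57
        and line[68] - 48 == checksum(line[:68])
    )


def clean_line(raw: bytes) -> tuple[Optional[bytes], str]:
    """Return (cleaned line or None, label of the class that applied)."""
    body = raw[:-1] if raw.endswith(b"\n") else raw
    if not body.strip():
        return None, "structural"              # blank line: drop it
    if is_valid(body):
        return body, "already-valid"

    # Content-preserving: CR, trailing whitespace, trailing backslash.
    line = body.rstrip(b" \t\r")
    if line.endswith(b"\\"):
        line = line[:-1]
    if is_valid(line):
        return line, "content-preserving"

    # Reconstructed checksum: 68 intact columns, digit recomputed.
    if len(line) == 68:
        candidate = line + bytes([48 + checksum(line)])
        if is_valid(candidate):
            return candidate, "reconstructed-checksum"

    # Content-shifting: leading BOM or whitespace, trimmed then re-validated.
    shifted = line.removeprefix(BOM).lstrip(b" \t")
    if shifted != line and is_valid(shifted):
        return shifted, "content-shifting"

    return None, "corrupt"                     # quarantine


GOOD = b"1 25544U 98067A   08264.51782528 -.00002182  00000-0 -11606-4 0  2927"
print(clean_line(GOOD + b"\\\r\n")[1])         # artifact plus CRLF
print(clean_line(GOOD[:68] + b"\\\n")[1])      # both export defects combined
```

The second call shows the combined case described earlier: the backslash is stripped first, which leaves a 68-column line whose checksum is then recomputed.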

Streaming and parallel. Files are read in binary, line by line, in constant memory, so a 3 GB file is never loaded into RAM whole. Records are paired by a prefix-driven state machine that resynchronises on every Line 1 (a line beginning with `1 `), so one missing line cannot cascade into mispaired records. Each input file is processed in its own worker process.
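The pairing idea can be sketched as a small generator over a binary stream. This is a simplified illustration, not the contents of `pipeline.py`: the name `pair_records` and the event tuples are hypothetical, and the handling of stray lines (reported as orphans while any open Line 1 is kept) is an assumption.

```python
import io
from typing import BinaryIO, Iterator, Optional


def pair_records(
    stream: BinaryIO,
) -> Iterator[tuple[str, int, bytes, Optional[bytes]]]:
    """Yield ("record", lineno, line1, line2) or ("orphan", lineno, line, None).

    Iterating a binary file object yields one line at a time, so memory use
    does not grow with file size.
    """
    pending: Optional[tuple[int, bytes]] = None   # an open Line 1, if any
    for lineno, raw in enumerate(stream, start=1):
        line = raw.rstrip(b"\r\n")
        if not line.strip():
            continue                              # blank line: drop it
        if line.startswith(b"1 "):
            # Resynchronise: a new Line 1 always starts a fresh record.
            if pending is not None:
                yield ("orphan", pending[0], pending[1], None)
            pending = (lineno, line)
        elif line.startswith(b"2 ") and pending is not None:
            yield ("record", pending[0], pending[1], line)
            pending = None
        else:
            # A Line 2 with no open Line 1, or a line with neither prefix.
            yield ("orphan", lineno, line, None)
    if pending is not None:
        yield ("orphan", pending[0], pending[1], None)


# The record "b" lost its Line 2; the records after it still pair correctly.
data = b"1 a\n2 a\n1 b\n1 c\n2 c\n2 d\n"
for event in pair_records(io.BytesIO(data)):
    print(event)
```

Because every Line 1 resets the state, the damage from the missing line is confined to one orphan instead of shifting every later pair by one.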

Requirements

Python 3.11+
uv for environment and dependency management

lintle itself has no runtime dependencies — it is pure standard library. sgp4 is a dev-only dependency, used as a test oracle.

Installation

uv sync

This creates the virtual environment and installs the dev dependencies. No build step is needed to run the tool.

Usage

The console script is lintle, with two subcommands:

# Audit only — report defects, write nothing
uv run lintle validate [paths...]

# Produce cleaned output + quarantine sidecars
uv run lintle clean [paths...]

python -m lintle ... is equivalent to uv run lintle ....

Arguments and options:

Option	Default	Meaning
`paths`	`data/source`	Files or directories. A directory is globbed for `tle*.txt` (tool output `*.cleaned.txt` / `*.broken.txt` is excluded).
`--out-dir DIR`	`data/output`	Where `clean` writes its output. Created if absent.
`--jobs N`	CPU count	Number of files processed in parallel. Lower it if a slow disk causes I/O contention.
`--report text|json`	`text`	Summary format.

Examples:

# Validate the whole corpus
uv run lintle validate data/source

# Clean one file
uv run lintle clean data/source/tle2022.txt --out-dir data/output

# Clean the corpus, capture a machine-readable summary
uv run lintle clean data/source --report json > run-summary.json

Exit codes:

Code	Meaning
`0`	No records quarantined — clean (or every defect repaired).
`1`	At least one record was quarantined.
`2`	Operational error — no input files, disk shortfall, or a file that failed to process.

Repairable defects (including the near-universal trailing \) do not raise the exit code above 0 — almost every raw file contains them.
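When lintle is driven from another script, the three exit codes are enough to decide what to do next. The wrapper below is a sketch that uses only the command line and the `--out-dir` option documented above; the helper names `describe_exit`, `run_clean`, and `main` are hypothetical.

```python
import subprocess
import sys

# Meanings as listed in the exit-code table above.
EXIT_MEANINGS = {
    0: "no records quarantined",
    1: "at least one record was quarantined",
    2: "operational error",
}


def describe_exit(code: int) -> str:
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_clean(paths: list[str], out_dir: str = "data/output") -> int:
    """Run `lintle clean` and return its exit code without raising."""
    cmd = ["uv", "run", "lintle", "clean", *paths, "--out-dir", out_dir]
    completed = subprocess.run(cmd, check=False)
    return completed.returncode


def main() -> int:
    code = run_clean(sys.argv[1:] or ["data/source"])
    print(describe_exit(code), file=sys.stderr)
    # Quarantined records (1) may be acceptable; only 2 means the run failed.
    return 0 if code in (0, 1) else code
```

Calling `sys.exit(main())` from a pipeline step treats a run with quarantined records as a success while still failing on operational errors; whether that policy is appropriate depends on the downstream consumer.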

Output

A clean run lays --out-dir out like this:

<out-dir>/
├── cleaned/   tleYYYY.cleaned.txt   — one per input file
├── broken/    tleYYYY.broken.txt    — one per input file
└── report.md  — corpus-wide run report

cleaned/tleYYYY.cleaned.txt — standard 2-line TLE text, every record verified valid: 69 ASCII columns per line, \n-terminated, matching satellite catalog numbers, valid checksums. World-readable, ready for downstream ingestion.
broken/tleYYYY.broken.txt — the quarantine sidecar. Each entry records the source line number(s), a human-readable reason, and the offending line(s) copied byte-faithfully. The header carries totals, a timestamp, and the tool version — formatted to paste into a space-track defect report.
report.md — a Markdown run report aggregating the whole run: corpus totals, the percentage cleaned and quarantined, corpus-wide fix counts, the defect-category breakdown, and a per-file table.

A run summary is also printed per file to stdout (and as JSON with --report json):

tle2022.txt   8,412,067 records   8,412,064 clean   3 quarantined
  fixes:   trailing-backslash 8,412,064 | reconstructed-checksum 195,293
  rejects: checksum-mismatch 1 | orphan-line 1 | wrong-length 1

reconstructed-checksum is reported separately from content-preserving fixes: those records are format-conformant, but their checksums are computed, not independently verified.

validate writes nothing — it only prints the per-file summary and the locations of defective records to stdout.

Progress

A 30 GB run is not silent. Live progress is written to stderr as it goes — so it never pollutes the stdout summary or a --report json pipe:

processing 29 file(s) with 10 worker(s)...
  tle2004_7of8.txt: 5,000,000 records...
[3/29] tle2004_3of8.txt — 2,527,820 clean, 183 quarantined

A worker emits a record-count line every 1,000,000 records; the main process prints a [k/N] line as each file finishes.
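A supervising script can follow the stderr stream to track completed files. The parser below is a sketch inferred from the single sample line shown above; the progress format is not documented as stable, and the names `FileDone` and `parse_done` are hypothetical. It assumes file names contain no spaces.

```python
import re
from typing import NamedTuple, Optional


class FileDone(NamedTuple):
    index: int         # k in [k/N]
    total: int         # N in [k/N]
    name: str
    clean: int
    quarantined: int


# "[k/N] <file> <separator> <n> clean, <m> quarantined"
_DONE = re.compile(
    r"^\[(\d+)/(\d+)\]\s+(\S+)\s+\S+\s+([\d,]+) clean, ([\d,]+) quarantined\s*$"
)


def parse_done(line: str) -> Optional[FileDone]:
    """Parse a per-file completion line; return None for any other line."""
    m = _DONE.match(line)
    if m is None:
        return None
    k, n, name, clean, quarantined = m.groups()
    return FileDone(
        int(k),
        int(n),
        name,
        int(clean.replace(",", "")),        # drop thousands separators
        int(quarantined.replace(",", "")),
    )


sample = "[3/29] tle2004_3of8.txt \u2014 2,527,820 clean, 183 quarantined"
print(parse_done(sample))
print(parse_done("  tle2004_7of8.txt: 5,000,000 records..."))
```

Record-count lines from workers do not match the pattern and come back as `None`, so they can be ignored or handled separately.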

Results on the bundled corpus

A full run over the 29-file corpus (tle2004–tle2025, ~232 million records):

99.96 % cleaned — 187.9 M trailing-\ artifacts stripped, 71.3 M missing checksums reconstructed
0.044 % quarantined (103,228 records) as genuinely corrupt — every reject fell into an anticipated category; no unknown defect type surfaced

Development

uv sync                          # install dev dependencies
uv run pytest                    # run the test suite
uv run pytest --cov=lintle       # with a coverage report
uv run ruff check                # lint
uv run ruff format               # auto-format

The suite includes unit tests per module, an asymmetric cross-check against the trusted sgp4 parser (a known-good TLE must be accepted by both), and end-to-end integration tests (golden output, idempotence, re-validation).

Code quality is enforced with ruff (lint rule sets E, F, I, UP, B, SIM; 88-column lines) and coverage is measured with pytest-cov.

Project layout

src/lintle/
  tle.py        # core: defines a "perfect" TLE record (pure, no I/O)
  repair.py     # speculative, validated repairs
  pipeline.py   # streaming reader, prefix-driven pairing, per-file routing
  report.py     # quarantine sidecar + run-summary rendering
  cli.py        # argument parsing, parallelism, exit codes
tests/          # pytest suite
docs/superpowers/
  specs/        # the design specification
  plans/        # the implementation plan
  runs/         # corpus-run summaries

Release history

0.2.0, May 24, 2026

0.1.1, May 22, 2026 (this version)

Download files

Source Distribution

lintle-0.1.1.tar.gz (101.2 kB)

Uploaded May 22, 2026 Source

Built Distribution

lintle-0.1.1-py3-none-any.whl (21.7 kB)

Uploaded May 22, 2026 Python 3

File details

Details for the file lintle-0.1.1.tar.gz.

File metadata

Download URL: lintle-0.1.1.tar.gz
Upload date: May 22, 2026
Size: 101.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.16 {"installer":{"name":"uv","version":"0.11.16","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for lintle-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`beebb4e9ea74386f03176e96170725d93bc2de64ff87745aa3ab9c716324553c`
MD5	`f4415df7409ea3048085b799cb1cc6e7`
BLAKE2b-256	`9470c919988df7a5a32d790d58d93d9302561cef1fc94fd1821e92193155cd39`

File details

Details for the file lintle-0.1.1-py3-none-any.whl.

File metadata

Download URL: lintle-0.1.1-py3-none-any.whl
Upload date: May 22, 2026
Size: 21.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.16 {"installer":{"name":"uv","version":"0.11.16","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for lintle-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`15c65a5311b7892c401d903ad5feca36dfbc2c1f7475ec604fda91ec71b71d40`
MD5	`74bded9b3fed3616a9aa01b6303752e2`
BLAKE2b-256	`ad1ccf1202b0c7c03063e7ce8d0b0c3b7324e57fac0aae9448286cab1c1455fd`
