Press repositories, code files, and HTML into compact, prompt-ready context bundles.

TextPress (tpress)

Press repositories and code into compact, prompt-ready context bundles for LLMs.

Python 3.11+ · License: MIT

Why tpress?

When working with LLMs, you often need to share code context. Copy-pasting files one by one is tedious, and a full repository often exceeds the model's token limit. tpress solves this by:

  • 📦 Bundling entire repos into a single markdown file
  • 🎯 Prioritizing code over data files (Python/JS first, JSON last)
  • 🔐 Redacting secrets automatically before sharing
  • ✂️ Compressing intelligently to fit token budgets

Install

pip install tpress

Quick Start

# Bundle current directory → bundle.md
tpress .

# Bundle with custom output path
tpress . -o context.md

# Compact a single file
tpress README.md -o readme.txt

# Custom token limit
tpress . -t 50000

Usage

Default Command (Auto-detect)

tpress auto-detects whether you're passing a file or directory:

tpress .                    # Directory → repo mode
tpress src/main.py          # File → file mode
tpress /path/to/project     # Directory → repo mode
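The auto-detection above can be pictured as a simple path check. The sketch below is only an illustration of that idea; `detect_mode` is a hypothetical name, not a function tpress is known to expose.

```python
from pathlib import Path


def detect_mode(path: str) -> str:
    """Return 'repo' for a directory and 'file' for a regular file.

    Hypothetical helper mirroring the documented behaviour; the real
    dispatch logic inside tpress may differ.
    """
    p = Path(path)
    if p.is_dir():
        return "repo"
    if p.is_file():
        return "file"
    raise FileNotFoundError(f"No such file or directory: {path}")
```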

Repository Bundling

tpress repo . -o bundle.md

Output:

✓ bundle.md · 47k tokens · 19 files

Adaptive Bundling

tpress uses relevance-based adaptive bundling by default:

  • File Ranking (0 = most relevant):

    • Rank 0: Core code (.py, .js, .ts, .go, .rs)
    • Rank 1-4: Compiled, shell, SQL, config
    • Rank 5-6: Documentation, HTML/CSS
    • Rank 7-8: Data files (.json, .csv)
  • Smart Ordering: Most relevant files appear first in the bundle

  • Token Budget: Default 100k limit with intelligent reduction:

    1. Summarize large data files (JSON/CSV → schema only)
    2. Compress lower-priority files (strip docstrings/comments)
    3. Exclude lowest-relevance files if needed
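As a rough sketch of the ranking, ordering, and exclusion steps listed above: the extension table below is an assumption (the README names extensions only for rank 0 and for data files), and `rank` and `select_files` are hypothetical names. Summarizing and compressing (steps 1 and 2) are not modelled here.

```python
from pathlib import PurePosixPath

# Assumed extension -> rank table. Only the rank 0 and data-file
# extensions come from the README; the other entries are guesses.
RANKS = {
    ".py": 0, ".js": 0, ".ts": 0, ".go": 0, ".rs": 0,
    ".sh": 2, ".sql": 3, ".toml": 4, ".yaml": 4,
    ".md": 5, ".html": 6, ".css": 6,
    ".json": 7, ".csv": 8,
}
DEFAULT_RANK = 8  # assumption: unknown types count as least relevant


def rank(path: str) -> int:
    """Look up a file's relevance rank by its (lower-cased) extension."""
    return RANKS.get(PurePosixPath(path).suffix.lower(), DEFAULT_RANK)


def select_files(sizes: dict[str, int], token_limit: int = 100_000) -> list[str]:
    """Order files by rank, then drop the least relevant until the total fits.

    `sizes` maps each path to its token count.
    """
    ordered = sorted(sizes, key=lambda p: (rank(p), p))
    total = sum(sizes.values())
    while ordered and total > token_limit:
        dropped = ordered.pop()  # last entry is the least relevant
        total -= sizes[dropped]
    return ordered
```

With `{"data.csv": 60000, "app.py": 30000, "README.md": 20000, "cfg.json": 10000}` and the default limit, this sketch keeps `app.py`, `README.md`, and `cfg.json`, and drops `data.csv`.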

# Verbose output
tpress . -v

# Very verbose (full analysis)
tpress . -vv

# Disable adaptive bundling
tpress repo . --no-adaptive
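Step 1 of the token-budget reduction (JSON/CSV → schema only) might look roughly like the following for JSON. This is a minimal sketch, assuming arrays are homogeneous; `schema_of` and `summarize_json` are hypothetical names, and the summary format tpress actually emits is not documented here.

```python
import json


def schema_of(value, max_depth: int = 4):
    """Replace concrete values with type names, keeping only the structure."""
    if isinstance(value, dict):
        if max_depth == 0:
            return "object"
        return {k: schema_of(v, max_depth - 1) for k, v in value.items()}
    if isinstance(value, list):
        if not value or max_depth == 0:
            return "array"
        # Assumption: arrays are homogeneous, so the first element is enough.
        return [schema_of(value[0], max_depth - 1)]
    if isinstance(value, bool):  # checked before int: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if value is None:
        return "null"
    return "string"


def summarize_json(text: str) -> str:
    """Parse JSON text and return its schema as pretty-printed JSON."""
    return json.dumps(schema_of(json.loads(text)), indent=2)
```

A file with thousands of records of the same shape collapses to a few lines this way, which is where most of the token savings would come from.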

Filtering Modes

tpress repo .           # Default: code, config, docs
tpress repo . --strict  # Code only
tpress repo . --all     # Everything including data files

Split Large Repos

tpress repo . --split -t 50000
# Creates: bundle_1.md, bundle_2.md, ...
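One simple way to picture the split is greedy packing: add files to the current part until the next one would exceed the limit, then start a new part. The sketch below assumes that strategy; `split_bundle` and `part_names` are hypothetical names, and the splitting logic tpress actually uses may differ.

```python
def split_bundle(sections: list[tuple[str, int]], token_limit: int) -> list[list[str]]:
    """Greedily pack (name, tokens) pairs into parts of at most token_limit.

    A single section larger than the limit gets a part of its own,
    which matches the idea of a soft limit.
    """
    parts: list[list[str]] = []
    current: list[str] = []
    used = 0
    for name, tokens in sections:
        if current and used + tokens > token_limit:
            parts.append(current)
            current, used = [], 0
        current.append(name)
        used += tokens
    if current:
        parts.append(current)
    return parts


def part_names(count: int, stem: str = "bundle") -> list[str]:
    """Name the parts bundle_1.md, bundle_2.md, ... as in the example above."""
    return [f"{stem}_{i}.md" for i in range(1, count + 1)]
```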

Single File Compaction

tpress main.py                    # Output to stdout
tpress main.py -o compact.py      # Output to file
tpress main.py -p llm-safe        # With secret redaction

HTML Extraction

tpress html page.html -o text.md
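To illustrate what extracting text from HTML involves, here is a minimal sketch built on Python's standard `html.parser`. It is not tpress's pipeline: `TextExtractor` and `html_to_text` are hypothetical names, and the sketch simply drops `<script>` and `<style>` content and puts every text run on its own line.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of script and style tags."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            # Collapse internal whitespace within each text run.
            self.chunks.append(" ".join(data.split()))


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document, one text run per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

A real extractor would also want to keep inline elements on the same line and preserve heading structure; this sketch skips both for brevity.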

Profiles

Profile     Description
lossless    Conservative normalization (whitespace only)
readable    Unwrap paragraphs, light cleanup
llm         Aggressive (dedupe, structure-aware)
llm-safe    Same as llm, plus automatic secret redaction
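To make the difference between the first two profiles concrete, the sketch below shows one plausible reading of "whitespace only" and "unwrap paragraphs". Both functions are hypothetical illustrations, not the transformations tpress applies, and the unwrapping is meant for prose rather than code.

```python
import re


def normalize_lossless(text: str) -> str:
    """Whitespace-only cleanup: trim trailing whitespace, collapse blank-line runs."""
    lines = [line.rstrip() for line in text.splitlines()]
    collapsed = re.sub(r"\n{3,}", "\n\n", "\n".join(lines))
    return collapsed.strip("\n") + "\n"


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines inside each paragraph into a single line."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n\n".join(" ".join(p.split()) for p in paragraphs) + "\n"
```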

Secret Redaction

The llm-safe profile automatically detects and redacts:

  • API keys (OpenAI, AWS, Google Cloud, Azure)
  • Tokens (GitHub, Slack, JWT)
  • Private keys and connection strings
  • High-entropy strings

# Force redaction on any profile
tpress . --redact

# Disable redaction
tpress . --no-redact
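Redaction of this kind is commonly built from two layers: regular expressions for known key formats, and an entropy check for long random-looking strings. The sketch below assumes that design. The patterns, the 4.5-bit threshold, the placeholder text, and the names `redact` and `shannon_entropy` are all illustrative assumptions, not tpress's actual rules.

```python
import math
import re
from collections import Counter

# Illustrative patterns only; a real rule set would cover many more formats.
PATTERNS = {
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GITHUB_TOKEN": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "PRIVATE_KEY": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
}
# Long token-like runs that are candidates for the entropy check.
CANDIDATE = re.compile(r"[A-Za-z0-9+/_-]{32,}")


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the string's characters, in bits per character."""
    counts = Counter(s)
    return -sum(n / len(s) * math.log2(n / len(s)) for n in counts.values())


def redact(text: str, entropy_threshold: float = 4.5) -> str:
    """Replace known secret formats, then any long high-entropy string."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)

    def maybe_redact(match: re.Match) -> str:
        s = match.group(0)
        if shannon_entropy(s) >= entropy_threshold:
            return "[REDACTED:HIGH_ENTROPY]"
        return s

    return CANDIDATE.sub(maybe_redact, text)
```

The entropy layer catches secrets with no recognizable prefix, at the cost of occasional false positives on hashes or long identifiers, which is one reason a `--no-redact` switch is useful.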

CLI Reference

tpress [PATH] [OPTIONS]

Arguments:
  PATH          File or directory to process

Options:
  -o, --out PATH          Output file path
  -c, --copy              Copy output to clipboard (no file created)
  -p, --profile TEXT      Compression profile [default: llm]
  -t, --token-limit INT   Soft token limit [default: 100000]
  -v, --verbose           Verbosity: -v steps, -vv detailed
  --redact/--no-redact    Enable/disable secret redaction
  -V, --version           Show version
  --help                  Show help

Subcommands:
  repo     Bundle a repository (full options)
  file     Compact a single file
  html     Extract text from HTML

Clipboard Support

Copy output directly to clipboard instead of writing a file:

# Copy bundle to clipboard
tpress . -c

# Works with all modes
tpress file.py -c
tpress doc.html -c --html

Token Counting

Token counts are displayed after each operation:

✓ bundle.md · 47k tokens · 19 files

Tokens are counted with tiktoken, OpenAI's tokenizer library for GPT models, so counts are approximate for other model families.
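For reference, counting tokens with tiktoken takes only a few lines. In the sketch below, the `cl100k_base` encoding and the rounding used for the summary line are assumptions (the README does not say which encoding or rounding tpress uses), and `count_tokens` and `summary_line` are hypothetical names.

```python
def count_tokens(text: str, encode=None) -> int:
    """Count tokens in text.

    `encode` is any callable returning a list of tokens; by default the
    tiktoken `cl100k_base` encoding is used (an assumption, see above).
    """
    if encode is None:
        import tiktoken  # third-party: pip install tiktoken

        encode = tiktoken.get_encoding("cl100k_base").encode
    return len(encode(text))


def summary_line(path: str, tokens: int, files: int) -> str:
    """Format a status line in the style shown above, rounding to thousands."""
    shown = f"{round(tokens / 1000)}k" if tokens >= 1000 else str(tokens)
    return f"✓ {path} · {shown} tokens · {files} files"
```

Passing a custom `encode` callable makes it possible to swap in another tokenizer, or a cheap stand-in such as `str.split` for tests.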

Roadmap

  • Repository bundling with adaptive token limits
  • Relevance-based file ordering
  • Secret redaction
  • HTML pipeline: extract | clean | outline
  • Language-aware strategies (AST-based)
  • Custom redaction patterns via config

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

License

MIT
