
Listing Generator

Plain-text source listing generator for AI context

A tool for building dense contexts from source code: traverses projects, filters and normalizes files, then assembles them into a single clean Markdown document — perfect for ChatGPT/Copilot/Gemini/Claude and other LLM assistants.

In short: you store selection rules in lg-cfg/ (YAML + context templates), and LG renders "ready-to-paste" text or returns a JSON report with token statistics.


Why and Who Is It For

Target audience: developers, team leads, and technical writers who discuss real code with AI agents (reviews, task assignments, capturing iteration context) while the model window stays limited.

Why: modern agents work noticeably better when they see exactly the code they need, with minimal noise (no junk from node_modules/, logs, generated files, or huge binaries). Preparing such context by hand is painful. LG automates:

  • selection of relevant files (by filters and extensions),
  • light normalization (e.g., Markdown headers, "trivial" __init__.py),
  • assembly into a single document with visible file markers,
  • .gitignore awareness,
  • changes mode (only modified files),
  • templates and contexts (section insertions and nested templates),
  • size/token estimation and shares ("who's eating the prompt").

There are many ways to form prompts and attach relevant code snippets: from manual copying to context embedding features in IDEs with integrated AI chats. LG differs by doing this systematically and reproducibly: rules are stored in the repository, not in your head or AI conversation history.

You describe in advance what goes into the prompt and how (through sections and templates). This enforces discipline, lets you tune density to avoid overflowing the model window, and makes successful queries reproducible through saved templates.


What a "Healthy" AI Agent Workflow Looks Like

  1. Describe rules in the repository. Create lg-cfg/sections.yaml and additional *.sec.yaml files as needed; these describe sections (file sets + filters). Use *.tpl.md and *.ctx.md for templates and contexts.

  2. Build the context. Render either a "section" (a virtual context of one file set) or a "context" (a template that can include multiple sections and other templates).

  3. Iteratively compress. Check the token statistics (who has the "heaviest share"), move secondary content into separate sections, and include it on demand. For small updates use the changes mode (e.g. --mode vcs:branch-changes).

  4. Save successful prompts. Contexts and templates (*.ctx.md and *.tpl.md) are your well-working query formats: reproducible, versionable, with variants for different tasks and agents.


Quick Start

Installation and Running

Requires Python ≥ 3.10.

Installation:

# Install from project directory
pip install -e .
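
The package is also published on PyPI, so a plain install should work as well (pip treats listing_generator and listing-generator as the same name):

# Install the released package from PyPI
pip install listing-generator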

Verification:

# Check via module
python -m lg.cli --version

# Or via installed command
lg --version

Environment and cache check:

python -m lg.cli diag
python -m lg.cli diag --rebuild-cache

What Goes in lg-cfg/

Important: the configuration directory is always named lg-cfg/.

Example structure:

lg-cfg/
├─ sections.yaml           # main YAML with sections (required)
├─ additional.sec.yaml     # additional section set (can have many)
├─ intro.tpl.md            # template (can have many, in any subfolders)
├─ onboarding.ctx.md       # context (can have many, in any subfolders)
└─ sub-fold/
   └─ extra.sec.yaml

Sections

  • sections.yaml — file with base sections.
  • *.sec.yaml — additional section sets (fragments). Version is optional in them.

A section describes:

  • which file extensions to consider,
  • allow/block filters over the tree,
  • policy for empty files, code-fence, and language adapters.

Minimal example:

# Section for project documentation
docs:
  extensions: [".md"]
  markdown:
    # Normalize headings to H2 (outside fenced blocks), remove single H1 at start
    max_heading_level: 2
  filters:
    mode: allow            # default-deny within section
    allow:
      - "/README.md"
      - "/docs/**"

# Core-model submodule sources
core-model-src:
  extensions: [".py", ".md", ".yaml", ".json", ".toml"]
  skip_empty: true
  python:
    skip_trivial_inits: true
  markdown:
    max_heading_level: 3
  filters:
    mode: allow
    allow:
      - "/core-model/**"
    children:
      core-model:
        mode: block
        block:
          - "**/.pytest_cache/**"
          - "/ROADMAP.md"

# Separate section for roadmap (as text)
core-model-roadmap:
  extensions: [".md"]
  filters:
    mode: allow
    allow:
      - "/core-model/ROADMAP.md"

Filters: How They Work

  • The rule tree is default-allow (mode: block) or default-deny (mode: allow).
  • At each level, block is checked first; then, if the node is allow, the path is strictly checked against allow. Under mode: allow, a path that matches no local allow rule is rejected immediately.
  • block is always stronger than allow.
  • The project's .gitignore is respected.
  • LG also avoids descending into subtrees that cannot yield matches (early pruning); see the sketch below.
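
A sketch of these rules in section YAML (paths are illustrative):

filters:
  mode: allow                  # default-deny: only allowed paths get in
  allow:
    - "/src/**"
  children:
    src:
      mode: block              # within /src: default-allow minus blocks
      block:
        - "**/__pycache__/**"  # block always beats allow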

Contexts and Templates

  • Contexts: *.ctx.md (top-level documents).
  • Templates: *.tpl.md (fragments for insertion).

Example:

# Project Introduction

${tpl:intro}

## Core-model module source code

${core-model-src}

## Additional section

${sub-fold/extra/bar}

## Current task

${task}

Sections from sections.yaml are accessible directly (${docs}); sections from fragments are addressed by hierarchical path: the section bar in the file sub-fold/extra.sec.yaml becomes ${sub-fold/extra/bar}.

Special placeholder ${task} inserts text from --task argument:

  • ${task} — simple insertion (empty string if not specified)
  • ${task:prompt:"default text"} — with default value
  • {% if task %}...{% endif %} — conditional block insertion
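
For example, a context can emit the task heading only when a task was actually passed (a minimal sketch using only the constructs above):

{% if task %}
## Current task

${task}
{% endif %}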

More details: templates.md.


Language Adapters

Listing Generator uses adapters for different languages and formats. They help "optimize" listings: remove junk, normalize headings, filter paragraphs, or even strip function bodies leaving only signatures. Adapter settings are specified right in section YAML — globally for the section or targeted to specific paths via targets.

Configuration Example

core:
  extensions: [".py", ".md"]
  skip_empty: true

  # Global rules for entire section
  python:
    skip_trivial_inits: true
    strip_function_bodies: false

  markdown:
    max_heading_level: 2

  # Local overrides for specific folders and files
  targets:
    - match: "/pkg/**.py"
      python:
        strip_function_bodies: true      # only signatures in this folder

    - match: ["/docs/**.md", "/notes/*.md"]
      markdown:
        drop:
          sections:
            - match: { kind: regex, pattern: "^(License|Changelog|Contributing)$", flags: "i" }

In this example, the core section configures two languages. For Python, stripping function bodies is disabled globally but enabled inside the /pkg/ folder. For Markdown, a general heading level is set, and in /docs/ and /notes/ sections whose headings match the given patterns are additionally dropped.

The match key accepts either a string or a list of glob patterns. When multiple rules match, the more specific (longer and more concrete) one wins; if they tie, the later one in the list wins. This lets you layer local overrides neatly on top of the section-wide settings, as sketched below.
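
A sketch of that precedence (paths illustrative):

targets:
  - match: "/pkg/**.py"             # broader rule
    python:
      strip_function_bodies: true
  - match: "/pkg/internal/**.py"    # more specific, so it wins under /pkg/internal/
    python:
      strip_function_bodies: false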

The empty-file policy (skip_empty at the section level, empty_policy in adapters) behaves like any other language option: the section sets the general strategy, and an adapter can refine it if needed. Possible values: empty_policy: inherit|include|exclude.
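
A minimal sketch of such a refinement, assuming the adapter-level key sits inside the language block as described above:

core:
  extensions: [".py", ".md"]
  skip_empty: true          # section-wide strategy: drop empty files
  markdown:
    empty_policy: include   # adapter refinement: keep empty .md files anyway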


Available Adapters

Markdown

  • Normalize headings (remove a lone H1, shift levels).
  • Drop entire sections by heading (including their subtrees).
  • Remove YAML front matter at the beginning of a file.
  • Optionally insert placeholders where content was removed.

More details: markdown.md.

Programming Languages

More details: adapters.md.


Token Statistics

To make it easier to optimize listings and contexts, LG provides a summary report of token usage.

LG supports several open-source tokenization libraries (tiktoken, tokenizers, sentencepiece) and requires explicit specification of tokenization parameters on each run.
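
For instance, to list the heaviest files in a context with jq (a sketch: the exact v5 schema is defined by the report itself, and files[].promptShare is a hypothetical field layout based on the share names this report exposes):

lg report ctx:onboarding \
  --lib tiktoken --encoder cl100k_base --ctx-limit 128000 \
  | jq '.files | sort_by(-.promptShare) | .[:5]'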

More details: tokenizers.md.


Adaptive Capabilities

Techniques for creating reusable templates and section configurations are described in the Adaptive Capabilities section.


CLI Options

General format:

lg <command> <target> [--mode MODESET:MODE] [--tags TAG1,TAG2] [<additional_flags>]

# For render/report, tokenization parameters are required:
lg render|report <target> \
  --lib <tiktoken|tokenizers|sentencepiece> \
  --encoder <encoder_name> \
  --ctx-limit <tokens>

Where <target>:

  • ctx:<name> — takes the file lg-cfg/<name>.ctx.md (subfolders supported).
  • sec:<id> — virtual context of a single section (canonical ID).
  • <name> — resolved first as ctx:<name>, otherwise as sec:<id>.

Commands:

  • render — output only the final text (Markdown).
  • report — JSON report (format v5): statistics, files, context block.
  • list contexts|sections|tokenizer-libs|encoders — list available entities (JSON).
  • diag — environment/cache/config diagnostics (JSON); supports --rebuild-cache.

Tokenization parameters:

  • --lib — tokenization library (tiktoken, tokenizers, sentencepiece)
  • --encoder — encoder/model name (e.g.: cl100k_base, gpt2, google/gemma-2-2b)
  • --ctx-limit — context window size in tokens (e.g.: 128000, 200000)

Examples:

# Render context from template with tokenization for GPT-4
lg render ctx:onboarding \
  --lib tiktoken \
  --encoder cl100k_base \
  --ctx-limit 128000 > prompt.md

# Render "section only" (no template)
lg render sec:core-model-src \
  --lib tiktoken \
  --encoder cl100k_base \
  --ctx-limit 128000 > prompt.md

# Same but only changed files in working tree
lg render ctx:onboarding \
  --lib tiktoken \
  --encoder cl100k_base \
  --ctx-limit 128000 \
  --mode vcs:branch-changes > prompt.md

# JSON report with token stats for GPT-4o
lg report ctx:onboarding \
  --lib tiktoken \
  --encoder o200k_base \
  --ctx-limit 200000 > report.json

# Report for Gemini using sentencepiece
lg report ctx:onboarding \
  --lib sentencepiece \
  --encoder google/gemma-2-2b \
  --ctx-limit 1000000 > report.json

# Render context with current task description
lg render ctx:dev \
  --lib tiktoken --encoder cl100k_base --ctx-limit 128000 \
  --task "Implement result caching"

# Multi-line task via stdin
echo -e "Tasks:\n- Fix bug #123\n- Add tests" | \
  lg render ctx:dev --lib tiktoken --encoder cl100k_base --ctx-limit 128000 --task -

# Task from file
lg render ctx:dev \
  --lib tiktoken --encoder cl100k_base --ctx-limit 128000 \
  --task @.current-task.txt

# Diagnostics
lg diag
lg diag --rebuild-cache

# Lists
lg list contexts
lg list sections
lg list tokenizer-libs
lg list encoders --lib tiktoken
lg list encoders --lib tokenizers

How LG Renders Documents

  • If all files are Markdown/plain text, LG simply concatenates their content.

  • Otherwise:

    • with code-fence (default): blocks grouped by language in order of first occurrence; inside each block, a file marker # —— FILE: path —— precedes each file's content.
    • without code-fence: a linear document with a marker before each file.

This makes the prompt readable for humans and convenient for agents: it's clear where each fragment comes from.
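
Schematically, a fenced render looks roughly like this (paths and contents are illustrative):

```python
# —— FILE: core-model/api.py ——
def handler(): ...

# —— FILE: core-model/util.py ——
def helper(): ...
```

```yaml
# —— FILE: core-model/config.yaml ——
key: value
```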


Cache and Performance

LG uses the file cache .lg-cache:

  • Processed cache — adapter results plus their metadata.
  • Raw/Processed tokens — saved token counts (by model/mode).
  • Rendered tokens — token counts for the final document ("with glue") and for sections only.

Cache keys account for the tool version, file fingerprints, adapter configuration, group composition, etc. Manage the cache with lg diag and lg diag --rebuild-cache; disable it with LG_CACHE=0, as shown below.
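
For example, a one-off render with the cache disabled:

# Render without reading or writing .lg-cache
LG_CACHE=0 lg render sec:docs \
  --lib tiktoken --encoder cl100k_base --ctx-limit 128000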


Practical Tips for "Dense" Contexts

  • Keep sections small and thematic. Several focused sections beat one "everything about everything".
  • Use strict allow nodes where full predictability of the content is needed.
  • Use Markdown templates as the prompt's frame: a brief intro, tasks, section placeholders.
  • The changes mode is your best friend for patch iterations and code review via LLM.
  • Watch the shares (promptShare/ctxShare) in the report: they help distribute the "holding cost".
  • Normalizing headings (max_heading_level) makes long contexts easier to read.
  • Don't drag in secrets: configure block rules for artifacts, keys, and binaries, as sketched below.
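
A sketch of such a guard in a section (patterns illustrative):

filters:
  mode: block
  block:
    - "**/secrets/**"
    - "**/*.pem"
    - "**/.env*"
    - "**/dist/**"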

IDE/Plugin Integration

In most cases you'll run LG through an IDE integration (VS Code / JetBrains, etc.). Nevertheless, all selection/template logic lives in the repository (lg-cfg/), so:

  • reviewing and evolving the rules is simple (via PRs),
  • transferring successful prompts between projects is trivial,
  • the same configuration works in the CLI and the IDE.

License

Listing Generator is licensed under the Apache License, Version 2.0.
See the LICENSE file for the full license text.
