A Python CLI to inspect, validate, and manage license and copyright headers.

TopMark

TopMark is a command-line tool to inspect, validate, and manage file headers in diverse codebases. It helps maintain consistent header metadata across projects by supporting per-file-type header formats, customizable fields, inclusion/exclusion rules, and dry-run safety.

✨ Features

  • File header detection, insertion, and replacement
  • Supports multiple file types (Python, Makefile, Markdown, .env, ...)
  • Configurable comment styles and header fields
  • Inclusion and exclusion logic (via CLI, globs, stdin, or config)
  • Dry-run by default; safe for CI/CD integration
  • Configuration via pyproject.toml or topmark.toml
  • Shell completion for enum-based options
  • Colorized CLI output (via yachalk)
  • Python ≥3.10
  • Integrated pre-commit hooks for automated checks
  • Formatting and linting support via Makefile targets
  • CI-friendly design for safe automated use
  • Strict static typing with mypy and Pyright, using PEP 604 union syntax
  • Google-style docstrings without redundant type declarations
  • Full header removal (topmark strip)
  • Preserves original newline style (LF/CRLF/CR) and BOM
  • Idempotent updates (re-running does not change already-correct files)

🚀 Installation

git clone https://github.com/shutterfreak/topmark.git
cd topmark
make setup  # creates virtualenv, installs dependencies and tools

Or install into an existing virtualenv:

pip install -e .

Or install the latest release from PyPI:

pip install topmark

⚙️ Usage

topmark [OPTIONS] [PATHS]...

TopMark uses Click 8.2 and supports shell completions. The base command performs a dry‑run check by default and applies changes when --apply is provided.

The strip subcommand removes entire headers. Like the base command, it performs a dry run unless --apply is provided.

Logging verbosity is controlled globally:

  • -v, --verbose: Increase verbosity (can be repeated)
  • -q, --quiet: Suppress most output (overrides verbosity)

All other options, such as --stdin, --file-type, and path filters, belong to individual commands (the base check command, strip, dump-config) rather than to the tool as a whole.

Skipping and filtering helpers

These switches help keep CI output meaningful and fast:

  • --skip-compliant — hide files that are already compliant; only show items that need attention.
  • --skip-unsupported — hide unsupported file types and files matched by known types that intentionally do not support headers (e.g., strict JSON). They remain recognized but are not listed in results.

Examples

# Check Python files in the src/ directory
topmark --file-type python src/

# Use exclusion patterns, and compute relative paths from src
topmark --file-type python --exclude .venv --relative-to src src/

# Add one verbosity level to topmark, use exclusion patterns from .gitignore
topmark -v --file-type python --exclude-from .gitignore src/

# Read files from stdin, generate summary
find . -name "*.py" | topmark --file-type python --summary --stdin

# Process all files in a Git repo
git ls-files -c -o --exclude-standard | sort -u | topmark --stdin --apply

# Dump the merged configuration (after loading all applicable config layers)
topmark dump-config --file-type python --exclude .venv --exclude-from .gitignore

# Display the default configuration without any merging
topmark show-defaults

# Output a starter configuration to stdout
topmark init-config

# Show TopMark version
topmark version

# Apply changes to files in-place
topmark --apply src/

# Remove headers from files (dry-run)
topmark strip src/

# Remove headers from files and apply changes
topmark strip --apply src/

# CI-friendly summary: only show issues; ignore unsupported types
topmark --skip-compliant --skip-unsupported src/

# Apply fixes for only the changed files passed by pre-commit
topmark --apply --skip-unsupported --quiet

📐 Header placement rules

TopMark is comment-aware and places the header block according to the file type and its policy.

Pound-style files (e.g., Python, Shell, Ruby, Makefile, YAML, TOML, Dockerfile)

Rules:

  • If a shebang is present (e.g., #!/usr/bin/env python3), place the header after the shebang and ensure exactly one blank line in-between.
  • If a coding/encoding line follows the shebang (PEP 263 style), place the header after shebang and encoding line.
  • Otherwise, place the header at the top of the file.
  • Ensure one trailing blank line after the header block when the next line is not already blank.

Example (Python):

#!/usr/bin/env python3

# topmark:header:start
#
#   file         :
#   file_relpath :
#
# topmark:header:end

print("hello")
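The pound-style rules above can be sketched in a few lines of Python. This is a minimal illustration, not TopMark's actual implementation: the names `header_insert_index` and `insert_header` are hypothetical, lines are assumed to be already split without their terminators, and the encoding check uses the regular expression given in PEP 263.

```python
import re

# PEP 263 encoding declaration, e.g. "# -*- coding: utf-8 -*-"
ENCODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*[-_.a-zA-Z0-9]+")


def header_insert_index(lines: list[str]) -> int:
    """Return the index in `lines` at which a header block would be inserted."""
    index = 0
    if lines and lines[0].startswith("#!"):
        index = 1
        # Assumption: an encoding line only counts directly after the shebang.
        if len(lines) > 1 and ENCODING_RE.match(lines[1]):
            index = 2
    return index


def insert_header(lines: list[str], header: list[str]) -> list[str]:
    """Insert `header`, adding a blank line before and after it when needed."""
    index = header_insert_index(lines)
    before, after = lines[:index], lines[index:]
    block: list[str] = []
    if before:
        block.append("")  # blank line between shebang/encoding and header
    block.extend(header)
    if after and after[0].strip():
        block.append("")  # trailing blank line when the next line is not blank
    return before + block + after
```

The sketch does not collapse multiple existing blank lines after the shebang; a real implementation would need to handle that to guarantee exactly one.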

XML-style files (XML, HTML/XHTML, SVG, Vue/Svelte/Markdown via HTML comments)

Rules:

  • If present, place the header after the XML declaration and DOCTYPE, with one blank line before the header block.
  • Otherwise, place the header at the top of the file.
  • The header uses the file’s native comment syntax; for XML/HTML it’s a comment block wrapper:
<!--
topmark:header:start

  file         :
  file_relpath :

topmark:header:end
-->

<html>...</html>
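As a rough sketch of the XML-style placement rule, the following Python computes a character offset just past the XML declaration and DOCTYPE and inserts the header there. The names `xml_header_offset` and `insert_xml_header` are hypothetical and not part of TopMark; the regular expression is a simplification that does not handle a DOCTYPE with an internal subset (which contains `>`), nor a leading BOM.

```python
import re

# Optional XML declaration followed by an optional DOCTYPE.
# Whitespace is only consumed when it precedes one of the two constructs.
PROLOG_RE = re.compile(
    r"(?:\s*<\?xml\b.*?\?>)?(?:\s*<!DOCTYPE\b[^>]*>)?",
    re.DOTALL | re.IGNORECASE,
)


def xml_header_offset(text: str) -> int:
    """Return the character offset just after the XML declaration and DOCTYPE."""
    match = PROLOG_RE.match(text)
    return match.end() if match else 0


def insert_xml_header(text: str, header: str, newline: str = "\n") -> str:
    """Insert `header` after the prolog, or at the top when there is none."""
    offset = xml_header_offset(text)
    head = text[:offset]
    tail = text[offset:].lstrip("\r\n")  # simplification: drop existing line breaks
    if head:
        # One blank line between the prolog and the header block.
        return head + newline * 2 + header + newline * 2 + tail
    return header + newline * 2 + tail
```

Working with a character offset rather than a line index allows placement even when the prolog and the root element share a line.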

General guarantees

  • Newline preservation: The inserted header uses the same newline style as the file (LF/CRLF/CR).
  • BOM preservation: If a UTF‑8 BOM is present, it is preserved.
  • Idempotency: Re-running TopMark on a file with a correct header makes no changes.
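To make these guarantees concrete, here is a minimal Python sketch that works on raw bytes. It is an illustration under assumptions, not TopMark's code: `detect_newline` and `prepend_header` are hypothetical names, the newline style is taken from the first line break found, and the header is simply prepended (ignoring the placement rules described earlier).

```python
import codecs


def detect_newline(data: bytes) -> bytes:
    """Return the newline sequence of the first line break, defaulting to LF."""
    for i, byte in enumerate(data):
        if byte == 0x0D:  # CR, possibly followed by LF
            return b"\r\n" if data[i + 1 : i + 2] == b"\n" else b"\r"
        if byte == 0x0A:  # LF
            return b"\n"
    return b"\n"


def prepend_header(data: bytes, header_lines: list[str]) -> bytes:
    """Prepend a header while keeping the file's BOM and newline style."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    body = data[len(codecs.BOM_UTF8):] if has_bom else data
    newline = detect_newline(body)
    header = newline.join(line.encode("utf-8") for line in header_lines) + newline
    if body.startswith(header):
        return data  # already correct: re-running changes nothing
    return (codecs.BOM_UTF8 if has_bom else b"") + header + body
```

Keeping the BOM outside the body means the header always lands after it, so the file still starts with the same byte sequence as before.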

Common Options

The following options can be used with most commands.

| Option | Description |
| --- | --- |
| --file-type | Specify file type (python, markdown, …) |
| --relative-to | Set base path for relative header fields |
| --include | Include paths or glob patterns |
| --include-from | Read inclusion patterns from file |
| --exclude | Exclude paths or glob patterns |
| --exclude-from | Read exclusion patterns from file |
| --stdin | Read file paths from stdin |
| --apply | Actually modify files instead of dry-run |
| -v, --verbose | Increase verbosity (can be repeated) |
| -q, --quiet | Suppress most output |
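The interplay of inclusion and exclusion patterns can be pictured with a small Python sketch. This is only an approximation for illustration: `select_paths` is a hypothetical helper, it uses the standard library's `fnmatch` rules (where `*` also matches `/`), and it assumes exclusions win over inclusions. TopMark's real matching semantics, especially for patterns read from .gitignore, may differ.

```python
from fnmatch import fnmatchcase


def select_paths(paths, include=(), exclude=()):
    """Keep paths that match any include pattern (if given) and no exclude pattern."""
    selected = []
    for path in paths:
        # An empty include list means "everything is a candidate".
        if include and not any(fnmatchcase(path, pat) for pat in include):
            continue
        # Assumption: exclusion takes precedence over inclusion.
        if any(fnmatchcase(path, pat) for pat in exclude):
            continue
        selected.append(path)
    return selected


candidates = ["src/a.py", ".venv/lib/x.py", "README.md"]
print(select_paths(candidates, include=["*.py"], exclude=[".venv/*"]))
```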

Subcommands

| Command | Description |
| --- | --- |
| dump-config | Show the resolved configuration in TOML format |
| filetypes | List supported file types and their comment styles |
| strip | Remove TopMark headers from files (destructive) |
| version | Print TopMark version |
| show-defaults | Show default config (without merging) |
| init-config | Output a starter configuration file |

🧩 Supported file types

| Processor | File types (examples) |
| --- | --- |
| PoundHeaderProcessor | dockerfile, env, git-meta, ini, julia, makefile, perl, python, python-requirements, r, ruby, shell, toml, yaml |
| SlashHeaderProcessor | c, cpp, cs, go, java, javascript, kotlin, rust, swift, typescript, vscode-jsonc |
| XmlHeaderProcessor | html, markdown, svelte, svg, vue, xhtml, xml, xsl, xslt, yaml |

Some formats (e.g., strict JSON) are recognized but intentionally skipped because they lack a safe comment syntax. Use --skip-unsupported to hide them from the report; they are still left untouched.

For a complete list, please run:

topmark filetypes

How TopMark resolves file types (specificity & safety)

Several FileType definitions may match a given path. The resolver:

  • evaluates all matching file types and scores them by specificity,
  • prefers explicit filenames / tail subpaths (e.g., .vscode/settings.json) over patterns, and patterns over simple extensions,
  • breaks ties in favor of headerable types (those without skip_processing=True).

Tail subpath matching. FileType.filenames entries that contain a path separator (e.g., ".vscode/settings.json") are matched as path suffixes against path.as_posix(); plain names still match the basename only.

JSON vs JSONC. Generic json is recognized but marked skip_processing=True (no comments in strict JSON), while vscode-jsonc is a safe, narrow opt‑in that uses // headers. If you need more JSON-with-comments files, add them via a dedicated FileType or an explicit allow‑list in config.

Shebang‑aware insertion. The default insertion logic is policy‑driven and shebang‑aware (insert after #! and optional encoding line). For formats like XML that need character‑precise placement, processors provide a text‑offset path; XmlHeaderProcessor uses this and signals no line anchor.

🪝 Pre-commit integration

TopMark provides first-class pre-commit hooks. A minimal consumer configuration:

# .pre-commit-config.yaml (in a consuming repository)
repos:
  - repo: https://github.com/shutterfreak/topmark
    rev: v0.2.0   # pin to a released tag
    hooks:
      - id: topmark-check
        # Optional: limit scope to supported text types
        # files: '(\.(py|md|toml|ya?ml|sh)|(^|/)Makefile)$'

Hooks shipped by this repo:

  • topmark-check — non-destructive validation. Recommended on pre-commit / pre-push.
  • topmark-apply — destructive, requires --apply. Marked manual so it only runs when you call it explicitly.

Run the manual hook locally:

# On the whole repo
pre-commit run topmark-apply --all-files --hook-stage manual

# On specific files
pre-commit run topmark-apply --files path/to/file1 path/to/file2 --hook-stage manual

Why does the hook seem to run multiple times? Pre-commit batches filenames to avoid OS argument-length limits. You may see repeated banners (e.g., “Processing N file(s)”) as the hook runs once per batch. To run once per repo, set pass_filenames: false in the hook manifest and let TopMark discover files itself.

🛠 Configuration

You can specify one or more --config files, or rely on local fallback resolution:

  • topmark.toml in the working directory
  • pyproject.toml using the [tool.topmark] table

TopMark reads configuration from one or more TOML files. Layers are merged in the following order, with later layers overriding earlier ones:

  1. Built-in defaults
  2. Local project config (if not disabled via --no-config)
  3. Additional files via --config
  4. CLI overrides
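The layering can be illustrated with a small recursive merge over plain dictionaries. This is a sketch under assumptions rather than TopMark's merge logic: `merge_layers` is a hypothetical helper, nested tables are merged key by key, and any other value (including lists) from a later layer replaces the earlier one outright.

```python
from typing import Any


def merge_layers(*layers: dict[str, Any]) -> dict[str, Any]:
    """Merge config layers left to right; later layers override earlier ones."""
    merged: dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are tables: merge them key by key.
                merged[key] = merge_layers(merged[key], value)
            else:
                # Assumption: scalars and lists are replaced, not combined.
                merged[key] = value
    return merged


defaults = {"formatting": {"align_fields": True, "raw_header": False}, "files": {"relative_to": "."}}
project = {"fields": {"license": "MIT"}, "files": {"exclude_from": [".gitignore"]}}
cli = {"files": {"relative_to": "src"}}
config = merge_layers(defaults, project, cli)
```

Here `config["files"]` ends up with `relative_to` taken from the CLI layer and `exclude_from` taken from the project layer, while `defaults` itself is left unmodified.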

Example configuration snippet (topmark.toml):

[fields]
project = "TopMark"
license = "MIT"
copyright = "(c) 2025 Olivier Biot"

[header]
fields = [ "file", "file_relpath", "project", "license", "copyright" ]

[formatting]
align_fields = true
raw_header = false

[files]
file_types = [ "python", "markdown", "env" ]
include = []
include_from = []
exclude = []
exclude_from = [ ".gitignore" ]
relative_to = "."

Notes

  • formatting.align_fields = true vertically aligns the field names within the rendered header lines for readability.
  • File-type specific behavior (shebang handling, XML prolog, blank line policies) is driven by internal FileTypeHeaderPolicy defaults and can be extended to new types.
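The effect of align_fields can be shown with a short rendering sketch. The function `render_fields` is hypothetical, and the three-space indent and ` : ` separator are assumptions read off the header examples earlier in this document, not a statement about how TopMark renders headers internally.

```python
def render_fields(fields: dict[str, str], align: bool = True, prefix: str = "#") -> list[str]:
    """Render header field lines, optionally padding names to a common width."""
    width = max((len(name) for name in fields), default=0) if align else 0
    lines = []
    for name, value in fields.items():
        # rstrip() avoids trailing whitespace when a value is empty.
        lines.append(f"{prefix}   {name.ljust(width)} : {value}".rstrip())
    return lines


for line in render_fields({"file": "", "file_relpath": ""}):
    print(line)
# prints:
# #   file         :
# #   file_relpath :
```

With `align=False`, each name is followed directly by the separator, so the colons no longer line up.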

The EnumParam class enables shell completion for enum-based CLI options.

🧪 Development Setup

For development setup and contribution guidelines, see CONTRIBUTING.md.

To verify compatibility across supported Python versions (3.10–3.13), use tox to run tests and type checks in each environment.

📄 License

MIT License © 2025 Olivier Biot

Markdown formatting is handled by mdformat with the mdformat-tables plugin, and configuration is read from pyproject.toml.
