AI-ready code-base scanner that outputs Markdown or XML.

scanc

scanc = scan c(ode)
A fast, pure‑Python project code‑scanner that outputs clean, AI‑ready Markdown or XML.

scanc helps you spill an entire codebase into an LLM prompt (or a file) in seconds—while keeping noise low, controlling token budgets, and giving you full visibility.


Features

| Feature | Description |
| --- | --- |
| Blazing Fast, Pure‑Python | Zero native dependencies; easy to install and run anywhere. |
| Smart Default Ignores | Automatically skips node_modules, .venv, .git, and more. |
| Flexible Filters | Include/exclude by extension, filename, or regex patterns. |
| Optional Directory Tree | Prepend a fenced tree diagram of your project structure. |
| Token Counter | Estimate LLM token costs with tiktoken before you paste. |
| Cross‑Platform CLI | Works on macOS, Linux, and Windows out of the box. |
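
To make the default-ignore and extension-filter behaviour concrete, here is a minimal sketch of how such a scan could be written in plain Python. It is not scanc's actual implementation: the function name `iter_source_files` and the contents of `DEFAULT_IGNORED_DIRS` are assumptions for illustration, seeded only with the three directories named above.

```python
import os
from pathlib import Path

# Assumed ignore list for illustration; scanc's real defaults may differ.
DEFAULT_IGNORED_DIRS = {"node_modules", ".venv", ".git"}


def iter_source_files(root, extensions, ignored_dirs=DEFAULT_IGNORED_DIRS):
    """Yield files under `root` whose extension is listed in `extensions`."""
    # Normalise "py" and ".py" to the same form: ".py"
    wanted = {"." + ext.lstrip(".").lower() for ext in extensions}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into ignored directories.
        dirnames[:] = sorted(d for d in dirnames if d not in ignored_dirs)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if path.suffix.lower() in wanted:
                yield path
```

Pruning `dirnames` in place is what keeps large folders such as node_modules cheap to skip, since the walk never enters them at all.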

Installation

# Optional: use a virtual environment
python3 -m venv --prompt scanc-env .venv
source .venv/bin/activate

pip install "scanc[tiktoken]"  # installs optional token‑counter support (quotes keep zsh from globbing the brackets)
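
For a sense of what the optional token counter involves, the sketch below counts tokens with tiktoken directly. It illustrates the general technique and is not scanc's own code: the helper name `count_tokens`, its `encode` parameter, and the `cl100k_base` fallback are assumptions. Resolving `gpt-4o` needs a reasonably recent tiktoken release.

```python
def count_tokens(text, model="gpt-4o", encode=None):
    """Return the number of tokens in `text`.

    `encode` is any callable mapping a string to a sequence of tokens;
    when omitted, tiktoken is imported and used for `model`.
    """
    if encode is None:
        import tiktoken  # optional dependency, installed by the tiktoken extra

        try:
            encoding = tiktoken.encoding_for_model(model)
        except KeyError:
            # Unknown model name: fall back to a general-purpose encoding.
            encoding = tiktoken.get_encoding("cl100k_base")

        def encode(s):
            # Treat text such as "<|endoftext|>" as ordinary text, not an error.
            return encoding.encode(s, disallowed_special=())

    return len(encode(text))
```

Passing your own `encode` callable (for example `str.split` for a rough word count) lets the function run without tiktoken installed.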

Quickstart

Scan a directory and emit Markdown:

scanc .                         # scan current folder
scanc -e py,js --tree           # only .py and .js files + directory tree
scanc -f xml                    # emit XML instead of Markdown (new in v1.2.0)
scanc -e py -x "tests" | less   # only .py files, excluding paths that match "tests"
scanc --tokens gpt-4o           # print only the token count for gpt-4o
scanc -e py | pbcopy            # scan and copy (macOS copy command example)

Write output directly to a file:

scanc -e ts --tree -o scan.md src/
cat scan.md
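
If you want to drive the same commands from a script, a thin wrapper around `subprocess` is usually enough. The sketch below only uses flags listed in this README (`-e`, `--tree`, `-f`, `-o`); the helper names `build_scanc_command` and `run_scanc` are hypothetical, and it assumes `scanc` is on your PATH.

```python
import subprocess


def build_scanc_command(paths, extensions=None, tree=False, out=None, fmt=None):
    """Build the argument list for a scanc invocation."""
    cmd = ["scanc"]
    if extensions:
        cmd += ["-e", ",".join(extensions)]  # e.g. -e py,js
    if tree:
        cmd.append("--tree")
    if fmt:
        cmd += ["-f", fmt]
    if out:
        cmd += ["-o", str(out)]
    cmd += [str(p) for p in paths]
    return cmd


def run_scanc(paths, **options):
    """Run scanc and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        build_scanc_command(paths, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `build_scanc_command(["src/"], extensions=["ts"], tree=True, out="scan.md")` reproduces the file-output command shown above.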

CLI Reference

scanc [OPTIONS] [PATHS...]
  • -e, --ext EXTS: Comma‑separated extensions to include (e.g. py,js).
  • -i, --include-regex: Regex patterns to include (full path match).
  • -x, --exclude-regex: Regex patterns to exclude (full path match).
  • --no-default-excludes: Disable built‑in ignore list.
  • -t, --tree: Prepend directory tree (fenced code block).
  • -T, --tokens MODEL: Output only token count for given LLM model.
  • --max-size BYTES: Skip files larger than BYTES (default 1 MiB).
  • --follow-symlinks: Traverse symlinks when scanning.
  • -o, --out OUTFILE: Write result to OUTFILE instead of stdout.
  • -f, --format FORMAT: Output format (default: markdown).
  • -V, --version: Show version and exit.
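
The include/exclude and size options combine into a per-file decision. The sketch below shows one plausible way to express that logic; it is an assumption, not scanc's source. In particular it assumes patterns are searched anywhere in the full path (as the `-x "tests"` quickstart example suggests) and that an exclude match wins over an include match. The name `should_include` is hypothetical.

```python
import re

ONE_MIB = 1024 * 1024  # the documented default for --max-size


def should_include(path, size, include=(), exclude=(), max_size=ONE_MIB):
    """Decide whether a file at `path` (POSIX-style string) of `size` bytes is kept."""
    if size > max_size:
        return False  # larger than the size limit
    if any(re.search(pattern, path) for pattern in exclude):
        return False  # assumed: exclusion takes precedence
    if include and not any(re.search(pattern, path) for pattern in include):
        return False  # include patterns given, none matched
    return True
```

With these assumptions, `should_include("tests/test_app.py", 100, exclude=["tests"])` is false, while `should_include("src/app.py", 100, include=[r"\.py$"])` is true.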

Integration & Extensibility

  • Formatter Hook: Customize output by passing your own formatter via entry points.
  • Extras: Use scanc[tiktoken] to enable token counting; more extras may follow.

Docker usage

A ready-to-run container is published to GitHub Container Registry (GHCR). It runs as non-root and scans the mounted host directory by default.

Pull

docker pull ghcr.io/mqxym/scanc:latest

Scan the current project (read-only mount)

# Linux/macOS (Bash/Zsh)
docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc:latest .

# Windows PowerShell
docker run --rm -v "${PWD}:/work:ro" ghcr.io/mqxym/scanc:latest .

Because the container’s WORKDIR is /work and ENTRYPOINT is scanc, passing . scans your host’s current folder.

Write output to a file

Either redirect on the host:

docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc:latest -e py --tree . > scan.md

...or mount as writable and write into /work:

docker run --rm -v "$PWD":/work ghcr.io/mqxym/scanc:latest -e py --tree -o /work/scan.md .

Tip (Linux/macOS): preserve file ownership of the written output by mapping your UID/GID:

docker run --rm \
  --user "$(id -u)":"$(id -g)" \
  -v "$PWD":/work ghcr.io/mqxym/scanc:latest -o /work/scan.md .
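
The Docker variants above differ only in the mount mode and the optional user mapping, which makes them easy to generate from a script. A minimal sketch follows, using the image name and /work mount point from this section; the helper name `build_docker_command` and its parameters are assumptions.

```python
import os
from pathlib import Path

IMAGE = "ghcr.io/mqxym/scanc:latest"


def build_docker_command(project_dir, scanc_args, writable=False, map_user=False):
    """Build a `docker run` argument list that scans `project_dir` mounted at /work."""
    # Mount read-only unless the scan needs to write its output into /work.
    mount = f"{Path(project_dir).resolve()}:/work" + ("" if writable else ":ro")
    cmd = ["docker", "run", "--rm"]
    if map_user and hasattr(os, "getuid"):
        # Linux/macOS only: keep written files owned by the calling user.
        cmd += ["--user", f"{os.getuid()}:{os.getgid()}"]
    cmd += ["-v", mount, IMAGE, *scanc_args]
    return cmd
```

Pass the resulting list to `subprocess.run`; for instance `build_docker_command(".", ["-o", "/work/scan.md", "."], writable=True, map_user=True)` mirrors the tip above.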

Examples

# Only Python & JS files, include directory tree
docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc:latest -e py,js --tree .

# Token count only (uses the optional 'tiktoken' extra, which is baked into the image)
docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc:latest --tokens gpt-4o .

Licence

Released under the MIT Licence. See LICENCE for details.

Download files

Source Distribution

scanc-1.2.2.tar.gz (17.0 kB)

Uploaded Source

Built Distribution

scanc-1.2.2-py3-none-any.whl (11.7 kB)

Uploaded Python 3

File details

Details for the file scanc-1.2.2.tar.gz.

File metadata

  • Download URL: scanc-1.2.2.tar.gz
  • Upload date:
  • Size: 17.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for scanc-1.2.2.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | e2a8f972ae09faf3b4987e3edb4a13881e8d0e5dbe889fb9cdfcfe9416ceea66 |
| MD5 | c1996694a0835b375e7e1aa0c92f2266 |
| BLAKE2b-256 | d4016cb3459a149465717d8463d5cc6303060de0668569242049083b23afeb0d |

Provenance

The following attestation bundles were made for scanc-1.2.2.tar.gz:

Publisher: python-publish.yml on mqxym/scanc

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file scanc-1.2.2-py3-none-any.whl.

File metadata

  • Download URL: scanc-1.2.2-py3-none-any.whl
  • Upload date:
  • Size: 11.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for scanc-1.2.2-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | e501d867fc06929c42ed4cead41a86208424d5b3af0016ba4e70580377e5a225 |
| MD5 | 7fb061d19e8efc2baa42a25f2aa588da |
| BLAKE2b-256 | 16c48adb2187ba7523725321843cac963ee208e8529dd6ca42c5afd524c456ef |

Provenance

The following attestation bundles were made for scanc-1.2.2-py3-none-any.whl:

Publisher: python-publish.yml on mqxym/scanc
