Scan a project directory and collect per-filetype statistics into a markdown info file.

mamba-scan

Scan your project once, get per-filetype stats in a markdown file, move on.


The black mamba is one of the fastest snakes on earth. That's the idea here: install, run once, uninstall.
No config files, no database, no background process. Just a snake that bites once and leaves.


The Problem

At some point in a project you want to know what you actually have. How many Python files, how many lines,
how much of it is documentation versus actual code. The answer is usually "open a terminal and start counting
by hand" or "find a tool that requires three config files and a GitHub token to get started."

I wanted something I could drop into any project for thirty seconds, run once, and have a readable result
sitting in the project root. Not a dashboard, not a CI integration, not a persistent dependency.
Just a number, written to a file, done.

mamba-scan does exactly that. One import, one call, one markdown block appended to a file in your project root.
Then you uninstall it and forget it existed.


Features

  • One-call API - count_project() with no required arguments covers the most common case immediately
  • Per-filetype stats - lines, chars, doc chars, code chars, and type-specific counts broken out per extension
  • Multi-language support - built-in analyzers for Python, HTML/Jinja2, JavaScript/TypeScript, CSS/SCSS,
    Java, C, C++, and C#
  • Append-only output - results are appended with a rounded timestamp so repeated runs don't overwrite history
  • Flexible scoping - restrict or exclude extensions and directories without touching any config file
  • Extensible - subclass FileAnalyzer, set extensions, implement analyze() - registration is automatic
  • Unmatched file reporting - files with no specific analyzer are listed once at the end so nothing is silently missed

Installation

pip install mamba-scan

Or directly from the repository:

pip install git+https://github.com/soss-community/mamba-scan.git

Usage

Quickstart

from mamba_scan import count_project

count_project()

That's it. mamba-scan finds the project root automatically, scans all default file types,
and appends a stats block to project_info.md in the project root.
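The README does not say how the project root is located. One common approach, shown here purely as an assumption, is to walk upward from a starting directory until a marker file or folder appears. The function name `find_project_root` and the marker list are hypothetical and are not part of mamba-scan's API.

```python
from pathlib import Path

# Hypothetical marker names; mamba-scan's real detection rules are not documented here.
ROOT_MARKERS = (".git", "pyproject.toml", "setup.py")


def find_project_root(start: Path, markers: tuple[str, ...] = ROOT_MARKERS) -> Path:
    """Walk upward from the directory `start`; return the first directory holding a marker."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / name).exists() for name in markers):
            return candidate
    # No marker found anywhere above: fall back to the starting directory.
    return start
```

Calling `find_project_root(Path.cwd())` from a nested source folder would return the nearest ancestor that contains one of the markers.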

Filtering what gets scanned

# Only these extensions
count_project(scanned_extensions=[".py", ".html", ".css"])

# Everything default except these
count_project(ignored_extensions=[".md", ".json"])

# Only files inside these directories
count_project(only_dirs=["src", "app"])

# Default ignored directories, plus these on top
count_project(ignored_dirs=["migrations", "legacy"])

# Custom output file
count_project(output_filename="my_stats.md")

scanned_extensions and ignored_extensions are mutually exclusive.
only_dirs and ignored_dirs are mutually exclusive.
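A rough sketch of how such a mutually exclusive pair could be resolved into one effective set. The helper `resolve_extensions`, the default set, and the choice of `ValueError` are assumptions made for illustration; the README does not state which exception mamba-scan raises when both arguments are passed.

```python
from typing import Iterable, Optional

# Hypothetical default set, for illustration only.
DEFAULT_EXTENSIONS = frozenset({".py", ".html", ".css", ".js", ".md", ".json"})


def resolve_extensions(
    scanned_extensions: Optional[Iterable[str]] = None,
    ignored_extensions: Optional[Iterable[str]] = None,
) -> set[str]:
    """Return the effective extension set, rejecting contradictory arguments."""
    if scanned_extensions is not None and ignored_extensions is not None:
        raise ValueError(
            "scanned_extensions and ignored_extensions are mutually exclusive"
        )
    if scanned_extensions is not None:
        # Allow-list: only these extensions are scanned.
        return {ext.lower() for ext in scanned_extensions}
    # Deny-list: start from the defaults and subtract the ignored ones.
    ignored = {ext.lower() for ext in (ignored_extensions or ())}
    return set(DEFAULT_EXTENSIONS) - ignored
```

The same pattern would apply to `only_dirs` and `ignored_dirs`.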


Built-in Analyzers

Each analyzer handles one or more extensions and tracks type-specific stats on top of the shared base stats.

| Analyzer | Extensions | Extra stats |
|---|---|---|
| PythonAnalyzer | .py, .pyw | classes, functions, docstrings, inline comments |
| HtmlAnalyzer | .html, .htm, .jinja, .jinja2, .j2 | html comments, jinja2 comments |
| JsTsAnalyzer | .js, .ts, .jsx, .tsx | jsdoc blocks, multiline comments, inline comments |
| CssAnalyzer | .css, .scss, .sass | block comments |
| JavaAnalyzer | .java | classes, methods, javadoc blocks, multiline comments, inline comments |
| CAnalyzer | .c, .h | functions, block comments, inline comments |
| CppAnalyzer | .cpp, .cc, .cxx, .hpp, .hxx | classes, functions, block comments, inline comments |
| CSharpAnalyzer | .cs | classes, methods, xml doc lines, block comments, inline comments |
| GenericAnalyzer | fallback for everything else | basic line and char counts only |

All analyzers share the base stats: total_lines, empty_lines, total_chars, doc_chars, code_chars.
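To make the five shared keys concrete, here is a small sketch that builds them for a string. It follows the same arithmetic as the LuaAnalyzer example in Custom Analyzers (code chars are total chars minus doc chars); the helper name `base_stats` is hypothetical and not part of the package.

```python
def base_stats(content: str, doc_chars: int) -> dict:
    """Build the five shared stats from file content and a precomputed doc-char count."""
    lines = content.splitlines()
    total_chars = len(content)
    return {
        "total_lines": len(lines),
        "empty_lines": sum(1 for line in lines if not line.strip()),
        "total_chars": total_chars,
        "doc_chars": doc_chars,
        "code_chars": total_chars - doc_chars,
    }


sample = "# note\n\nx = 1\n"
stats = base_stats(sample, doc_chars=len("# note"))
print(stats)
# {'total_lines': 3, 'empty_lines': 1, 'total_chars': 14, 'doc_chars': 6, 'code_chars': 8}
```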


Custom Analyzers

Subclass FileAnalyzer, set extensions as a class attribute, implement analyze() as a classmethod.
Registration happens automatically as soon as Python executes the class definition - no instantiation needed.

import re
from mamba_scan import FileAnalyzer, count_project


class LuaAnalyzer(FileAnalyzer):
    extensions = [".lua"]

    @classmethod
    def analyze(cls, content: str) -> dict:
        lines = content.splitlines()
        total_chars = len(content)
        # Single-line comments: "--" not followed by "[" (which may open a block comment)
        sl_matches = list(re.finditer(r"--(?!\[).*$", content, re.MULTILINE))
        # Block comments: --[[ ... ]], matched non-greedily across lines
        block_matches = list(re.finditer(r"--\[\[[\s\S]*?\]\]", content))
        doc_chars = sum(len(m.group(0)) for m in sl_matches + block_matches)

        return {
            "total_lines": len(lines),
            "empty_lines": sum(1 for line in lines if not line.strip()),
            "total_chars": total_chars,
            "doc_chars": doc_chars,
            "code_chars": total_chars - doc_chars,
            "inline_comments": len(sl_matches),
            "block_comments": len(block_matches),
        }


# LuaAnalyzer is registered at this point - just call count_project
count_project(scanned_extensions=[".py", ".lua"])

Registering the same extension twice raises a ValueError immediately.
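How FileAnalyzer implements this internally is not shown in the README. A plausible mechanism, sketched here as an assumption, is the `__init_subclass__` hook, which runs once for every subclass at definition time. `AnalyzerBase` and `LuaSketch` are hypothetical stand-ins, not mamba-scan classes.

```python
class AnalyzerBase:
    """Stand-in for FileAnalyzer; the name and internals are hypothetical."""

    extensions: list[str] = []
    _registry: dict[str, type] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Check for clashes first so a failing class registers nothing.
        clashes = [ext for ext in cls.extensions if ext in AnalyzerBase._registry]
        if clashes:
            raise ValueError(f"extensions already registered: {clashes}")
        for ext in cls.extensions:
            AnalyzerBase._registry[ext] = cls


class LuaSketch(AnalyzerBase):
    extensions = [".lua"]


# The subclass is in the registry without ever being instantiated.
print(AnalyzerBase._registry[".lua"].__name__)  # LuaSketch
```

Defining a second subclass with `extensions = [".lua"]` would raise the `ValueError` during the `class` statement itself, which matches the "immediately" in the sentence above.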


Output Format

Each run appends one block to the output file. The block is wrapped in separator lines
and timestamped to the nearest 5 minutes (UTC), so multiple runs stack cleanly without overwriting.

------------------------------

## Project Info Count - 2026-05-04 14:30 UTC

### Config
- Root: /home/user/my_project
- Scanned types: .css, .html, .js, .py
- Ignored dirs: .git, .venv, __pycache__, ...

### General
- Total files: 42
- File types found: .css, .html, .js, .py
- Total lines: 8310
- Empty lines: 1204
- Total chars: 241500
- Doc chars: 38200
- Code chars: 203300

### Python (.py)
- Files: 18
- Lines: 4200
- ...

------------------------------
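The 5-minute rounding in the block header can be sketched with integer arithmetic on the Unix timestamp. The function name `rounded_utc_stamp` and the tie-breaking rule (halves round up) are assumptions; the README only says "nearest 5 minutes (UTC)".

```python
from datetime import datetime, timezone

STEP_SECONDS = 5 * 60


def rounded_utc_stamp(moment: datetime) -> str:
    """Format a timezone-aware `moment` in UTC, rounded to the nearest 5 minutes."""
    seconds = int(moment.timestamp())
    # Add half a step, then floor to a multiple of the step (halves round up).
    rounded = (seconds + STEP_SECONDS // 2) // STEP_SECONDS * STEP_SECONDS
    return datetime.fromtimestamp(rounded, tz=timezone.utc).strftime("%Y-%m-%d %H:%M UTC")


print(rounded_utc_stamp(datetime(2026, 5, 4, 14, 32, 10, tzinfo=timezone.utc)))
# 2026-05-04 14:30 UTC
```

One consequence of rounding: two runs within the same 5-minute window can produce blocks with identical headers.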

What this is not for and what we excluded

| Feature | Reason |
|---|---|
| Continuous monitoring | mamba-scan is a one-shot tool, not a background service |
| Git integration | Diff stats, blame, or history are out of scope |
| Code quality / linting | No AST analysis, no style checks, no complexity metrics |
| Test coverage | Use coverage.py for that |
| Precise AST-based counting | Regex-based heuristics are good enough for an overview, not for hard analysis |
| Network or remote scanning | Local filesystem only |
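To illustrate why regex heuristics give an overview rather than an exact count, the sketch below compares a naive comment regex with Python's standard `tokenize` module on one line of code. The regex is a hypothetical example and is not necessarily the pattern PythonAnalyzer uses.

```python
import io
import re
import tokenize

source = 'url = "http://example.com/#top"  # real comment\n'

# Regex heuristic: everything from the first "#" to end of line counts as a comment,
# including the "#top" fragment that sits inside the string literal.
regex_chars = sum(len(m.group(0)) for m in re.finditer(r"#.*$", source, re.MULTILINE))

# Token-based count: only genuine COMMENT tokens are included.
token_chars = sum(
    len(tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.COMMENT
)

print(regex_chars, token_chars)  # 21 14
```

The regex over-counts by the seven characters between the `#` in the URL and the real comment, which is the kind of small drift the table accepts in exchange for speed and zero dependencies.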

Compatibility

Python 3.10 or higher. No external dependencies. Works on Windows, macOS, and Linux.


License

MIT.
Always refer to the latest version at: github.com/soss-community


Community and contributing

Part of the Sora Open Source Software community.
Visit github.com/soss-community
for contribution guidelines, issue tracking, and discussion.

Author: sora7672
Organization: soss-community
Website: soss.page

