Production-grade code normalization tool for encoding, newlines, and whitespace hygiene.

Project description

CODE - Code Normalization Tool

Python CLI that cleans up source code encoding, line endings, and whitespace across entire codebases -- with parallel processing, SHA256 caching, and pre-commit hook support.

  • Location: C:\Dev\PROJECTS\CODE
  • Status: v3.0 code complete. Package stub ready. No pyproject.toml = blocked from PyPI.
  • Updated: 2026-03-10

What It Does

Run it against any directory and it will:

  1. Detect and convert file encoding to UTF-8 (handles utf-8, utf-8-sig, utf-16, utf-16-le, utf-16-be, windows-1252, latin-1, iso-8859-1)
  2. Fix line endings -- CRLF to LF
  3. Strip trailing whitespace from every line
  4. Ensure a single newline at end of file
  5. Optionally validate syntax for Python, JS, TS, Go, Rust, C, C++, Java
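
Steps 1 through 4 can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function names and the fallback order in FALLBACK_ENCODINGS are assumptions, and returning an empty string for an empty file is a design choice made here for simplicity.

```python
import codecs

# Hypothetical fallback order; the real tool's detection logic may differ.
FALLBACK_ENCODINGS = ("utf-8", "windows-1252", "latin-1")


def decode_bytes(raw: bytes) -> str:
    """Decode raw file bytes to str, checking byte-order marks first."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw.decode("utf-8-sig")  # drops the BOM
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")  # the BOM selects the byte order and is dropped
    for encoding in FALLBACK_ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("unreachable: latin-1 accepts every byte sequence")


def normalize_text(text: str) -> str:
    """Steps 2-4: LF line endings, no trailing whitespace, one final newline."""
    text = text.replace("\r\n", "\n")
    lines = [line.rstrip() for line in text.split("\n")]
    # Drop blank lines at the end so exactly one newline terminates the file.
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""


def normalize_bytes(raw: bytes) -> bytes:
    """Full pipeline: decode, normalize, re-encode as UTF-8."""
    return normalize_text(decode_bytes(raw)).encode("utf-8")
```

Checking for a byte-order mark before trying the fallbacks matters because latin-1 accepts any byte sequence, so it has to come last.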

Files already clean are skipped. SHA256 caching means repeat runs on unchanged files are near-instant. Multi-core parallel mode handles large codebases at 80-200 files/sec.
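
The caching idea can be illustrated as follows. This is a sketch under assumptions: HashCache and its methods are hypothetical names, and the JSON layout (absolute path mapped to hex digest) is chosen here for illustration and may not match the tool's cache file.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class HashCache:
    """Remembers the digest each file had when it was last found clean."""

    def __init__(self, cache_file: Path) -> None:
        self.cache_file = cache_file
        self.entries: dict[str, str] = {}
        if cache_file.exists():
            self.entries = json.loads(cache_file.read_text(encoding="utf-8"))

    def is_unchanged(self, path: Path) -> bool:
        """True if the file's current digest matches the recorded one."""
        return self.entries.get(str(path.resolve())) == sha256_of(path)

    def record(self, path: Path) -> None:
        """Store the file's current digest."""
        self.entries[str(path.resolve())] = sha256_of(path)

    def save(self) -> None:
        """Write all entries back to the cache file as JSON."""
        self.cache_file.write_text(json.dumps(self.entries, indent=2), encoding="utf-8")
```

A run would call is_unchanged() before doing any work on a file, then record() after the file is confirmed clean, and save() once at the end.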


Quick Start

Set-Location C:\Dev\PROJECTS\CODE

# See what would change without touching anything
python main.py C:\path\to\project --dry-run

# Normalize everything in-place using all CPU cores
python main.py C:\path\to\project --parallel --in-place

# Normalize only Python and JavaScript files
python main.py C:\path\to\project -e .py -e .js --in-place

# Review and approve each file before it's written
python main.py C:\path\to\project --interactive

# Run syntax validation after normalizing
python main.py C:\path\to\project --in-place --check

# Install a pre-commit hook into a git repo
cd C:\your-repo
python C:\Dev\PROJECTS\CODE\main.py --install-hook

main.py at root is a thin wrapper that delegates to src/code_normalize_pro.py. Call either one -- same result.
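
One way a thin wrapper like this could delegate is with the standard-library runpy module. The sketch below is an assumption about the approach, not a copy of main.py, and the delegate() helper is a hypothetical name.

```python
import runpy
import sys
from pathlib import Path


def delegate(script: Path, argv: list[str]) -> dict:
    """Run `script` as if it had been invoked directly with `argv`."""
    saved = sys.argv
    sys.argv = [str(script), *argv]
    try:
        # run_path executes the file and returns its resulting globals.
        return runpy.run_path(str(script), run_name="__main__")
    finally:
        sys.argv = saved
```

A wrapper at the repo root would then call something like delegate(Path(__file__).parent / "src" / "code_normalize_pro.py", sys.argv[1:]).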


Pre-Commit Hook

The hook checks only staged files before each commit. If any need normalization, it blocks the commit and prints the fix command.

# One-time install per repo
cd C:\your-repo
python C:\Dev\PROJECTS\CODE\main.py --install-hook

# Commit as normal -- hook fires automatically
git commit -m "your message"

# Skip hook for one commit
git commit --no-verify -m "your message"
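
A hook of this kind could be sketched roughly as below. The function names and the printed message are assumptions. For brevity the sketch reads working-tree files rather than the staged blobs, and it does not filter by extension, so a real hook would need both refinements.

```python
import subprocess
from pathlib import Path


def staged_files() -> list[Path]:
    """Paths that are added, copied, or modified in the git index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [Path(line) for line in result.stdout.splitlines() if line]


def needs_normalization(raw: bytes) -> bool:
    """True if the bytes are not UTF-8, use CRLF, have trailing
    whitespace, or do not end with exactly one newline."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return True
    if b"\r\n" in raw:
        return True
    if any(line != line.rstrip() for line in raw.split(b"\n")):
        return True
    if raw and (not raw.endswith(b"\n") or raw.endswith(b"\n\n")):
        return True
    return False


def main() -> int:
    """Return 1 (block the commit) if any staged file needs normalization."""
    dirty = [p for p in staged_files()
             if p.is_file() and needs_normalization(p.read_bytes())]
    if dirty:
        print("Files need normalization:")
        for path in dirty:
            print(f"  {path}")
        print("Fix with: python main.py <path> --in-place")
        return 1
    return 0
```

Git treats a non-zero exit status from the pre-commit script as a request to abort the commit, so the script would end with sys.exit(main()).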

Performance

Files   Sequential   Parallel (4 cores)   Speedup
100     3.2s         1.1s                 2.9x
500     16.8s        4.3s                 3.9x
1000    33.5s        7.1s                 4.7x

8 cores: 150-200 files/sec. SHA256 cache on unchanged files: 500-1000 files/sec. Workers default to CPU count. Override with --workers N.


Testing

Set-Location C:\Dev\PROJECTS\CODE
.\.venv\Scripts\Activate.ps1

python -m pytest -q
python main.py --help

Test files in tests/ cover the main tool plus all four launch/sales scripts. All 5 features tested on 2026-02-09 (see docs/TEST_REPORT.md). Manual confirmation of interactive mode still pending.


Project Layout

CODE/
  main.py                          -- Root entrypoint. Delegates to src/code_normalize_pro.py
  src/
    code_normalize_pro.py          -- v3.0 Pro. 917 lines. The active tool.
    code_normalize_v2.py           -- v2.0. Kept for reference.
  code_normalizer_pro/             -- PyPI package stub
    __init__.py                    -- Exposes __version__ = "3.0.1"
    cli.py                         -- Console entry point (calls src/code_normalize_pro.py)
    README.md
  config/
    settings.py                    -- Env-var settings loader (not wired up yet)
  docs/
    README.md                      -- Full feature reference docs
    TEST_REPORT.md                 -- Test results from 2026-02-09
    ARCHITECTURE.md                -- Stub
    launch/                        -- Outreach templates, user tracking CSV, metrics JSON
    sales/                         -- Pricing, pipeline CSV, customer offer template
    release/
      alpha_release_checklist.md   -- Step-by-step PyPI publish checklist
      release_readiness.json       -- Says ready=true, wheel+sdist listed
  roadmaps/
    README.md                      -- Overview of all 6 paths
    01_solo_dev_tool.md            -- CHOSEN: bootstrap to PyPI
    02_dev_tool_saas.md
    03_enterprise_platform.md
    04_open_source_support.md
    05_grammarly_for_code.md
    06_ai_transformation_engine.md
  scripts/
    launch_metrics.py
    feedback_prioritizer.py
    sales_pipeline_metrics.py
    release_prep.py
  tests/
    test_code_normalize_pro.py
    test_feedback_prioritizer.py
    test_launch_metrics.py
    test_release_prep.py
    test_sales_pipeline_metrics.py
  site/
    index.html                     -- Static landing page
    styles.css
  .github/
    workflows/ci.yml               -- CI: install, smoke check, pytest, build
    ISSUE_TEMPLATE/
    pull_request_template.md
  files/
    cache_sandbox/                 -- Test fixtures (a.py, b.py)
    smoke_case.py
  EXECUTION_PLAN.md                -- 7-day launch plan (all tasks pending)
  VERIFY.md                        -- Verification runbook
  MISSINGMORE.txt                  -- Gap tracking
  QUICK_REFERENCE.md               -- Command cheat sheet
  CHANGELOG.md                     -- Stub (unreleased only)

Dependencies

Core: zero. Python 3.10+ only.

Optional:

  • tqdm -- progress bars
  • Syntax checkers (only needed with --check): Python: built-in (py_compile) | JS: node | TS: tsc | Go: gofmt | Rust: rustc | C: gcc | C++: g++ | Java: javac
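
As an illustration of the checker list, the Python check and a tool-availability probe might look like the sketch below. The tool names come from the list above. The file extensions in EXTERNAL_CHECKERS and both function names are assumptions.

```python
import py_compile
import shutil
import tempfile
from pathlib import Path

# Extension -> external tool. Tool names are from the list above;
# the extensions themselves are assumed for this sketch.
EXTERNAL_CHECKERS = {
    ".js": "node", ".ts": "tsc", ".go": "gofmt", ".rs": "rustc",
    ".c": "gcc", ".cpp": "g++", ".java": "javac",
}


def python_syntax_ok(path: Path) -> bool:
    """Compile a Python file with py_compile; False on a syntax error."""
    with tempfile.TemporaryDirectory() as scratch:
        try:
            # Write bytecode to a scratch dir so no __pycache__ is left behind.
            py_compile.compile(
                str(path), cfile=str(Path(scratch) / "out.pyc"), doraise=True
            )
        except py_compile.PyCompileError:
            return False
    return True


def checker_available(extension: str) -> bool:
    """True if a checker for this extension can run on this machine."""
    if extension == ".py":
        return True  # built in, no external tool needed
    tool = EXTERNAL_CHECKERS.get(extension)
    return tool is not None and shutil.which(tool) is not None
```

Probing with shutil.which lets a run skip validation for a language whose toolchain is not installed instead of failing outright.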

Dev/test: pytest (see requirements.txt)


Known Issues (fix before PyPI launch)

Critical -- blocks shipping:

  1. No pyproject.toml -- CI runs python -m build which will fail without it. The code_normalizer_pro.egg-info/ dir shows packaging was attempted but no config file exists in the tree. Create pyproject.toml with src layout and console_scripts entry point before running Day 1 tasks.

  2. code_normalizer_pro/cli.py has a broken import: from code_normalize_pro import main. After pip install, Python looks for a module named code_normalize_pro in site-packages, not in src/. Without a proper src layout in pyproject.toml, the installed CLI command will fail on launch.

Code bugs worth fixing:

  1. Cache default is on in __init__ but --cache flag implies opt-in and --no-cache implies opt-out. The flags and the default contradict each other. Pick one direction and make the help text match.

  2. --parallel --in-place silently disables backups. process_file_worker passes create_backup=False but backup logic only lives inside process_file. Users running parallel mode have no backups. Either warn loudly or fix it.

  3. walk_and_process and process_file both increment total_files for the same files. Summary stats will show inflated counts.

  4. .normalize-cache.json lands in CWD, not the target directory. Running the tool against three different projects from the same shell session corrupts the cache. Pass root / CACHE_FILE to CacheManager in walk_and_process.

  5. --dry-run always exits 0 even when it finds files needing normalization. CI pipelines need a non-zero exit to catch violations. Add --fail-on-changes or make dry-run exit 1 when changes are detected.
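
Issues 4 and 5 above lend themselves to small, testable helpers. The sketch below shows one possible shape for each fix. The helper names are hypothetical, and --fail-on-changes is the flag proposed in issue 5, not an existing option.

```python
from pathlib import Path

CACHE_FILE = ".normalize-cache.json"  # file name taken from issue 4


def cache_path_for(root: Path) -> Path:
    """Issue 4: keep the cache beside the project being scanned, not in CWD."""
    return root.resolve() / CACHE_FILE


def exit_code(dry_run: bool, files_needing_changes: int, fail_on_changes: bool) -> int:
    """Issue 5: a dry run that finds work returns 1 when the flag is set."""
    if dry_run and fail_on_changes and files_needing_changes > 0:
        return 1
    return 0
```

Keeping the exit-code decision in a pure function makes it easy to cover every flag combination in tests/ without running the CLI.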

Cleanup:

  1. code_normalize_pro.py at root -- stale copy. The real file is src/code_normalize_pro.py. Delete the root copy.
  2. roadmaps/New Text Document.txt -- empty temp file. Delete it.
  3. roadmaps/talking about code.txt -- saved AI chat session. Delete or move to docs/.
  4. All README_20260220_*.md.bak files throughout the tree -- ReadmeForge backups.
  5. config/settings.py is a clean env-var loader but nothing imports it. Either wire it into code_normalize_pro.py or remove it.
  6. README_PRO.md at root duplicates docs/README.md. Consolidate.
  7. restore_report.json and smoke_report.json at root -- generated artifacts, add to .gitignore.
  8. PROJECT_STATUS.md says roadmap docs are "coming soon" -- all 6 exist. Stale.

Launch Status (Path 1 - Solo Dev Tool)

EXECUTION_PLAN.md has a 7-day checklist. As of 2026-03-10, nothing started.

Before Day 1 tasks will work, pyproject.toml needs to exist (see critical issue 1 above).

Day 1 after pyproject.toml is in place:

Set-Location C:\Dev\PROJECTS\CODE
.\.venv\Scripts\Activate.ps1
python -m pytest -q
pip install -e .
code-normalizer-pro --help

Full release steps: see docs/release/alpha_release_checklist.md


CI

.github/workflows/ci.yml runs on push to main/master and on PRs:

  • Python 3.11
  • pip install from requirements.txt
  • CLI smoke check (main.py and src/code_normalize_pro.py --help)
  • pytest -q
  • python -m build (sdist + wheel)

Note: python -m build requires pyproject.toml. CI will fail until that exists.


Version History

Version   Date         Changes
v3.0      2026-02-09   Parallel processing, SHA256 caching, pre-commit hooks, multi-language syntax, interactive mode
v2.0      2026-02-09   Dry-run, in-place editing, backups, tqdm, detailed stats
v1.0      --           Basic encoding fix, CRLF, whitespace

Package version: 3.0.1 (set in code_normalizer_pro/__init__.py)


Developer: MR (Michael Rawls Jr.) -- Houston, TX -- GitHub: MRJR0101
