Production-grade code normalization tool for encoding, newlines, and whitespace hygiene.
CODE - Code Normalization Tool
Python CLI that cleans up source code encoding, line endings, and whitespace across entire codebases -- with parallel processing, SHA256 caching, and pre-commit hook support.
- Location: C:\Dev\PROJECTS\CODE
- Status: v3.0 code complete. Package stub ready. No pyproject.toml = blocked from PyPI.
- Updated: 2026-03-10
What It Does
Run it against any directory and it will:
- Detect and convert file encoding to UTF-8 (handles utf-8, utf-8-sig, utf-16, utf-16-le, utf-16-be, windows-1252, latin-1, iso-8859-1)
- Fix line endings -- CRLF to LF
- Strip trailing whitespace from every line
- Ensure a single newline at end of file
- Optionally validate syntax for Python, JS, TS, Go, Rust, C, C++, Java
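The steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function names decode_bytes, normalize_text, and normalize_bytes are hypothetical, the encoding fallback order is an assumption, and only a subset of the listed encodings is tried.

```python
import codecs


def decode_bytes(raw: bytes) -> str:
    """Decode raw file bytes to text, trying a few encodings in order (assumed order)."""
    try:
        # utf-8-sig decodes plain UTF-8 too and drops a leading BOM if present.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        pass
    # Only trust UTF-16 when a byte-order mark is present.
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        try:
            return raw.decode("utf-16")
        except UnicodeDecodeError:
            pass
    try:
        return raw.decode("windows-1252")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so it never fails.
        return raw.decode("latin-1")


def normalize_text(text: str) -> str:
    """CRLF -> LF, strip trailing spaces/tabs, end with exactly one newline."""
    text = text.replace("\r\n", "\n")
    lines = [line.rstrip(" \t") for line in text.split("\n")]
    body = "\n".join(lines).rstrip("\n")
    # Assumption: a file with no content stays empty rather than gaining a newline.
    return body + "\n" if body else ""


def normalize_bytes(raw: bytes) -> bytes:
    """Full pipeline: decode, clean, re-encode as UTF-8."""
    return normalize_text(decode_bytes(raw)).encode("utf-8")
```

A file is already clean when normalize_bytes(raw) == raw, which is one simple way to decide whether it can be skipped.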
Files already clean are skipped. SHA256 caching means repeat runs on unchanged files are near-instant. Multi-core parallel mode handles large codebases at 80-200 files/sec.
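To illustrate why cached runs are fast, here is a rough sketch of a SHA256 cache. The class name HashCache and its JSON layout are assumptions made for this example; the tool's own cache manager may store different fields.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA256 digest of a file's current bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class HashCache:
    """Hypothetical cache: maps a file path to the digest of its last known-clean content."""

    def __init__(self, cache_file: Path) -> None:
        self.cache_file = cache_file
        self.entries: dict[str, str] = {}
        if cache_file.exists():
            self.entries = json.loads(cache_file.read_text(encoding="utf-8"))

    def is_unchanged(self, path: Path) -> bool:
        # Hashing is much cheaper than decoding and rewriting, so a hit skips the real work.
        return self.entries.get(str(path)) == sha256_of(path)

    def mark_clean(self, path: Path) -> None:
        self.entries[str(path)] = sha256_of(path)

    def save(self) -> None:
        self.cache_file.write_text(json.dumps(self.entries, indent=2), encoding="utf-8")
```

A run would call is_unchanged before processing each file, mark_clean after a file is verified or rewritten, and save once at the end.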
Quick Start
Set-Location C:\Dev\PROJECTS\CODE
# See what would change without touching anything
python main.py C:\path\to\project --dry-run
# Normalize everything in-place using all CPU cores
python main.py C:\path\to\project --parallel --in-place
# Normalize only Python and JavaScript files
python main.py C:\path\to\project -e .py -e .js --in-place
# Review and approve each file before it's written
python main.py C:\path\to\project --interactive
# Run syntax validation after normalizing
python main.py C:\path\to\project --in-place --check
# Install a pre-commit hook into a git repo
cd C:\your-repo
python C:\Dev\PROJECTS\CODE\main.py --install-hook
main.py at root is a thin wrapper that delegates to src/code_normalize_pro.py.
Call either one -- same result.
Pre-Commit Hook
The hook checks only staged files before each commit. If any of them need normalization, it blocks the commit and prints the command that fixes them.
# One-time install per repo
cd C:\your-repo
python C:\Dev\PROJECTS\CODE\main.py --install-hook
# Commit as normal -- hook fires automatically
git commit -m "your message"
# Skip hook for one commit
git commit --no-verify -m "your message"
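For readers curious how a hook like this can work, the sketch below lists staged files with git and returns a non-zero exit code when any of them look unnormalized. It is an assumption-laden illustration: the function names are hypothetical, the check is simplified (line endings and whitespace only, no encoding detection), and it reads the working-tree copy of each file rather than the staged blob.

```python
import subprocess
import sys
from pathlib import Path


def staged_files() -> list[Path]:
    """Paths that are added, copied, or modified in the git index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM", "-z"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [Path(p) for p in out.split("\0") if p]


def needs_normalization(raw: bytes) -> bool:
    """Simplified check: CRLF, trailing spaces/tabs, or a missing/extra final newline."""
    if not raw:
        return False
    if b"\r\n" in raw:
        return True
    if not raw.endswith(b"\n") or raw.endswith(b"\n\n"):
        return True
    return any(line != line.rstrip(b" \t") for line in raw.split(b"\n"))


def check_paths(paths: list[Path]) -> int:
    """Return 1 (block the commit) if any path needs normalization, else 0."""
    dirty = [p for p in paths if p.is_file() and needs_normalization(p.read_bytes())]
    for p in dirty:
        print(f"needs normalization: {p}", file=sys.stderr)
    if dirty:
        print("Run the normalizer with --in-place, then re-stage.", file=sys.stderr)
    return 1 if dirty else 0


# In a hook script this would be invoked as:
#     sys.exit(check_paths(staged_files()))
```

Git aborts the commit whenever the pre-commit script exits non-zero, which is what makes the blocking behavior possible.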
Performance
| Files | Sequential | Parallel 4-core | Speedup |
|---|---|---|---|
| 100 | 3.2s | 1.1s | 2.9x |
| 500 | 16.8s | 4.3s | 3.9x |
| 1000 | 33.5s | 7.1s | 4.7x |
8 cores: 150-200 files/sec. SHA256 cache on unchanged files: 500-1000 files/sec.
Workers default to CPU count. Override with --workers N.
Testing
Set-Location C:\Dev\PROJECTS\CODE
.\.venv\Scripts\Activate.ps1
python -m pytest -q
python main.py --help
Test files in tests/ cover the main tool plus all four launch/sales scripts.
All 5 features tested on 2026-02-09 (see docs/TEST_REPORT.md). Manual confirmation
of interactive mode still pending.
Project Layout
CODE/
    main.py -- Root entrypoint. Delegates to src/code_normalize_pro.py
    src/
        code_normalize_pro.py -- v3.0 Pro. 917 lines. The active tool.
        code_normalize_v2.py -- v2.0. Kept for reference.
    code_normalizer_pro/ -- PyPI package stub
        __init__.py -- Exposes __version__ = "3.0.1"
        cli.py -- Console entry point (calls src/code_normalize_pro.py)
        README.md
    config/
        settings.py -- Env-var settings loader (not wired up yet)
    docs/
        README.md -- Full feature reference docs
        TEST_REPORT.md -- Test results from 2026-02-09
        ARCHITECTURE.md -- Stub
        launch/ -- Outreach templates, user tracking CSV, metrics JSON
        sales/ -- Pricing, pipeline CSV, customer offer template
        release/
            alpha_release_checklist.md -- Step-by-step PyPI publish checklist
            release_readiness.json -- Says ready=true, wheel+sdist listed
    roadmaps/
        README.md -- Overview of all 6 paths
        01_solo_dev_tool.md -- CHOSEN: bootstrap to PyPI
        02_dev_tool_saas.md
        03_enterprise_platform.md
        04_open_source_support.md
        05_grammarly_for_code.md
        06_ai_transformation_engine.md
    scripts/
        launch_metrics.py
        feedback_prioritizer.py
        sales_pipeline_metrics.py
        release_prep.py
    tests/
        test_code_normalize_pro.py
        test_feedback_prioritizer.py
        test_launch_metrics.py
        test_release_prep.py
        test_sales_pipeline_metrics.py
    site/
        index.html -- Static landing page
        styles.css
    .github/
        workflows/ci.yml -- CI: install, smoke check, pytest, build
        ISSUE_TEMPLATE/
        pull_request_template.md
    files/
        cache_sandbox/ -- Test fixtures (a.py, b.py)
        smoke_case.py
    EXECUTION_PLAN.md -- 7-day launch plan (all tasks pending)
    VERIFY.md -- Verification runbook
    MISSINGMORE.txt -- Gap tracking
    QUICK_REFERENCE.md -- Command cheat sheet
    CHANGELOG.md -- Stub (unreleased only)
Dependencies
Core: zero. Python 3.10+ only.
Optional:
- tqdm -- progress bars
- Syntax checkers (only needed with --check): Python: built-in (py_compile) | JS: node | TS: tsc | Go: gofmt | Rust: rustc | C: gcc | C++: g++ | Java: javac
Dev/test: pytest (see requirements.txt)
Known Issues (fix before PyPI launch)
Critical -- blocks shipping:
- No pyproject.toml -- CI runs python -m build which will fail without it. The code_normalizer_pro.egg-info/ dir shows packaging was attempted but no config file exists in the tree. Create pyproject.toml with src layout and console_scripts entry point before running Day 1 tasks.
- code_normalizer_pro/cli.py has a broken import: from code_normalize_pro import main. After pip install, Python looks for a module named code_normalize_pro in site-packages, not in src/. Without a proper src layout in pyproject.toml, the installed CLI command will fail on launch.
Code bugs worth fixing:
- Cache default is on in __init__ but --cache flag implies opt-in and --no-cache implies opt-out. The flags and the default contradict each other. Pick one direction and make the help text match.
- --parallel --in-place silently disables backups. process_file_worker passes create_backup=False but backup logic only lives inside process_file. Users running parallel mode have no backups. Either warn loudly or fix it.
- walk_and_process and process_file both increment total_files for the same files. Summary stats will show inflated counts.
- .normalize-cache.json lands in CWD, not the target directory. Running the tool against three different projects from the same shell session corrupts the cache. Pass root / CACHE_FILE to CacheManager in walk_and_process.
- --dry-run always exits 0 even when it finds files needing normalization. CI pipelines need a non-zero exit to catch violations. Add --fail-on-changes or make dry-run exit 1 when changes are detected.
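The cache-flag, cache-location, and dry-run exit-code items above could be addressed along the lines of the sketch below. It is not the tool's real argument parser: build_parser, cache_path_for, and exit_code are hypothetical helpers, and --fail-on-changes is the flag proposed in the list, not one that exists yet.

```python
import argparse
from pathlib import Path

CACHE_FILE = ".normalize-cache.json"  # file name taken from the issue list


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="code-normalizer-pro")
    parser.add_argument("root", type=Path, help="directory to normalize")
    # BooleanOptionalAction generates both --cache and --no-cache from one definition,
    # so the default and the two flags cannot contradict each other.
    parser.add_argument(
        "--cache",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="use the SHA256 cache; disable with --no-cache",
    )
    parser.add_argument("--dry-run", action="store_true", help="report changes without writing")
    parser.add_argument(
        "--fail-on-changes",
        action="store_true",
        help="with --dry-run, exit 1 if any file needs normalization",
    )
    return parser


def cache_path_for(root: Path) -> Path:
    """Keep the cache inside the target directory instead of the current working directory."""
    return root / CACHE_FILE


def exit_code(args: argparse.Namespace, files_needing_changes: int) -> int:
    """Non-zero only when a dry run was asked to fail on pending changes and found some."""
    if args.dry_run and args.fail_on_changes and files_needing_changes > 0:
        return 1
    return 0
```

Keeping the exit-code rule in one small function makes it easy to unit-test and keeps existing dry-run behavior unchanged unless the new flag is passed.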
Cleanup:
- code_normalize_pro.py at root -- stale copy. Real file is src/. Delete it.
- roadmaps/New Text Document.txt -- empty temp file. Delete it.
- roadmaps/talking about code.txt -- saved AI chat session. Delete or move to docs/.
- All README_20260220_*.md.bak files throughout the tree -- ReadmeForge backups.
- config/settings.py is a clean env-var loader but nothing imports it. Either wire it into code_normalize_pro.py or remove it.
- README_PRO.md at root duplicates docs/README.md. Consolidate.
- restore_report.json and smoke_report.json at root -- generated artifacts, add to .gitignore.
- PROJECT_STATUS.md says roadmap docs are "coming soon" -- all 6 exist. Stale.
Launch Status (Path 1 - Solo Dev Tool)
EXECUTION_PLAN.md has a 7-day checklist. As of 2026-03-10, nothing started.
Before Day 1 tasks will work, pyproject.toml needs to exist (see the first critical issue under Known Issues above).
Day 1 after pyproject.toml is in place:
Set-Location C:\Dev\PROJECTS\CODE
.\.venv\Scripts\Activate.ps1
python -m pytest -q
pip install -e .
code-normalizer-pro --help
Full release steps: see docs/release/alpha_release_checklist.md
CI
.github/workflows/ci.yml runs on push to main/master and on PRs:
- Python 3.11
- pip install from requirements.txt
- CLI smoke check (main.py and src/code_normalize_pro.py --help)
- pytest -q
- python -m build (sdist + wheel)
Note: python -m build requires pyproject.toml. CI will fail until that exists.
Version History
| Version | Date | Changes |
|---|---|---|
| v3.0 | 2026-02-09 | Parallel processing, SHA256 caching, pre-commit hooks, multi-language syntax, interactive mode |
| v2.0 | 2026-02-09 | Dry-run, in-place editing, backups, tqdm, detailed stats |
| v1.0 | -- | Basic encoding fix, CRLF, whitespace |
Package version: 3.0.1 (set in code_normalizer_pro/__init__.py)
Developer: MR (Michael Rawls Jr.) -- Houston, TX -- GitHub: MRJR0101
File details
Details for the file code_normalizer_pro-3.0.1.tar.gz.
File metadata
- Download URL: code_normalizer_pro-3.0.1.tar.gz
- Upload date:
- Size: 22.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bd69ffff4c3b9792ca50bf57c375646a45473e90fa335e738e68546e60843feb |
| MD5 | 1be1a473c22c5ac7786a88069858ca9d |
| BLAKE2b-256 | a10903d6f423c357825de3de3093fc463118955df62968d6ad3ccb7610f526db |
File details
Details for the file code_normalizer_pro-3.0.1-py3-none-any.whl.
File metadata
- Download URL: code_normalizer_pro-3.0.1-py3-none-any.whl
- Upload date:
- Size: 16.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b5d6a67c804c395c8ee6283d12c441521cd0609c4d927d3279fb705c7cea873a |
| MD5 | 54718904a6854d3a6469e00d8791aeae |
| BLAKE2b-256 | 693026f9f2e5aff58dfa89a13e51d4e6d770239f58aa4dcf027511dee67241d9 |