Scan, fix, and verify file encoding issues. Mojibake, BOM, CRLF, null bytes — fixed in one command.
encoding-doctor
Scan, fix, and verify file encoding issues across your project in one command.
Detects and repairs mojibake, BOM, CRLF line endings, and null bytes automatically, and flags non-UTF-8 files for manual conversion. A backup is created before every change.
Built from real encoding bugs found in production Python projects on Windows.
Install
pip install encoding-doctor
Usage
# Step 1 — scan first, always
enc-doctor scan ./my_project
# Step 2 — preview changes without writing
enc-doctor fix ./my_project --dry-run
# Step 3 — fix (backups created automatically as .bak)
enc-doctor fix ./my_project
# Step 4 — verify everything is clean
enc-doctor verify ./my_project
What it fixes
| Problem | Description |
|---|---|
| Mojibake | UTF-8 bytes mis-read as cp1252 and saved as garbage |
| BOM | \xef\xbb\xbf prefix added by Notepad/Excel that breaks parsers |
| CRLF | Windows \r\n mixed with Unix \n — causes Git diff noise |
| Null bytes | Binary corruption from FTP or terminal copy-paste |
| Non-UTF-8 | Detected and flagged for manual conversion |
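To make the table concrete, here is a minimal Python sketch of how each problem can be detected and repaired at the byte level. It is not taken from encoding-doctor's source; the function names (`diagnose`, `strip_bom`, `normalize_newlines`, `strip_null_bytes`, `repair_mojibake`) are hypothetical, and the mojibake repair assumes exactly one round of UTF-8 text decoded as cp1252.

```python
from __future__ import annotations

import codecs


def diagnose(data: bytes) -> list[str]:
    """Return the names of the problems found in raw file bytes."""
    problems = []
    if data.startswith(codecs.BOM_UTF8):  # b"\xef\xbb\xbf"
        problems.append("BOM")
    if b"\r\n" in data:
        problems.append("CRLF")
    if b"\x00" in data:
        problems.append("null bytes")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("non-UTF-8")
    return problems


def strip_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 byte order mark, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def normalize_newlines(data: bytes) -> bytes:
    """Convert Windows CRLF line endings to Unix LF."""
    return data.replace(b"\r\n", b"\n")


def strip_null_bytes(data: bytes) -> bytes:
    """Remove every NUL byte."""
    return data.replace(b"\x00", b"")


def repair_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes decoded as cp1252'.

    Re-encoding as cp1252 recovers the original bytes, which are then
    decoded as UTF-8. If either step fails, the text was probably not
    mojibake of this kind, so it is returned unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text
```

For example, `repair_mojibake("cafÃ©")` gives back `"café"`, while plain ASCII text and correctly encoded text pass through untouched. The round trip is a heuristic: text that legitimately contains a sequence such as `Ã©` would be altered too, which is one reason to preview changes with `--dry-run` before writing.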
Warning
encoding-doctor modifies files in-place.
- Always run `scan` first and review the report before running `fix`.
- Backups are created automatically as `.bak` files.
- Run on a Git-tracked project so you can always revert with `git checkout .`
- Do not run `fix` on production files without testing first.
- Run `verify` after every fix before committing.
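The backup-then-modify pattern behind these warnings can be sketched in a few lines of Python. This is an illustration only, not encoding-doctor's implementation: the helper names are hypothetical, and the sketch assumes the backup sits next to the original with `.bak` appended to the full file name (for example `app.py.bak`), which the README does not specify.

```python
import shutil
from pathlib import Path


def backup_path(path: Path) -> Path:
    """Assumed naming scheme: append '.bak' to the full file name."""
    return path.with_name(path.name + ".bak")


def write_with_backup(path: Path, new_data: bytes) -> Path:
    """Copy the original aside, then overwrite it in place."""
    bak = backup_path(path)
    shutil.copy2(path, bak)  # copy2 also preserves timestamps
    path.write_bytes(new_data)
    return bak


def restore(path: Path) -> None:
    """Put the backed-up bytes back over the modified file."""
    shutil.copy2(backup_path(path), path)
```

Because the backup is written before the original is touched, a failed or unwanted fix can be undone per file. Note that a second fix on the same file would overwrite the first backup in this sketch, so Git remains the more reliable safety net.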
Options
enc-doctor scan <path> [--all] # --all shows clean files too
enc-doctor fix <path> [--dry-run] # --dry-run previews without writing
enc-doctor verify <path>
enc-doctor restore <file> # restore single file from .bak
Run tests
pip install pytest
pytest tests/ -v
License
MIT © Stateflow Labs
Download files
File details
Details for the file encoding_doctor-0.2.0.tar.gz.
File metadata
- Download URL: encoding_doctor-0.2.0.tar.gz
- Upload date:
- Size: 12.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 14fd7016d0d7d0004947b50e77c8d5c4131fe9a5ca05c76b43f599a8beeac3fe |
| MD5 | 0175bff34d4a8bea3ac5a9faf77ee6e0 |
| BLAKE2b-256 | d4b68f5809f1da2cb836b4d21ef6e81c052fcaa5e9da141ef853416445b07d24 |
File details
Details for the file encoding_doctor-0.2.0-py3-none-any.whl.
File metadata
- Download URL: encoding_doctor-0.2.0-py3-none-any.whl
- Upload date:
- Size: 12.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a1d394de138318a855e22afaa46f89e695d2f2315a3af7ced68ec571647af473 |
| MD5 | 93dd4013768f7ed3c9c9c6655f9d1091 |
| BLAKE2b-256 | b2dc0e5ef46321e6d7578c13f905d095d1a16eeda727d4553dc0b73b2bd3b102 |
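If you download one of these files manually, you can check it against the published SHA256 digest before installing. The following is a small sketch using only the standard library; `sha256_of` and `matches` are hypothetical helper names, and the file path in the usage note is an assumption about where you saved the download.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Usage would look like `matches(Path("encoding_doctor-0.2.0.tar.gz"), "14fd7016...")`, passing the full SHA256 value from the matching table above.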