
Scan, fix, and verify file encoding issues. Mojibake, BOM, CRLF, null bytes — fixed in one command.


encoding-doctor

Scan, fix, and verify file encoding issues across your project in one command.

Fixes mojibake, BOMs, CRLF line endings, and null bytes automatically, and flags non-UTF-8 files for manual conversion. A backup is created before every change.

Built from real encoding bugs found in production Python projects on Windows.


Install

pip install encoding-doctor

Usage

# Step 1 — scan first, always
enc-doctor scan ./my_project

# Step 2 — preview changes without writing
enc-doctor fix ./my_project --dry-run

# Step 3 — fix (backups created automatically as .bak)
enc-doctor fix ./my_project

# Step 4 — verify everything is clean
enc-doctor verify ./my_project
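
The four steps above can also be driven from a script. The following is a minimal sketch, not part of encoding-doctor: run_workflow, confirm, and runner are hypothetical names, and only the subcommands and the --dry-run flag shown above are used. It records each command's exit code without interpreting it, since this README does not document exit-code meanings, and it pauses for confirmation before the real fix so the scan report and dry-run output can be reviewed first.

```python
import subprocess
from typing import Callable, List, Tuple


def run_workflow(
    path: str,
    confirm: Callable[[], bool],
    runner: Callable = subprocess.run,
) -> List[Tuple[List[str], int]]:
    """Run scan and a dry-run fix, ask for confirmation, then fix and verify.

    Returns (command, exit code) pairs for the steps that actually ran.
    Exit codes are only recorded; their meaning is not assumed here.
    """
    results: List[Tuple[List[str], int]] = []

    def step(*args: str) -> None:
        cmd = ["enc-doctor", *args]
        results.append((cmd, runner(cmd).returncode))

    step("scan", path)              # Step 1: report only
    step("fix", path, "--dry-run")  # Step 2: preview, nothing written
    if confirm():                   # a human reviews the output first
        step("fix", path)           # Step 3: writes files
        step("verify", path)        # Step 4: confirm the result
    return results


# Example call (requires enc-doctor on PATH):
# run_workflow("./my_project",
#              confirm=lambda: input("Apply fixes? [y/N] ").lower() == "y")
```

The runner parameter exists only so the sequence can be exercised without the tool installed; in normal use the default subprocess.run is enough.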

What it fixes

Problem      Description
Mojibake     UTF-8 bytes misread as cp1252 and saved as garbage
BOM          \xef\xbb\xbf prefix added by Notepad/Excel that breaks parsers
CRLF         Windows \r\n mixed with Unix \n, which causes Git diff noise
Null bytes   Binary corruption from FTP or terminal copy-paste
Non-UTF-8    Detected and flagged for manual conversion
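
To make the table concrete, here is a rough sketch of what each repair amounts to at the byte level. It illustrates the problems themselves and is not encoding-doctor's implementation; every function name below is hypothetical.

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def normalize_newlines(data: bytes) -> bytes:
    """Convert Windows CRLF line endings to Unix LF."""
    return data.replace(b"\r\n", b"\n")


def strip_nulls(data: bytes) -> bytes:
    """Remove stray NUL bytes."""
    return data.replace(b"\x00", b"")


def fix_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes decoded as cp1252'.

    If the text does not round-trip cleanly it is returned unchanged,
    so text that was already correct is left alone. Python's cp1252
    codec leaves a few byte values undefined, so some damaged text
    cannot be recovered this way.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 (detection only)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# "é" is C3 A9 in UTF-8; read as cp1252 those two bytes become "Ã©".
print(fix_mojibake("caf\u00c3\u00a9"))
```

Only the first four functions change data. is_utf8 merely detects, matching the table's last row, where non-UTF-8 files are flagged rather than converted.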

Warning

encoding-doctor modifies files in place.

  • Always run scan first and review the report before running fix.
  • Backups are created automatically as .bak files.
  • Run on a Git-tracked project so you can always revert with git checkout .
  • Do not run fix on production files without testing first.
  • Run verify after every fix, before committing.
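
The backup-then-modify pattern behind these precautions can be sketched as follows. This is an illustration, not encoding-doctor's code: the helper names are hypothetical, and the naming scheme (appending .bak to the full file name, so notes.txt becomes notes.txt.bak) is an assumption, since this README only says that backups are .bak files.

```python
import shutil
from pathlib import Path


def backup_path(path: Path) -> Path:
    """Assumed naming scheme: notes.txt -> notes.txt.bak."""
    return path.with_name(path.name + ".bak")


def write_with_backup(path: Path, new_bytes: bytes) -> Path:
    """Copy the original to its .bak sibling, then overwrite in place."""
    bak = backup_path(path)
    shutil.copy2(path, bak)  # copy2 also preserves timestamps where possible
    path.write_bytes(new_bytes)
    return bak


def restore_from_backup(path: Path) -> None:
    """Put the .bak copy back over the modified file."""
    bak = backup_path(path)
    if not bak.exists():
        raise FileNotFoundError(f"no backup found for {path}")
    shutil.copy2(bak, path)
```

A second write to the same file would overwrite the first backup in this sketch, which is one more reason to work inside a Git-tracked project.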

Options

enc-doctor scan   <path> [--all]       # --all shows clean files too
enc-doctor fix    <path> [--dry-run]   # --dry-run previews without writing
enc-doctor verify <path>
enc-doctor restore <file>              # restore single file from .bak

Run tests

pip install pytest
pytest tests/ -v

License

MIT © Stateflow Labs
