
A Python package for detecting hidden Unicode and ASCII characters.



🕵️‍♂️ ByteSleuth — The Ghost Hunter for Hidden Characters

"Elementary, my dear dev. The ghosts of hidden characters won't escape this audit!"
CharlockHolmes, the detective inside ByteSleuth

ByteSleuth is a Unicode & ASCII character scanner designed to detect obfuscation, invisible characters, and suspicious bytes lurking in text or code. Whether you're hunting down ghost characters or debugging unexpected encoding issues, ByteSleuth reports exactly what it finds and can optionally strip it out.


🚀 Key Features

✅ Detects ASCII control characters (e.g., NUL, BEL, ESC)
✅ Flags Unicode invisibles and directional controls (e.g., U+200B, U+202E)
✅ Optionally sanitizes input by removing hidden/malicious characters
✅ Works seamlessly with files and directories
✅ Supports logging for audit trails
✅ Can be embedded in existing workflows
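
To make the detection features above concrete, here is a minimal standalone sketch built only on the standard library. It is not ByteSleuth's actual implementation: the function name find_hidden, the choice of Unicode categories, and the label format for ASCII controls are all assumptions made for illustration.

```python
import unicodedata

# Assumption: "hidden" Unicode means format characters (category Cf, which
# covers U+200B and U+202E) plus line/paragraph separators (Zl, Zp).
HIDDEN_CATEGORIES = {"Cf", "Zl", "Zp"}
# Whitespace controls that ordinary text files legitimately contain.
ALLOWED_CONTROLS = {"\t", "\n", "\r"}


def find_hidden(text):
    """Return (code point, name, character) for each suspicious character."""
    findings = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x20 or cp == 0x7F:
            # ASCII control range: C0 controls and DEL.
            if ch not in ALLOWED_CONTROLS:
                findings.append((cp, f"ASCII control 0x{cp:02X}", ch))
        elif unicodedata.category(ch) in HIDDEN_CATEGORIES:
            findings.append((cp, unicodedata.name(ch, "UNNAMED"), ch))
    return findings
```

Calling find_hidden("a\x00b\u200bc") would report NUL and the zero-width space while leaving the visible letters alone.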


🔧 CLI Usage

python byte_sleuth.py <target> [-m MODE] [-s] [-l LOG_FILE]

CLI Arguments

Argument        Description
target          File or directory to scan
-m, --mode      Scan only ASCII, only Unicode, or both (all)
-s, --sanitize  Automatically remove suspicious characters
-l, --log       Log file to write results (default: scanner.log)

CLI Example

python byte_sleuth.py suspicious.txt -m all -s

Scans suspicious.txt for both ASCII & Unicode anomalies, removes them, and logs the results (to scanner.log, since no -l option is given).
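
For readers who want to see how an interface like this can be declared, the following is a hypothetical argparse sketch that mirrors the argument table. It is not taken from ByteSleuth's source: the mode names "ascii" and "unicode" and the default mode are assumptions, since only "all" appears in the documented example.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented CLI arguments."""
    parser = argparse.ArgumentParser(prog="byte_sleuth.py")
    parser.add_argument("target", help="File or directory to scan")
    # Assumption: mode names and the default; only "all" is documented.
    parser.add_argument(
        "-m", "--mode",
        choices=["ascii", "unicode", "all"],
        default="all",
        help="Scan only ASCII, only Unicode, or both (all)",
    )
    parser.add_argument(
        "-s", "--sanitize",
        action="store_true",
        help="Automatically remove suspicious characters",
    )
    parser.add_argument(
        "-l", "--log",
        default="scanner.log",
        help="Log file to write results",
    )
    return parser
```

Parsing ["suspicious.txt", "-m", "all", "-s"] with this parser yields sanitize=True and the default log file name.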


📦 Using ByteSleuth in Your Python Projects

Since ByteSleuth is modular, you can easily integrate it into any existing application.

Installing ByteSleuth

ByteSleuth is published on PyPI. The distribution name is bytesleuth (no hyphen), so install it via:

pip install bytesleuth

Basic Usage in Python

from byte_sleuth import CharacterScanner

scanner = CharacterScanner(sanitize=True)
findings = scanner.scan_file("example.txt", mode="all")

for cp, name, char in findings:
    print(f"⚠️ Suspicious Character: {name} (U+{cp:04X}) → {repr(char)}")

This scans "example.txt" for hidden characters and, because sanitize=True, removes any that it finds. Each finding is a (code point, name, character) tuple.
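
If you are curious what sanitization can look like under the hood, here is a minimal sketch using only the standard library. It is an illustration, not the package's own method: the function name sanitize and the rule "drop categories Cc and Cf, keep tab, newline, and carriage return" are assumptions.

```python
import unicodedata


def sanitize(text):
    """Drop control (Cc) and format (Cf) characters, keeping tab, LF, and CR."""
    kept = []
    for ch in text:
        if ch in "\t\n\r":
            kept.append(ch)  # ordinary whitespace controls stay
        elif unicodedata.category(ch) in ("Cc", "Cf"):
            continue  # hidden character: skip it
        else:
            kept.append(ch)
    return "".join(kept)
```

One design caveat: category Cf also contains U+200D (zero-width joiner), which legitimate emoji sequences rely on, so a blanket removal like this can change how some visible text renders.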


🔁 Embedding ByteSleuth in Workflows

ByteSleuth can be used beyond basic scans, making it a perfect fit for automation and security audits:

  • 🛠️ Pre-commit hook — Block commits containing obfuscated characters.
  • 🔍 CI/CD pipelines — Ensure clean and readable source code before deployment.
  • 📜 Log analysis — Detect and clean malformed logs with invisible characters.

Example: Pre-commit Hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: byte-sleuth-scan
        name: ByteSleuth Unicode & ASCII Scanner
        entry: python byte_sleuth.py src/ -m all -s
        language: system
        pass_filenames: false
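
A hook or CI job only blocks a change when the command it runs exits with a non-zero status. The sketch below shows one way to build such a gate with the standard library alone. It is a hypothetical helper, not part of ByteSleuth: the names has_hidden, check_tree, and main, and the default directory "src", are assumptions.

```python
import unicodedata
from pathlib import Path


def has_hidden(text):
    """True if text has a control or format character other than tab/LF/CR."""
    return any(
        ch not in "\t\n\r" and unicodedata.category(ch) in ("Cc", "Cf")
        for ch in text
    )


def check_tree(root):
    """Return the files under root that contain hidden characters."""
    flagged = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # errors="replace" keeps the walk going on non-UTF-8 files.
            text = path.read_text(encoding="utf-8", errors="replace")
            if has_hidden(text):
                flagged.append(path)
    return flagged


def main(argv):
    """Print flagged files and return an exit status (1 if any were found)."""
    root = argv[0] if argv else "src"
    flagged = check_tree(root)
    for path in flagged:
        print(f"hidden characters found: {path}")
    return 1 if flagged else 0
```

Wiring it up as a script would mean ending the file with sys.exit(main(sys.argv[1:])). Note that binary files will usually be flagged by this check, so a real pipeline would likely restrict the walk to source file extensions.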

🧠 Why Use ByteSleuth?

Some characters are invisible but dangerous—causing confusion in source code, configs, or documents.
Common attack vectors include:

🔹 Zero-width spaces used for code obfuscation
🔹 Bidirectional override characters that reorder how text is displayed, so what you read is not what runs
🔹 Hidden ASCII control codes that alter behavior unexpectedly
🔹 Formatting trickery affecting debugging & diffs
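
A short demonstration of the first two vectors, using only the standard library. The identifier is_admin is a made-up example; the point is that two strings can look the same on screen while comparing as different values.

```python
import unicodedata

plain = "is_admin"
spoofed = "is_\u200badmin"  # zero-width space hidden after the underscore

# The two strings typically render the same but are different values.
print(plain == spoofed)          # False
print(len(plain), len(spoofed))  # 8 9

# Escaping non-ASCII characters makes the hidden code point visible.
print(spoofed.encode("unicode_escape").decode("ascii"))  # is_\u200badmin

# The bidirectional override from the feature list has an official name too.
print(unicodedata.name("\u202e"))  # RIGHT-TO-LEFT OVERRIDE
```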

ByteSleuth gives you a detective's magnifying glass to expose them all. 🔍


🚀 Roadmap

✔️ Expand sanitization methods
✔️ Improve CLI interactivity
✔️ Output JSON reports
🟡 VSCode Extension (planned)
🟡 Interactive CLI with rich or curses UI (planned)


🕵️‍♂️ Honorary Agent: CharlockHolmes

When Unicode hides... he seeks.
When ASCII misbehaves... he strikes.
Because no character escapes... the ByteSleuth.


📄 License

MIT — Feel free to sleuth away!


