Ultra-lightweight pure Python package to check if a file is binary or text.
BinaryOrNot
Python library and CLI tool to check if a file is binary or text. Zero dependencies.
from binaryornot.check import is_binary
is_binary("image.png") # True
is_binary("README.md") # False
is_binary("data.sqlite") # True
is_binary("report.csv") # False
$ binaryornot image.png
True
Install
pip install binaryornot
Why not just check for null bytes?
That's the first thing everyone tries. It works until it doesn't:
- A UTF-16 text file is full of null bytes. Your tool thinks it's binary and corrupts it.
- A Big5 or GB2312 text file is full of bytes above 0x7F, so it looks binary by byte ratios alone.
- A font file (.woff, .eot) is clearly binary but might not have null bytes in the first chunk.
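The first failure mode is easy to reproduce with the standard library alone. The sketch below uses a hypothetical helper, `naive_is_binary`, to stand in for the null-byte heuristic; it is an illustration only and not part of BinaryOrNot.

```python
def naive_is_binary(chunk: bytes) -> bool:
    """Hypothetical null-byte heuristic: call it binary if any NUL byte appears."""
    return b"\x00" in chunk


# The same text in two encodings. UTF-16 stores each ASCII character as two
# bytes, one of which is 0x00, so the encoded text is full of NUL bytes.
utf8_text = "Hello, world!\n".encode("utf-8")
utf16_text = "Hello, world!\n".encode("utf-16")

print(naive_is_binary(utf8_text))   # False
print(naive_is_binary(utf16_text))  # True, although this is plain text
```

Both byte strings decode to the same text, yet the heuristic gives different answers for them.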
BinaryOrNot reads the first 128 bytes and runs them through a trained decision tree that considers byte ratios, Shannon entropy, encoding validity, BOM detection, and more. It handles all the edge cases above correctly, with zero dependencies.
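To make the listed signals concrete, here is a minimal sketch of how a few of them could be computed from a 128-byte chunk using only the standard library. This is not BinaryOrNot's implementation: the function names, the feature names, and the choice of UTF-8 as the encoding to validate are all assumptions made for illustration, and the trained decision tree that consumes such features is not reproduced here.

```python
import codecs
import math
from collections import Counter

# Byte-order marks for common Unicode encodings (assumed set, for illustration).
BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF8,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def shannon_entropy(chunk: bytes) -> float:
    """Entropy in bits per byte: 0.0 for one repeated byte, at most 8.0."""
    total = len(chunk)
    if total == 0:
        return 0.0
    return sum((n / total) * math.log2(total / n) for n in Counter(chunk).values())


def extract_features(data: bytes) -> dict:
    """Hypothetical feature extractor over the first 128 bytes of `data`."""
    chunk = data[:128]
    total = len(chunk) or 1  # avoid dividing by zero on empty input
    try:
        # final=False tolerates a multi-byte character cut off by the 128-byte limit.
        codecs.getincrementaldecoder("utf-8")().decode(chunk, final=False)
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "null_ratio": chunk.count(0) / total,
        "high_ratio": sum(b >= 0x80 for b in chunk) / total,
        "entropy": shannon_entropy(chunk),
        "has_bom": chunk.startswith(BOMS),
        "valid_utf8": valid_utf8,
    }
```

Under these assumptions, UTF-16 text shows a high `null_ratio` together with `has_bom` set, which is the kind of combination a classifier can use to avoid the null-byte trap.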
Tested against 37 text encodings and 49 binary formats, verified by parametrized tests driven from coverage CSVs.
API
One function:
from binaryornot.check import is_binary
is_binary(filename) # returns True or False
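A common use is filtering a directory tree down to its text files. The sketch below is one way to do that; `iter_text_files` and its `check` parameter are hypothetical names introduced here, and only `binaryornot.check.is_binary` comes from the library.

```python
from pathlib import Path
from typing import Callable, Iterator, Optional


def iter_text_files(
    root: str, check: Optional[Callable[[str], bool]] = None
) -> Iterator[Path]:
    """Yield files under `root` that `check` does not classify as binary."""
    if check is None:
        # Default to the library function; imported lazily so that a custom
        # predicate can be supplied without BinaryOrNot installed.
        from binaryornot.check import is_binary as check
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and not check(str(path)):
            yield path
```

With BinaryOrNot installed, iterating over `iter_text_files(".")` yields the text files below the current directory in sorted order. Passing a custom `check` is mainly useful in tests.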
There's also is_binary_string() if you already have bytes:
from binaryornot.helpers import is_binary_string
is_binary_string(b"\x00\x01\x02") # True
is_binary_string(b"hello world") # False
Full documentation covers the detection algorithm in detail.
Credits
Created by Audrey Roy Greenfeld.
Project details
Download files
File details
Details for the file binaryornot-0.5.0.tar.gz.
File metadata
- Download URL: binaryornot-0.5.0.tar.gz
- Upload date:
- Size: 428.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dcacb1343219da5fbbb7828a46b946768c6df07b65453195954cbbecf14c1c83 |
| MD5 | 2a76bf08a1af1d482a69ff1bd55298df |
| BLAKE2b-256 | 617e8a41b27448bfcc8138f8aec3ac2a467edf22a1d85bdbbd7fd0a130372fc4 |
Provenance
The following attestation bundles were made for binaryornot-0.5.0.tar.gz:
Publisher: publish.yml on binaryornot/binaryornot
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: binaryornot-0.5.0.tar.gz
- Subject digest: dcacb1343219da5fbbb7828a46b946768c6df07b65453195954cbbecf14c1c83
- Sigstore transparency entry: 1056836159
- Sigstore integration time:
- Permalink: binaryornot/binaryornot@04b33b3c3c3b9fb2524f60ded37f2cf5050c15dc
- Branch / Tag: refs/tags/v0.5.0
- Owner: https://github.com/binaryornot
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@04b33b3c3c3b9fb2524f60ded37f2cf5050c15dc
- Trigger Event: push
File details
Details for the file binaryornot-0.5.0-py3-none-any.whl.
File metadata
- Download URL: binaryornot-0.5.0-py3-none-any.whl
- Upload date:
- Size: 12.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2a969a893feb93508c14ea64a80354ba4a3164ccd2d3b5122cd438fab6965134 |
| MD5 | d7e9047c2727e4b1578ab024714bb52d |
| BLAKE2b-256 | 771392fbe1fce5eecc0c926ec94a9a8904e9f9a74c286887205fbc071f4ea349 |
Provenance
The following attestation bundles were made for binaryornot-0.5.0-py3-none-any.whl:
Publisher: publish.yml on binaryornot/binaryornot
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: binaryornot-0.5.0-py3-none-any.whl
- Subject digest: 2a969a893feb93508c14ea64a80354ba4a3164ccd2d3b5122cd438fab6965134
- Sigstore transparency entry: 1056836188
- Sigstore integration time:
- Permalink: binaryornot/binaryornot@04b33b3c3c3b9fb2524f60ded37f2cf5050c15dc
- Branch / Tag: refs/tags/v0.5.0
- Owner: https://github.com/binaryornot
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@04b33b3c3c3b9fb2524f60ded37f2cf5050c15dc
- Trigger Event: push