A tool for analyzing Python code
biston
A structural clone detector for Python code. Written in Rust.
It parses Python files with tree-sitter, normalizes the AST, and finds functions that are structurally similar to each other.
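To make "structurally similar" concrete, here is a minimal sketch of the idea. It uses Python's standard `ast` module instead of tree-sitter and is not biston's implementation; the names `_AnonymizeNames` and `normalized_dump` are hypothetical. For brevity it renames every identifier (including globals and builtins) to a positional placeholder, so two functions that differ only in naming produce the same fingerprint.

```python
import ast


class _AnonymizeNames(ast.NodeTransformer):
    """Rename each distinct identifier to a positional placeholder (v0, v1, ...)."""

    def __init__(self):
        self.mapping = {}

    def _placeholder(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_Name(self, node):
        node.id = self._placeholder(node.id)
        return node

    def visit_arg(self, node):
        node.arg = self._placeholder(node.arg)
        return node


def normalized_dump(source):
    """Return a name-independent fingerprint of the first function in `source`."""
    func = ast.parse(source).body[0]
    func.name = "_"  # the function's own name is not part of its structure
    _AnonymizeNames().visit(func)
    return ast.dump(func)


TOTAL = """
def total(items):
    acc = 0
    for item in items:
        acc += item * 2
    return acc
"""

SUM_DOUBLED = """
def sum_doubled(values):
    result = 0
    for v in values:
        result += v * 2
    return result
"""

print(normalized_dump(TOTAL) == normalized_dump(SUM_DOUBLED))  # True: only the names differ
```

This sketch only detects exact structural matches. A real detector also needs a similarity score so that near-matches above a threshold are reported.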
Install
uv add biston
Or build from source:
cargo build --release
Usage
biston <COMMAND>
Commands
biston scan
Scan a directory for code clones.
Usage: biston scan [OPTIONS] [PATH]
Arguments:
[PATH] Directory to scan [default: .]
Options:
--format <FORMAT> Output format [possible values: text, json, sarif]
--min-lines <MIN_LINES> Minimum function length in lines
--threshold <THRESHOLD> Similarity threshold (0.0 - 1.0)
--config <CONFIG> Config file directory (looks for biston.toml or pyproject.toml)
--tests-only Restrict the scan to Python test files (overrides include/exclude)
--suggest Generate abstraction suggestions for similar pairs
--files <FILE> Only emit pairs involving this file (repeat for multiple)
--files-from <PATH> Read focus file list from PATH, or `-` for stdin
-h, --help Print help
biston stats
Show statistics about scan findings.
Usage: biston stats [OPTIONS] [PATH]
Arguments:
[PATH] Directory to scan [default: .]
Options:
--format <FORMAT> Output format [possible values: text, json, sarif]
--min-lines <MIN_LINES> Minimum function length in lines
--threshold <THRESHOLD> Similarity threshold (0.0 - 1.0)
--config <CONFIG> Config file directory (looks for biston.toml or pyproject.toml)
--tests-only Restrict the scan to Python test files (overrides include/exclude)
--files <FILE> Only emit pairs involving this file (repeat for multiple)
--files-from <PATH> Read focus file list from PATH, or `-` for stdin
-h, --help Print help
Scanning tests only
Test suites often accumulate duplication (near-identical cases that could be @pytest.mark.parametrize, copy-pasted arrange/act/assert blocks). By default biston excludes test files so production-code findings stay focused. Pass --tests-only to flip the scope and scan only test files:
biston scan --tests-only
biston stats --tests-only
The flag replaces include with common Python test patterns (**/test_*.py, **/*_test.py, **/conftest.py, tests/**/*.py) and clears exclude. Other knobs (min_lines, threshold, normalization) are left untouched — tune them separately in biston.toml if you want different defaults for a test run.
Commit-hook use (focus files)
--files / --files-from let you restrict reporting to pairs involving a
specific set of files, while still scanning the whole repo so cross-file
clones between those files and the rest of the tree are detected.
For a pre-commit hook, pipe git diff --name-only through --files-from -:
git diff --name-only --diff-filter=ACM -- '*.py' \
| biston scan --files-from - .
An empty list (no Python files changed) correctly emits no pairs. Prefer
--files-from over --files $(git diff --name-only): the latter expands to
an empty flag when nothing changed, and biston then reverts to a full-repo scan.
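The same pipeline can be driven from a Python hook script. The following is a sketch under assumptions: the helper names `focus_list` and `run_hook` are hypothetical, the list is sent one path per line (the shape `git diff --name-only` produces), and biston's exit code is forwarded unchanged because its meaning is not described here.

```python
import subprocess


def focus_list(changed_paths):
    """Keep only Python files, formatted one per line for `--files-from -`."""
    py_files = [p for p in changed_paths if p.endswith(".py")]
    return "".join(f"{p}\n" for p in py_files)


def run_hook():
    """Scan the whole repo but report only pairs that touch changed Python files."""
    diff = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    payload = focus_list(diff.stdout.splitlines())
    # Always pass the list on stdin, even when it is empty, so an empty
    # change set yields no pairs instead of a full-repo report.
    result = subprocess.run(
        ["biston", "scan", "--files-from", "-", "."],
        input=payload,
        text=True,
    )
    return result.returncode
```

A hook entry point would call `sys.exit(run_hook())`. Whether a non-zero exit code signals findings is an assumption to verify against the full docs.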
Configuration
Settings can go in biston.toml or under [tool.biston] in pyproject.toml. If both files exist, biston.toml takes priority. CLI flags override config file settings.
[scan]
| Setting | Default | Description |
|---|---|---|
| `min_lines` | `10` | Minimum function length in lines |
| `threshold` | `0.7` | Similarity threshold (0.0–1.0) |
| `exclude` | `["tests/**", "**/conftest.py", "migrations/**"]` | File patterns to exclude |
| `include` | `["**/*.py"]` | File patterns to include |
[normalization]
| Setting | Default | Description |
|---|---|---|
| `anonymize_locals` | `true` | Replace local variable names |
| `anonymize_literals` | `false` | Replace literal values |
| `strip_decorators` | `true` | Remove decorators from AST |
| `strip_type_annotations` | `true` | Remove type hints |
| `sort_commutative` | `false` | Sort commutative operations |
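As a rough illustration of what `strip_decorators` and `strip_type_annotations` mean for matching, the sketch below removes decorators and parameter/return annotations with Python's standard `ast` module. It is not biston's code (biston works on a tree-sitter AST), the names `_StripNoise` and `stripped` are hypothetical, and variable annotations inside function bodies are not handled.

```python
import ast


class _StripNoise(ast.NodeTransformer):
    """Drop decorators plus parameter and return annotations."""

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # also reaches the function's arguments
        node.decorator_list = []
        node.returns = None
        return node

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_arg(self, node):
        node.annotation = None
        return node


def stripped(source):
    """Return the source with decorators and signature annotations removed."""
    tree = _StripNoise().visit(ast.parse(source))
    return ast.unparse(tree)


DECORATED = """
@lru_cache(maxsize=None)
def area(width: float, height: float) -> float:
    return width * height
"""

PLAIN = """
def area(width, height):
    return width * height
"""

print(stripped(DECORATED) == stripped(PLAIN))  # True: the function bodies are identical
```

With both settings enabled, functions like these two compare as the same structure even though one carries a decorator and type hints.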
[output]
| Setting | Default | Description |
|---|---|---|
| `format` | `"text"` | Output format (text, json, or sarif) |
| `group_overlapping` | `true` | Group overlapping clones |
| `max_results` | `50` | Maximum number of results |
| `show_source` | `true` | Display source code in output |
| `context_lines` | `3` | Number of context lines around clones |
[suggest]
| Setting | Default | Description |
|---|---|---|
| `enabled` | `false` | Enable suggestion generation |
| `min_quality` | `0.6` | Minimum template coverage score (0.0–1.0) |
| `max_holes` | `5` | Maximum holes before suppressing |
| `render_python` | `true` | Render templates as Python source |
[suppress]
| Setting | Default | Description |
|---|---|---|
| `files` | `[]` | File glob patterns to suppress entirely |
Example biston.toml
[scan]
min_lines = 15
threshold = 0.8
exclude = ["vendor/"]
include = ["src/**/*.py"]
[normalization]
anonymize_locals = false
anonymize_literals = true
[output]
format = "json"
max_results = 100
[suggest]
enabled = true
min_quality = 0.8
Inline suppression
You can also suppress findings with Python comments:
- `# biston: ignore-file`: suppress the entire file (must appear in the first 5 lines)
- `# biston: ignore`: suppress a single function (place in the function body or on the preceding line)
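The two function-level placements look like this in practice. The function names are made up for illustration; only the comment text and its placement come from the rules above. A `# biston: ignore-file` comment is deliberately not shown, since it would go within the first 5 lines of a file and suppress everything in it.

```python
# biston: ignore
def export_rows(rows):
    """Suppressed via a comment on the line preceding the function."""
    return [str(row) for row in rows]


def import_rows(lines):
    # biston: ignore
    # Suppressed via a comment inside the function body.
    return [line.strip() for line in lines]
```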
Documentation
Full docs at https://mojzis.github.io/biston/.
License
MIT