Find functions that work on the same resource through different code paths — the architectural-duplication detector that token-similarity tools miss.
parallax
Find code that does the same logical job through different paths.
Token-similarity tools (jscpd, PMD CPD, pylint duplicate-code) detect copy-paste. parallax detects something different: two pieces of code that touch the same set of resources, regardless of how the code is written. Different filters, different return shapes, even different languages.
Model
A unit of code (function, method, file, module, microservice, ...) touches a set of resources (database tables, HTTP endpoints, Redis keys, env vars, file paths, ...). Units sharing the same resource set are clustered as duplication candidates.
Both unit detection and resource detection are pluggable per extractor.
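The model above can be sketched in a few lines. This is a minimal illustration, not parallax's actual implementation: the function name `cluster_by_resources`, the dictionary input shape, and the sample unit names are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Iterable


def cluster_by_resources(
    units: dict[str, Iterable[str]],
    min_resources: int = 1,
    min_cluster_size: int = 2,
) -> dict[frozenset[str], list[str]]:
    """Group unit names by the exact set of resources they touch."""
    groups: dict[frozenset[str], list[str]] = defaultdict(list)
    for name, resources in units.items():
        key = frozenset(resources)  # order-independent, hashable resource set
        if len(key) >= min_resources:
            groups[key].append(name)
    # Keep only resource sets shared by enough units to count as a cluster.
    return {k: sorted(v) for k, v in groups.items() if len(v) >= min_cluster_size}


# Hypothetical units: two functions touch the same pair of models.
units = {
    "billing.charge_user": {"User", "Invoice"},
    "reports.monthly_totals": {"Invoice", "User"},
    "auth.login": {"User", "Session"},
}
clusters = cluster_by_resources(units)
for resources, names in clusters.items():
    print(sorted(resources), names)
```

Here `billing.charge_user` and `reports.monthly_totals` end up in one cluster because both touch `User` and `Invoice`, however differently their code is written; `auth.login` has a different resource set and stays out.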
Built-in extractors
| Name | Unit | Resource |
|---|---|---|
| sqlalchemy | Python function/method | SQLAlchemy ORM model classes |
| django | Python function/method | Django ORM model classes |
| http-urls | any text file | HTTP URL paths |
| env-vars | any text file | Environment variable names |
| redis-keys | any text file | Redis key namespaces |
Installation
pip install parallax-scan
Usage
parallax scan path/to/repo
parallax scan path/to/repo --extractor sqlalchemy
parallax scan path/to/repo -e sqlalchemy -e http-urls
parallax scan path/to/repo --min-resources 3 --top 20
parallax scan path/to/repo --cross-file-only
parallax scan path/to/repo --format html -o report.html
parallax scan path/to/repo --format sarif -o parallax.sarif
Configuration
Drop .parallax.toml at your repo root:
[scan]
min_resources = 3
min_cluster_size = 2
[ci]
max_cluster_size = 5
[[ignore]]
resources = ["User", "Place"]
reason = "Generic"
In CI:
parallax scan . --ci
parallax scan . --format sarif -o parallax.sarif
With --ci, the scan exits non-zero only when a cluster meets ci.max_cluster_size. Without --ci, any reported cluster produces a non-zero exit.
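The exit rule can be written out as a small function. This is a sketch under two assumptions that the text does not confirm: "meets" is read as "size greater than or equal to the threshold", and the failing exit status is 1. The name `exit_code` is hypothetical.

```python
def exit_code(cluster_sizes: list[int], ci_mode: bool, max_cluster_size: int) -> int:
    """Return the process exit status for a finished scan (illustrative only)."""
    if ci_mode:
        # Assumption: a cluster "meets" the threshold when size >= max_cluster_size.
        return 1 if any(size >= max_cluster_size for size in cluster_sizes) else 0
    # Outside CI mode, any reported cluster at all is treated as a failure.
    return 1 if cluster_sizes else 0


print(exit_code([2, 3], ci_mode=True, max_cluster_size=5))
print(exit_code([2, 6], ci_mode=True, max_cluster_size=5))
print(exit_code([2], ci_mode=False, max_cluster_size=5))
```

Under this reading, small clusters are tolerated in CI and only large ones break the build, while a plain scan fails as soon as anything is found.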
Comparison
| Tool | Catches | Doesn't catch |
|---|---|---|
| jscpd, PMD CPD, pylint duplicate-code | Token-similar copy-paste | Code with different surface shape |
| Sourcegraph | Manual code search | Automatic detection |
| pydeps, dependency-cruiser | Module-level imports | Same-resource overlap |
| semgrep | Hand-written patterns | Discovery |
| parallax | Same-resource overlap regardless of shape | Single-instance bad patterns |
Status
Alpha. API and CLI are unstable until 1.0.
Writing an extractor
from pathlib import Path
from typing import Iterable

from parallax import Unit
from parallax.extractors.base import Extractor


class TerraformAwsExtractor(Extractor):
    name = "terraform-aws"

    def extract(self, root: Path) -> Iterable[Unit]:
        for tf in root.rglob("*.tf"):
            ...
Register in parallax.extractors.BUILTIN_EXTRACTORS.
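The resource-detection half of such an extractor might look like the helper below. It is a standalone sketch that deliberately avoids the parallax API, since how a `Unit` is constructed is not shown here; the function name `aws_resource_types`, the regular expression, and the sample Terraform text are all assumptions for illustration.

```python
import re

# Matches lines such as: resource "aws_s3_bucket" "logs" {
# and captures the resource type (the first quoted string).
_AWS_RESOURCE = re.compile(
    r'^\s*resource\s+"(aws_[a-z0-9_]+)"\s+"[^"]+"',
    re.MULTILINE,
)


def aws_resource_types(tf_source: str) -> set[str]:
    """Return the set of AWS resource types declared in one .tf file's text."""
    return set(_AWS_RESOURCE.findall(tf_source))


# Hypothetical Terraform input; the non-AWS resource is skipped.
sample = '''
resource "aws_s3_bucket" "logs" {
  bucket = "example-logs"
}

resource "aws_iam_role" "writer" {
  name = "log-writer"
}

resource "google_storage_bucket" "other" {
  name = "ignored"
}
'''

print(sorted(aws_resource_types(sample)))  # ['aws_iam_role', 'aws_s3_bucket']
```

A regex keeps the sketch dependency-free; a real extractor may prefer a proper HCL parser to handle comments, heredocs, and unusual formatting.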
License
MIT