Hunt down code repetitions in Python projects
🔍 Python Repetition Hunter
Hunt down code repetitions like a pro detective
A Python tool that analyzes your codebase to find repeated patterns and duplicated logic. Based on the Clojure repetition-hunter algorithm, it helps you identify opportunities for refactoring and code deduplication.
✨ Features
- 🎯 Smart Pattern Detection - Finds structural duplication, not just verbatim copy-paste
- 🧠 AST-Based Analysis - Uses Abstract Syntax Trees for deep code understanding
- 🔧 Variable Normalization - Detects patterns even when variable names differ
- 📊 Complexity Scoring - Ranks findings by complexity × repetition count
- 🎛️ Configurable Thresholds - Tune sensitivity to your needs
- 📁 Recursive Directory Scanning - Analyze entire projects at once
📦 Installation
From PyPI (Recommended)
pip install python-repetition-hunter
From Source
git clone https://github.com/yourusername/python-repetition-hunter.git
cd python-repetition-hunter
pip install -e .
🚀 Quick Start
# After pip install, use the command directly
repetition-hunter my_code.py
# Scan entire project
repetition-hunter src/
# Find only high-complexity duplications
repetition-hunter --min-complexity 5 --min-repetition 3 src/
# Or run the module directly
python -m repetition_hunter my_code.py
📋 Usage
repetition-hunter [OPTIONS] PATHS...
Arguments:
  PATHS                           Python files or directories to analyze
Options:
  --min-complexity INT            Minimum complexity threshold (default: 3)
  --min-repetition INT            Minimum repetition count (default: 2)
  --sort [complexity|repetition]  Sort results by complexity or repetition (default: complexity)
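The options above map naturally onto a standard `argparse` definition. The following is a minimal sketch of what such a parser might look like, not the tool's actual source: `build_parser` is a hypothetical name, and only the flags, defaults, and choices are taken from the usage text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (an assumption, not the real code)."""
    parser = argparse.ArgumentParser(prog="repetition-hunter")
    parser.add_argument("paths", nargs="+", metavar="PATHS",
                        help="Python files or directories to analyze")
    parser.add_argument("--min-complexity", type=int, default=3,
                        help="Minimum complexity threshold")
    parser.add_argument("--min-repetition", type=int, default=2,
                        help="Minimum repetition count")
    parser.add_argument("--sort", choices=["complexity", "repetition"],
                        default="complexity",
                        help="Sort results by complexity or repetition")
    return parser


# Omitted options fall back to the documented defaults.
args = build_parser().parse_args(["--min-complexity", "5", "--sort", "repetition", "src/"])
print(args.min_complexity, args.min_repetition, args.sort, args.paths)
```

Because `--sort` is declared with `choices`, any value other than `complexity` or `repetition` is rejected before analysis starts.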
🎯 Example Output
3 repetitions of complexity 12
Line 15 - src/utils.py:
    if data is None:
        return None
    result = []
    for item in data:
        if item > 0:
            result.append(item * 2)
    return result
Line 28 - src/processor.py:
    if items is None:
        return None
    output = []
    for element in items:
        if element > 0:
            output.append(element * 2)
    return output
======================================================================
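To see why the two snippets in this report count as the same pattern, it helps to rename their variables to positional placeholders and compare the resulting syntax trees. The sketch below does this with the standard `ast` module. `NameNormalizer` and `normalized_dump` are illustrative names, and the placeholder scheme (`v0`, `v1`, ...) is an assumption rather than the tool's documented behavior.

```python
import ast


class NameNormalizer(ast.NodeTransformer):
    """Rename each distinct variable to a placeholder in order of first use (assumed scheme)."""

    def __init__(self):
        self.mapping = {}

    def visit_Name(self, node):
        placeholder = self.mapping.setdefault(node.id, f"v{len(self.mapping)}")
        return ast.copy_location(ast.Name(id=placeholder, ctx=node.ctx), node)


def normalized_dump(source: str) -> str:
    """Parse the source, normalize variable names, and return a comparable string."""
    tree = NameNormalizer().visit(ast.parse(source))
    return ast.dump(tree)  # ast.dump omits line numbers by default


snippet_a = """
if data is None:
    return None
result = []
for item in data:
    if item > 0:
        result.append(item * 2)
return result
"""

snippet_b = """
if items is None:
    return None
output = []
for element in items:
    if element > 0:
        output.append(element * 2)
return output
"""

# Both snippets reduce to the same normalized tree despite different names.
print(normalized_dump(snippet_a) == normalized_dump(snippet_b))
```

Attribute names such as `append` are left untouched here, so only snippets that call the same methods in the same positions compare equal.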
🧪 Test It Out
The project includes test_sample.py with intentional duplications to demonstrate the tool:
repetition-hunter test_sample.py
You'll see it catches patterns like:
- Similar data processing loops with different variable names
- Duplicate validation logic
- Repeated calculation patterns
🔧 How It Works
- Parse - Converts Python code to Abstract Syntax Trees
- Extract - Identifies all meaningful code nodes (skipping trivial ones)
- Normalize - Replaces variable names with generic placeholders
- Group - Clusters identical normalized patterns
- Score - Ranks by complexity × repetition count
- Report - Shows original code locations for each pattern
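The steps above can be sketched in a few dozen lines with the standard library. This is a simplified illustration under stated assumptions, not the tool's implementation: it considers only individual statements as candidates, measures complexity as the number of AST nodes in a statement, and replaces every name (including called functions such as `print`) with a placeholder. All function names are hypothetical.

```python
import ast
import copy
from collections import defaultdict


class _Normalizer(ast.NodeTransformer):
    """Replace variable and argument names with placeholders in order of first use."""

    def __init__(self):
        self.names = {}

    def visit_Name(self, node):
        node.id = self.names.setdefault(node.id, f"v{len(self.names)}")
        return node

    def visit_arg(self, node):
        node.arg = self.names.setdefault(node.arg, f"v{len(self.names)}")
        return self.generic_visit(node)


def fingerprint(node):
    """Normalize a copy of the subtree so the original tree stays intact."""
    return ast.dump(_Normalizer().visit(copy.deepcopy(node)))


def complexity(node):
    """Assumed metric: the number of AST nodes in the subtree."""
    return sum(1 for _ in ast.walk(node))


def find_repetitions(sources, min_complexity=3, min_repetition=2):
    groups = defaultdict(list)
    for filename, source in sources.items():
        tree = ast.parse(source, filename=filename)          # Parse
        for node in ast.walk(tree):                          # Extract
            if not isinstance(node, ast.stmt):
                continue
            score = complexity(node)
            if score >= min_complexity:                      # Skip trivial nodes
                key = fingerprint(node)                      # Normalize
                groups[key].append((filename, node.lineno, score))  # Group
    findings = [
        (locs[0][2], len(locs), [(name, line) for name, line, _ in locs])
        for locs in groups.values()
        if len(locs) >= min_repetition
    ]
    # Score: rank by complexity multiplied by repetition count.
    findings.sort(key=lambda f: f[0] * f[1], reverse=True)
    return findings


sources = {
    "a.py": "def f(data):\n    for item in data:\n        print(item * 2)\n",
    "b.py": "def g(items):\n    for element in items:\n        print(element * 2)\n",
}
for comp, count, locations in find_repetitions(sources, min_complexity=5):  # Report
    print(f"{count} repetitions of complexity {comp}: {locations}")
```

In this sketch the two `for` loops and the two `print` statements are reported as repeated patterns, while the enclosing functions are not, because function names are left unnormalized.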
🎨 Why Use This?
- Reduce Technical Debt - Spot duplicated logic before it spreads
- Improve Code Quality - Identify refactoring opportunities
- Save Time - Automated detection instead of manual code review
- Learn Patterns - Understand your codebase's repetition hotspots
🛠️ Requirements
- Python 3.6+
- No external dependencies (uses only standard library)
🤝 Contributing
Found a bug or have an idea? Feel free to open an issue or submit a PR!
📄 License
This project is open source. Use it, modify it, share it!
Happy hunting! 🎯
File details
Details for the file python_repetition_hunter-1.0.3.tar.gz.
File metadata
- Download URL: python_repetition_hunter-1.0.3.tar.gz
- Upload date:
- Size: 6.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e6ebdf642bdc4a2cd0680a9328e6d373ccb807b219b7ff6c7e1e5b37aa81922c |
| MD5 | fcd5452be60c8108b8b36280d7e08de4 |
| BLAKE2b-256 | 9e1e01c874b1e5b75970c78b62b6ae684e1a9866837f00e4695e2f2d35d0f071 |
File details
Details for the file python_repetition_hunter-1.0.3-py3-none-any.whl.
File metadata
- Download URL: python_repetition_hunter-1.0.3-py3-none-any.whl
- Upload date:
- Size: 6.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 13e1c127f32f61807ded6f17dba3d99b8731a70935dab1f623a80392cefa0265 |
| MD5 | bbf0307338f13262a4a31793fbf86bac |
| BLAKE2b-256 | cae31b16b019dfd0169d596eb9b278de65e28a5beb9adfeef93841a0fdae8cad |