Hunt down code repetitions in Python projects

🔍 Python Repetition Hunter

Hunt down code repetitions like a pro detective

A Python tool that analyzes your codebase for repeated patterns and duplicated logic. It is based on the algorithm of the Clojure repetition-hunter tool and helps you find opportunities for refactoring and deduplication.

✨ Features

  • 🎯 Smart Pattern Detection - Finds structurally equivalent code, not just verbatim copy-paste
  • 🧠 AST-Based Analysis - Uses Abstract Syntax Trees for deep code understanding
  • 🔧 Variable Normalization - Detects patterns even when variable names differ
  • 📊 Complexity Scoring - Ranks findings by complexity × repetition count
  • 🎛️ Configurable Thresholds - Tune sensitivity to your needs
  • 📁 Recursive Directory Scanning - Analyze entire projects at once
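To illustrate why AST-based analysis and variable normalization matter, here is a minimal sketch using only the standard-library `ast` module. It is not the tool's own code; the snippets and the `shape` helper are made up for this example. Comparing dumped syntax trees ignores comments and spacing, but a renamed variable still produces a different tree, which is the gap that normalization closes.

```python
import ast

# Same logic; differs only in a comment and in spacing.
snippet_a = """
total = 0
for x in values:   # accumulate
    total += x
"""

snippet_b = """
total = 0
for x in values:
    total   +=   x
"""

# Same logic again, but the accumulator has a different name.
snippet_c = """
acc = 0
for x in values:
    acc += x
"""


def shape(source):
    """Return a structural fingerprint of the source (hypothetical helper).

    ast.dump leaves out line and column attributes by default,
    so only the tree structure and identifiers remain.
    """
    return ast.dump(ast.parse(source))


same_despite_formatting = shape(snippet_a) == shape(snippet_b)
same_despite_renaming = shape(snippet_a) == shape(snippet_c)

print(same_despite_formatting)  # True
print(same_despite_renaming)    # False: names must be normalized first
```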

📦 Installation

From PyPI (Recommended)

pip install python-repetition-hunter

From Source

git clone https://github.com/yourusername/python-repetition-hunter.git
cd python-repetition-hunter
pip install -e .

🚀 Quick Start

# After pip install, use the command directly
repetition-hunter my_code.py

# Scan entire project
repetition-hunter src/

# Find only high-complexity duplications
repetition-hunter --min-complexity 5 --min-repetition 3 src/

# Or run the module directly
python -m repetition_hunter my_code.py

📋 Usage

repetition-hunter [OPTIONS] PATHS...

Arguments:
  PATHS                    Python files or directories to analyze

Options:
  --min-complexity INT     Minimum complexity threshold (default: 3)
  --min-repetition INT     Minimum repetition count (default: 2)
  --sort [complexity|repetition]  Sort results by complexity or repetition (default: complexity)
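For readers who want to wrap or script the command, the documented options map naturally onto `argparse`. The sketch below is an assumption about how such an interface could be declared, not the project's actual source; `build_parser` is a hypothetical name, and only the flags and defaults listed above are taken from the documentation.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="repetition-hunter")
    parser.add_argument(
        "paths", nargs="+", metavar="PATHS",
        help="Python files or directories to analyze",
    )
    parser.add_argument("--min-complexity", type=int, default=3,
                        help="Minimum complexity threshold")
    parser.add_argument("--min-repetition", type=int, default=2,
                        help="Minimum repetition count")
    parser.add_argument("--sort", choices=["complexity", "repetition"],
                        default="complexity",
                        help="Sort results by complexity or repetition")
    return parser


# Parse the same arguments as the "high-complexity" Quick Start example.
args = build_parser().parse_args(
    ["--min-complexity", "5", "--min-repetition", "3", "src/"]
)
print(args.min_complexity, args.min_repetition, args.sort, args.paths)
```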

🎯 Example Output

2 repetitions of complexity 12

Line 15 - src/utils.py:
if data is None:
    return None
result = []
for item in data:
    if item > 0:
        result.append(item * 2)
return result

Line 28 - src/processor.py:
if items is None:
    return None
output = []
for element in items:
    if element > 0:
        output.append(element * 2)
return output

======================================================================
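One way to act on a finding like the one above is to pull the shared logic into a single helper and call it from both places. This is only a sketch: the name `double_positives` is made up for illustration, and where the helper should live depends on your project layout.

```python
def double_positives(values):
    """Return each positive value doubled, or None when given None."""
    if values is None:
        return None
    return [value * 2 for value in values if value > 0]


# Both src/utils.py and src/processor.py could then call the helper
# instead of carrying their own copy of the loop.
print(double_positives([3, -1, 0, 4]))  # [6, 8]
print(double_positives(None))           # None
```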

🧪 Test It Out

The project includes test_sample.py with intentional duplications to demonstrate the tool:

repetition-hunter test_sample.py

You'll see it catches patterns like:

  • Similar data processing loops with different variable names
  • Duplicate validation logic
  • Repeated calculation patterns

🔧 How It Works

  1. Parse - Converts Python code to Abstract Syntax Trees
  2. Extract - Identifies all meaningful code nodes (skipping trivial ones)
  3. Normalize - Replaces variable names with generic placeholders
  4. Group - Clusters identical normalized patterns
  5. Score - Ranks by complexity × repetition count
  6. Report - Shows original code locations for each pattern
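The steps above can be sketched with the standard library alone. This is an illustrative approximation under stated assumptions, not the tool's actual implementation: it treats only statements as candidates, measures complexity as the number of AST nodes, and reports nested matches separately. All function and class names here are hypothetical.

```python
import ast
import copy
from collections import defaultdict


class _Normalizer(ast.NodeTransformer):
    """Rename variables to placeholders (v0, v1, ...) in order of first use."""

    def __init__(self):
        self.names = {}

    def visit_Name(self, node):
        placeholder = self.names.setdefault(node.id, "v%d" % len(self.names))
        return ast.copy_location(ast.Name(id=placeholder, ctx=node.ctx), node)


def fingerprint(node):
    """Normalize a copy of the node and dump it as a comparable string."""
    clone = copy.deepcopy(node)  # keep the original tree untouched
    return ast.dump(_Normalizer().visit(clone))


def complexity(node):
    """Assumed metric: total number of AST nodes under this node."""
    return sum(1 for _ in ast.walk(node))


def find_repetitions(sources, min_complexity=3, min_repetition=2):
    """Group statements by normalized shape; rank by complexity x count."""
    groups = defaultdict(list)
    for filename, source in sources.items():
        for node in ast.walk(ast.parse(source)):          # 1. Parse
            if not isinstance(node, ast.stmt):            # 2. Extract
                continue
            size = complexity(node)
            if size < min_complexity:
                continue
            key = fingerprint(node)                       # 3. Normalize
            groups[key].append((filename, node.lineno, size))  # 4. Group
    findings = [g for g in groups.values() if len(g) >= min_repetition]
    findings.sort(key=lambda g: g[0][2] * len(g), reverse=True)  # 5. Score
    return findings


sources = {
    "a.py": (
        "def f(data):\n"
        "    result = []\n"
        "    for item in data:\n"
        "        if item > 0:\n"
        "            result.append(item * 2)\n"
        "    return result\n"
    ),
    "b.py": (
        "def g(items):\n"
        "    output = []\n"
        "    for element in items:\n"
        "        if element > 0:\n"
        "            output.append(element * 2)\n"
        "    return output\n"
    ),
}

findings = find_repetitions(sources)
top = [(name, line) for name, line, _ in findings[0]]     # 6. Report
print(top)  # [('a.py', 3), ('b.py', 3)]
```

The top-ranked finding is the `for` loop, which matches across both files even though every variable name differs. The enclosing functions do not match in this sketch because function and parameter names are left as-is.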

🎨 Why Use This?

  • Reduce Technical Debt - Spot duplicated logic before it spreads
  • Improve Code Quality - Identify refactoring opportunities
  • Save Time - Automated detection instead of manual code review
  • Learn Patterns - Understand your codebase's repetition hotspots

🛠️ Requirements

  • Python 3.6+
  • No external dependencies (uses only standard library)

🤝 Contributing

Found a bug or have an idea? Feel free to open an issue or submit a PR!

📄 License

This project is open source. Use it, modify it, share it!


Happy hunting! 🎯
