A tool to detect duplicate variable names within the same scope in Python files

check-duplicate-variables

A Python tool that detects duplicate variable names within the same scope (module or class) in Python files. It is designed to help identify potential copy-paste errors and refactoring issues, and to maintain code quality in automated testing frameworks, particularly page object models.

Features

  • AST-based Analysis: Uses Python's Abstract Syntax Tree (AST) for accurate parsing
  • Scope-aware Detection: Identifies duplicates within module-level and class-level scopes separately
  • Value Comparison: Reports whether duplicate assignments have the same or different values
  • Multiple Assignment Types: Handles standard assignments, type-annotated assignments, tuple unpacking, and attribute assignments
  • Recursive Directory Scanning: Automatically scans all Python files in a directory tree
  • CI/CD Friendly: Provides exit codes and formatted output suitable for continuous integration pipelines
  • Error Handling: Gracefully handles file I/O errors, encoding issues, and syntax errors

Installation

Install from PyPI

pip install check-duplicate-variables

Install from Source

git clone https://github.com/pandiyarajk/check-duplicate-python-variables.git
cd check-duplicate-python-variables
pip install .

Requirements

  • Python 3.7 or higher
  • No additional packages required (uses only the standard library: ast, os, sys, collections, typing)

Usage

Basic Usage

After installation, use the check-duplicate-variables command:

check-duplicate-variables

With no arguments, the tool looks for a pageobjects directory in the current working directory.

Custom Directory

Specify a custom directory to scan:

check-duplicate-variables /path/to/your/python/files

As a Python Module

You can also run it as a Python module:

python -m check_duplicate_variables [directory]

or

python -m check_duplicate_variables [filepath]

Standalone Script

If you have the standalone script, you can run it directly:

python check_duplicate_variables.py [directory]

Exit Codes

  • 0: No duplicates found and no errors encountered
  • 1: Duplicates found or I/O/parse errors occurred
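
Because exit code 1 covers both duplicates and I/O or parse errors, a wrapper script cannot tell the two apart from the code alone. The following is a minimal sketch of such a wrapper, not part of the tool; the helper name run_check is hypothetical.

```python
import subprocess
import sys
from typing import List


def run_check(command: List[str]) -> bool:
    """Run a checker command and return True when it exits with code 0.

    Hypothetical helper for illustration; it is not provided by the package.
    """
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if completed.returncode != 0:
        # A non-zero code means duplicates OR I/O/parse errors, so the
        # captured output is forwarded for a human to tell them apart.
        sys.stderr.write(completed.stdout)
        sys.stderr.write(completed.stderr)
    return completed.returncode == 0


# Intended use, assuming the tool is installed and on PATH:
#     ok = run_check(["check-duplicate-variables", "./pageobjects"])
```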

Output Format

The tool outputs one line per duplicate variable found:

file_path: variable_name (same values) - line1, line2, line3
file_path: variable_name (different values) - line1, line2

Example Output

pageobjects/login_page.py: username (same values) - 15, 23, 45
pageobjects/login_page.py: password (different values) - 18, 32
pageobjects/dashboard.py: button_text (same values) - 10, 25

The status in parentheses means:

  • (same values): All assignments have identical values (potential copy-paste error)
  • (different values): Assignments have different values (intentional reassignment or bug)
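
If you want to post-process the report (for example, to fail a build only on certain files), the one-line-per-duplicate format can be parsed with a regular expression. This is a sketch based only on the format shown above; the Finding type and parse_finding function are hypothetical names, and the pattern assumes the variable name contains no spaces.

```python
import re
from typing import List, NamedTuple, Optional


class Finding(NamedTuple):
    path: str
    name: str
    same_values: bool
    lines: List[int]


# Matches "file_path: variable_name (same values) - 15, 23, 45".
# The path group is greedy, so the last ": " before the name is the separator.
_LINE_RE = re.compile(
    r"^(?P<path>.+): (?P<name>\S+) "
    r"\((?P<status>same|different) values\) - "
    r"(?P<lines>\d+(?:, \d+)*)$"
)


def parse_finding(line: str) -> Optional[Finding]:
    """Parse one report line; return None if it does not match the format."""
    match = _LINE_RE.match(line.strip())
    if match is None:
        return None
    return Finding(
        path=match["path"],
        name=match["name"],
        same_values=match["status"] == "same",
        lines=[int(number) for number in match["lines"].split(", ")],
    )
```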

How It Works

  1. File Scanning: Recursively walks the specified directory and finds all .py files
  2. AST Parsing: Each file is parsed using Python's ast module
  3. Variable Tracking: The VariableAnalyzer visitor class traverses the AST and tracks all variable assignments per scope
  4. Duplicate Detection: Identifies variables assigned more than once in the same scope
  5. Value Comparison: Extracts and normalizes the source code of assigned values to compare them
  6. Reporting: Outputs results in a CI-friendly format with line numbers and value comparison status
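
Steps 2 to 5 can be sketched for a single source string as follows. This is an illustration of the general approach, not the package's VariableAnalyzer: the class and function names are hypothetical, only simple name targets are handled, function bodies are skipped for simplicity (the tool's behavior there may differ), and values are compared with ast.dump rather than normalized source text.

```python
import ast
from collections import defaultdict
from typing import Dict, List, Tuple


class ScopeAssignmentCollector(ast.NodeVisitor):
    """Collect simple-name assignments per module/class scope (sketch)."""

    def __init__(self) -> None:
        self._scope: List[str] = ["<module>"]
        # (scope, name) -> list of (line number, dumped value)
        self.assignments: Dict[Tuple[str, str], List[Tuple[int, str]]] = defaultdict(list)

    def visit_ClassDef(self, node: ast.ClassDef) -> None:
        # Entering a class body opens a new scope.
        self._scope.append(node.name)
        self.generic_visit(node)
        self._scope.pop()

    def visit_FunctionDef(self, node: ast.AST) -> None:
        # Assumption of this sketch: function bodies are not inspected.
        return

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_Assign(self, node: ast.Assign) -> None:
        value = ast.dump(node.value)  # structural form, no line numbers
        for target in node.targets:
            if isinstance(target, ast.Name):
                key = (".".join(self._scope), target.id)
                self.assignments[key].append((node.lineno, value))


def find_duplicates(source: str) -> List[Tuple[str, str, bool, List[int]]]:
    """Return (scope, name, same_values, lines) for names assigned more than once."""
    collector = ScopeAssignmentCollector()
    collector.visit(ast.parse(source))
    report = []
    for (scope, name), entries in collector.assignments.items():
        if len(entries) > 1:
            lines = [line for line, _ in entries]
            same = len({value for _, value in entries}) == 1
            report.append((scope, name, same, lines))
    return report
```

For a module that assigns timeout twice with different values and a class that assigns username twice with the same value, find_duplicates reports one entry per name, each tagged with its own scope.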

Supported Assignment Types

The tool detects duplicates in the following assignment patterns:

  • Simple assignments: x = 1
  • Type-annotated assignments: x: int = 1
  • Multiple assignments: x = y = 1
  • Tuple unpacking: a, b = 1, 2
  • Attribute assignments: self.attr = value
  • Subscript assignments: x[i] = value
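
To show how these patterns look at the AST level, the sketch below extracts one name per assignment target. It is an assumption about how such targets could be flattened, not the tool's actual code; iter_target_names and assigned_names are hypothetical helpers, and ast.unparse requires Python 3.9 or later (newer than the tool's stated minimum).

```python
import ast
from typing import Iterator, List


def iter_target_names(target: ast.expr) -> Iterator[str]:
    """Yield one name per assignment target, flattening tuple/list unpacking."""
    if isinstance(target, (ast.Tuple, ast.List)):
        for element in target.elts:
            yield from iter_target_names(element)
    elif isinstance(target, ast.Starred):
        yield from iter_target_names(target.value)
    else:
        # Name, Attribute and Subscript targets are rendered back to source text.
        yield ast.unparse(target)


def assigned_names(statement: str) -> List[str]:
    """Return the names assigned by a single assignment statement."""
    node = ast.parse(statement).body[0]
    if isinstance(node, ast.Assign):
        targets = node.targets  # several targets for chained "x = y = 1"
    elif isinstance(node, ast.AnnAssign):
        targets = [node.target]
    else:
        return []
    return [name for target in targets for name in iter_target_names(target)]
```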

Use Cases

  • Page Object Models: Detect duplicate locators or element definitions in Selenium/Playwright page objects
  • Code Quality: Identify copy-paste errors and refactoring opportunities
  • Pre-commit Hooks: Integrate into Git pre-commit hooks to catch issues before commit
  • CI/CD Pipelines: Add to GitHub Actions, GitLab CI, or other CI/CD workflows
  • Code Reviews: Automated detection of duplicate variable definitions

Integration Examples

GitHub Actions

name: Check Duplicate Variables

on: [push, pull_request]

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install check-duplicate-variables
        run: pip install check-duplicate-variables
      - name: Check for duplicate variables
        run: check-duplicate-variables ./src

Pre-commit Hook

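Create .git/hooks/pre-commit and make it executable (chmod +x .git/hooks/pre-commit):

#!/bin/bash
# Block the commit when the checker exits non-zero (duplicates or I/O/parse errors).
if ! check-duplicate-variables ./pageobjects; then
    echo "Duplicate variables or errors detected. Please fix before committing."
    exit 1
fi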

Limitations

  • Detects duplicates only within a single scope (module or class); the same name in different scopes (e.g., a module variable and a class variable) is not reported
  • Value comparison is based on source code normalization, not runtime evaluation
  • Does not handle dynamically generated variable names
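
The value-comparison limitation can be illustrated with a small sketch. It uses ast.dump as a stand-in for the tool's source-code normalization (the tool's actual normalization may differ), and normalized_value is a hypothetical helper: formatting differences vanish, but expressions that only evaluate to the same result still count as different.

```python
import ast


def normalized_value(expression: str) -> str:
    """Normalize an expression by parsing it and dumping its AST (no evaluation)."""
    return ast.dump(ast.parse(expression, mode="eval").body)


# Quoting and whitespace differences disappear after normalization...
assert normalized_value('"#login"') == normalized_value("'#login'")
assert normalized_value("(1,2)") == normalized_value("(1, 2)")

# ...but expressions that merely evaluate to the same result still differ.
assert normalized_value("1 + 1") != normalized_value("2")
assert normalized_value("60 * 5") != normalized_value("300")
```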

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Pandiyaraj Karuppasamy

Repository

https://github.com/pandiyarajk/check-duplicate-python-variables.git

Changelog

See CHANGE_LOG.md for version history and changes.
