💎 CodePolish

Detect AI-generated code anti-patterns in your Python codebase. CodePolish is a powerful, extensible CLI tool designed to catch AI-specific bugs and hallucinations that traditional linters miss. It helps keep your Python code clean, safe, and free from cross-language leaks.


🤔 Why CodePolish Exists

Traditional linters (like Flake8, Pylint) are excellent at catching syntax and style issues. However, AI-generated code introduces new, unique failure patterns they weren't designed to detect:

  • Hallucinated imports: Packages and functions that look real but don't exist in the environment.
  • Cross-language leakage: JavaScript, Java, or Ruby syntax slipping into Python (e.g., .push(), .length, .equals()).
  • Placeholder code: pass, TODO, and functions that do nothing.
  • Confident wrongness: Code that looks structurally perfect but contains subtle anti-patterns (like mutable defaults).

CodePolish specifically targets these AI-driven anomalies before they hit production.
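
As an illustration of the first failure pattern, here is a minimal sketch of how a hallucinated import could be detected. It is not CodePolish's actual implementation: the function name find_missing_imports is hypothetical, and the approach (parsing with ast and probing each top-level module with importlib.util.find_spec) is an assumption about one reasonable way to do it.

```python
import ast
import importlib.util


def find_missing_imports(source: str) -> list[str]:
    """Return top-level module names imported in `source` that cannot be resolved."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped: they need package context.
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            # find_spec returns None when the module is not installed/importable.
            if importlib.util.find_spec(top_level) is None and top_level not in missing:
                missing.append(top_level)
    return missing


sample = "import os\nimport hallucinated_module\nfrom json import loads\n"
print(find_missing_imports(sample))  # ['hallucinated_module']
```

A check like this depends on the environment it runs in, so the same file may pass in one virtual environment and fail in another.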


🎯 What It Catches

CodePolish evaluates code against four key axes of AI deficiencies:

| Axis | What It Detects | Examples |
|------|-----------------|----------|
| 📢 Noise | Debug artifacts, redundant or "hedging" comments | print("here"), # increment x above x += 1 |
| 🤥 Lies | Hallucinations, placeholders, wrong state | def process(): pass, hallucinated packages, mutable defaults |
| 💀 Soul | Over-engineering, bad style | Deep nesting, god functions, overly complex logic |
| 🏗️ Structure | General anti-patterns | Bare excepts, star imports, single-method classes |

⚡ Quick Start

pip install codepolish
codepolish check .

Example Output

                        CodePolish Anti-Patterns Found                         
+-----------------------------------------------------------------------------+
| File        | Line | Rule  | Severity | Description                | Points |
|-------------+------+-------+----------+----------------------------+--------|
| bad_code.py |    3 | AI002 | HIGH     | Hallucinated import:       |     25 |
|             |      |       |          | Module                     |        |
|             |      |       |          | 'hallucinated_module' not  |        |
|             |      |       |          | found in current           |        |
|             |      |       |          | environment.               |        |
| bad_code.py |    2 | JS001 | HIGH     | Possible cross-language    |     20 |
|             |      |       |          | Javascript leakage         |        |
|             |      |       |          | detected: 'push'           |        |
| bad_code.py |    1 | AI001 | CRITICAL | Mutable default argument   |     15 |
|             |      |       |          | detected. AI often makes   |        |
|             |      |       |          | this mistake. Use None     |        |
|             |      |       |          | instead.                   |        |
+-----------------------------------------------------------------------------+

Total Anti-Pattern Score: 60
Verdict: SLOPPY

🔍 Pattern Examples

Critical Severity: AI's Favorite Mistake

# 🚨 AI001: Mutable Default Argument
def process_items(items=[]):  # Bug: shared state between calls
    items.append(1)
    return items

# ✅ Fix: Use None and initialize inside
def process_items(items=None):
    if items is None:
        items = []
    items.append(1)
    return items
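
To see why the mutable default is a bug, the short sketch below calls both versions twice. The names process_items_buggy and process_items_fixed are renamed copies of the two functions above so they can coexist in one script.

```python
def process_items_buggy(items=[]):  # the default list is created once, at definition time
    items.append(1)
    return items


def process_items_fixed(items=None):
    if items is None:
        items = []  # a fresh list on every call
    items.append(1)
    return items


first = process_items_buggy()
second = process_items_buggy()
print(second)           # [1, 1]
print(first is second)  # True: both calls returned the same shared list

print(process_items_fixed())  # [1]
print(process_items_fixed())  # [1]
```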

High Severity: Cross-Language Leakage

# 🚨 JS001: JavaScript syntax in Python
my_list = [1, 2, 3]
my_list.push(4) # AttributeError: 'list' object has no attribute 'push'

# ✅ Fix: Use standard Python
my_list.append(4)

🌐 Language Patterns Detected

LLMs are trained on code from many languages. When generating Python, they sometimes produce syntax patterns from other environments. CodePolish spots these cross-language leaks:

| Origin Language | Common AI Mistakes | The Python Fix |
|-----------------|--------------------|----------------|
| JavaScript | .push(), .length, .forEach() | .append(), len(), for loop |
| Java | .equals(), .toString(), .isEmpty() | ==, str(), not obj |
| Ruby | .each, .nil?, .first, .last | for loop, is None, [0], [-1] |
| Go | fmt.Println(), nil | print(), None |
| C# | .Length, .Count, .ToLower() | len(), len(), .lower() |
| PHP | strlen(), array_push(), explode() | len(), .append(), .split() |
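
A lookup like the one in this table can be expressed as a small AST check. The sketch below is illustrative only: FOREIGN_ATTRIBUTES and find_foreign_attributes are hypothetical names, the mapping covers just a few rows of the table, and nothing here is taken from CodePolish's own rule code.

```python
import ast

# Hypothetical mapping built from a few rows of the table above.
FOREIGN_ATTRIBUTES = {
    "push": ".append()",
    "length": "len()",
    "forEach": "a for loop",
    "equals": "==",
    "toString": "str()",
    "isEmpty": "not obj",
}


def find_foreign_attributes(source: str) -> list[tuple[int, str, str]]:
    """Return (line, attribute, suggested fix) for each suspicious attribute access."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and node.attr in FOREIGN_ATTRIBUTES:
            hits.append((node.lineno, node.attr, FOREIGN_ATTRIBUTES[node.attr]))
    return sorted(hits)


sample = "my_list = [1, 2, 3]\nmy_list.push(4)\nprint(my_list.length)\n"
for line, attr, fix in find_foreign_attributes(sample):
    print(f"line {line}: .{attr} -> consider {fix}")
```

A name-based check cannot tell a list from a user-defined class that legitimately has a push or length attribute, so matches are best reported as possible leaks rather than certain errors.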

🚫 What CodePolish Is Not

CodePolish does not replace:

  • Human code review
  • Traditional linters (Pylint, Flake8, Ruff)
  • Type checkers (mypy, pyright)
  • Security scanners (Bandit, Semgrep)

It is designed to seamlessly complement them by catching the unique patterns introduced by Large Language Models.


📦 Installation

To install CodePolish from source, clone the repository and install it in editable mode with pip:

git clone https://github.com/vikramkrishna1705-beep/CodePolish.git
cd CodePolish
pip install -e .

⚙️ Configuration

You can configure CodePolish by adding a [tool.codepolish] section to your pyproject.toml file.

[tool.codepolish]
ignore_rules = ["JS001"]
exclude = ["tests/*", "venv/*", ".git/*"]

🪝 Pre-commit

To use CodePolish with pre-commit, add the following to your .pre-commit-config.yaml file:

repos:
  - repo: https://github.com/vikramkrishna1705-beep/CodePolish
    rev: v0.1.0  # Use the latest version or commit hash
    hooks:
      - id: codepolish

🛡️ Built-in Rules

CodePolish currently ships with the following built-in detection rules:

| Rule ID | Rule Name | Description | Severity |
|---------|-----------|-------------|----------|
| AI001 | MutableDefaultsRule | Detects dangerous mutable default arguments (e.g., def func(x=[])), a mistake AI often makes. | CRITICAL (15) |
| JS001 | JSLeakRule | Detects JavaScript leakage in Python code (e.g., .push(), .length, .forEach()). | HIGH (20) |
| AI002 | HallucinatedImportRule | Detects module imports that don't actually exist in the environment. | HIGH (25) |
| BP001 | BareExceptRule | Detects dangerous bare except: clauses, which also catch SystemExit and KeyboardInterrupt. | HIGH (20) |
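
BP001 is the only rule in this list without an example elsewhere in this README. As a rough sketch of what such a check involves (not the actual BareExceptRule code; bare_except_lines is a hypothetical name), a bare except: shows up in the AST as an ExceptHandler whose type is None:

```python
import ast


def bare_except_lines(source: str) -> list[int]:
    """Return the line numbers of `except:` clauses that name no exception type."""
    return sorted(
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


sample = (
    "try:\n"
    "    risky()\n"
    "except:\n"             # line 3: bare except, flagged
    "    pass\n"
    "try:\n"
    "    risky()\n"
    "except ValueError:\n"  # line 7: names a type, not flagged
    "    pass\n"
)
print(bare_except_lines(sample))  # [3]
```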

🧩 Adding Custom Rules

CodePolish uses a dynamic plugin architecture. To add a new rule:

  1. Create a new file in codepolish/rules/ starting with rule_ (e.g., rule_my_custom_check.py).
  2. Create a class that inherits from BaseRule and implements ast.NodeVisitor methods:

import ast
from codepolish.rules import BaseRule

class MyCustomRule(BaseRule):
    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        if node.name == "do_something_bad":
            self.record_issue(
                node=node,
                rule_id="CUST001",
                description="Found a bad function name.",
                points=10,
                severity="MEDIUM"
            )
        self.generic_visit(node)

CodePolish will automatically discover and run your new rule!
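
To experiment with the visitor logic before dropping the file into codepolish/rules/, the rule can be exercised against a stand-in base class. StandInBaseRule below is hypothetical: it only mimics the record_issue call shown above, and the real BaseRule may store issues differently or require constructor arguments.

```python
import ast


class StandInBaseRule(ast.NodeVisitor):
    """Hypothetical stand-in for codepolish.rules.BaseRule (the real class may differ)."""

    def __init__(self) -> None:
        self.issues: list[dict] = []

    def record_issue(self, node, rule_id, description, points, severity) -> None:
        self.issues.append(
            {
                "line": node.lineno,
                "rule_id": rule_id,
                "description": description,
                "points": points,
                "severity": severity,
            }
        )


class MyCustomRule(StandInBaseRule):
    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        if node.name == "do_something_bad":
            self.record_issue(
                node=node,
                rule_id="CUST001",
                description="Found a bad function name.",
                points=10,
                severity="MEDIUM",
            )
        self.generic_visit(node)  # keep walking into nested definitions


sample = "def fine():\n    pass\n\ndef do_something_bad():\n    pass\n"
rule = MyCustomRule()
rule.visit(ast.parse(sample))
print(rule.issues)
```

Here the rule records one issue, on line 4 of the sample, where do_something_bad is defined.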


🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page.


🙏 Acknowledgments

  • sloppylint: Major inspiration for the concept of linting AI-generated slop.
  • KarpeSlop: The original AI Slop Linter for TypeScript.
  • Andrej Karpathy's commentary on AI-generated code quality.
  • Counterfeit Code - MIT research on "looks right but doesn't work" patterns.

📝 License

This project is licensed under the MIT License.
