Copy Paste Is The Devil — language-agnostic code clone detection

CPITD: Copy Paste Is The Devil

A static code analysis tool that rakes you over the coals for using copy/paste. Because copy/paste is the devil. Language agnostic, and blazingly fast.


Installation

pip install cpitd

Requires Python 3.10+.

Development

For development (linting, tests, docs):

git clone https://github.com/scythia-marrow/cpitd.git
cd cpitd
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
pre-commit install

Quick Start

# Scan current directory
cpitd

# Scan specific paths
cpitd src/ lib/

# JSON output for CI pipelines
cpitd --format json src/ | jq '.[]'
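In a CI job you may prefer to consume the JSON from a script instead of jq. The following is a minimal sketch, not part of cpitd. It assumes the JSON report is a top-level array with one element per clone group (as the jq filter above suggests, though the schema is not documented here), and it makes no assumption about cpitd's exit status. The helper names count_clone_groups and run_cpitd are hypothetical.

```python
import json
import subprocess


def count_clone_groups(report_text):
    """Count top-level entries in a cpitd JSON report.

    Assumption: the report is a JSON array with one element per clone group.
    """
    report = json.loads(report_text)
    if not isinstance(report, list):
        raise ValueError("expected a top-level JSON array")
    return len(report)


def run_cpitd(paths):
    """Run `cpitd --format json` on the given paths and return its stdout."""
    result = subprocess.run(
        ["cpitd", "--format", "json", *paths],
        capture_output=True,
        text=True,
        check=False,  # cpitd's exit-code convention is not assumed here
    )
    return result.stdout
```

A CI step could then call count_clone_groups(run_cpitd(["src/"])) and fail the build when the count is non-zero.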

Configuration

Settings can live in pyproject.toml so you don't repeat yourself on every invocation:

[tool.cpitd]
format = "human"
ignore = ["tests/fixtures/*", "vendor/*"]
suppress = ["*@abstractmethod*"]

CLI flags take precedence over file config. For single-value options such as format, the CLI value replaces the file value. For list options (ignore, suppress, languages), CLI values are appended to file values rather than replacing them.
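The precedence rule can be pictured with a small sketch. This is an illustration of the behavior described above, not cpitd's actual implementation. The function merge_config and the convention that an omitted flag is represented as None are assumptions made for the example.

```python
# Options that accumulate across pyproject.toml and the command line.
LIST_OPTIONS = ("ignore", "suppress", "languages")


def merge_config(file_config, cli_config):
    """Combine [tool.cpitd] settings with CLI flags.

    Single-value options: the CLI value replaces the file value.
    List options: CLI values are appended after the file values.
    A CLI value of None means the flag was not given (assumed convention).
    """
    merged = dict(file_config)
    for key, value in cli_config.items():
        if value is None:
            continue
        if key in LIST_OPTIONS:
            merged[key] = list(file_config.get(key, [])) + list(value)
        else:
            merged[key] = value
    return merged


file_config = {"format": "human", "suppress": ["*@abstractmethod*"]}
cli_config = {"format": "json", "suppress": ["*@override*"], "ignore": None}
merged = merge_config(file_config, cli_config)
```

Here merged ends up with format set to "json" and a suppress list containing both patterns, file values first.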


Suppressing False Positives

Some clones are intentional—boilerplate required by a language or framework. Use --suppress to silence them.

--suppress PATTERN accepts fnmatch glob patterns matched against raw source lines (including one line of context above each clone chunk, to catch decorators). If any line on either side of a clone pair matches any pattern, the group is suppressed.
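The matching rule can be sketched with Python's standard fnmatch module. This is a simplified illustration under stated assumptions, not cpitd's real filter. The helpers chunk_with_context and is_suppressed are hypothetical, and line ranges are taken to be 0-based with an exclusive end.

```python
from fnmatch import fnmatchcase


def chunk_with_context(source_lines, start, end):
    """Return lines start..end (end exclusive) plus one line of context above."""
    return source_lines[max(0, start - 1):end]


def is_suppressed(side_a, side_b, patterns):
    """True if any raw line on either side matches any glob pattern."""
    return any(
        fnmatchcase(line, pattern)
        for line in (*side_a, *side_b)
        for pattern in patterns
    )


source = [
    "class Reader(Base):",
    "    @abstractmethod",
    "    def read(self, n):",
    "        ...",
]
other_side = ["    def read(self, n):", "        ..."]

# The clone chunk covers only the def and its body (lines 2 and 3);
# the context line above it pulls in the decorator.
with_context = chunk_with_context(source, 2, 4)
suppressed = is_suppressed(with_context, other_side, ["*@abstractmethod*"])
```

Without the context line, the decorator would fall outside the chunk and the pattern would have nothing to match, which is why the extra line matters for decorator-based patterns.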

You can also annotate specific sites inline—the filter reads raw source, so comments are visible even though the tokenizer strips them. Add a suppression comment to any line inside or immediately above a clone:

Language    Inline annotation
Python      # cpitd: suppress
C/C++       // cpitd: suppress
Rust        // cpitd: suppress

Then pass --suppress "*cpitd: suppress*" (or set it in pyproject.toml).

Python

Abstract base class implementations — ABCs force you to repeat method signatures across subclasses. Suppress them with:

cpitd src/ --suppress "*@abstractmethod*"

Or in pyproject.toml:

[tool.cpitd]
suppress = ["*@abstractmethod*", "*@override*"]

Protocol / interface boilerplate — if you use a decorator to mark protocol implementations (e.g. @protocol_impl), pass that pattern:

cpitd src/ --suppress "*@protocol_impl*"

C / C++

Header guards — every .h file has them. Suppress both styles:

cpitd src/ \
  --suppress "*#ifndef *_H*" \
  --suppress "*#pragma once*"

Or in pyproject.toml:

[tool.cpitd]
suppress = ["*#ifndef *_H*", "*#pragma once*"]
ignore = ["**/*.h"]   # alternatively, just skip headers entirely

Rust

Trait implementations — implementing the same trait for multiple types produces near-identical impl blocks. Suppress by matching the impl ... for ... line:

cpitd src/ --suppress "*impl * for *"

Derive macros: #[derive(Debug, Clone, PartialEq)] lines repeat everywhere but are rarely meaningful clones. Suppress them:

cpitd src/ --suppress "*#[derive(*"

In pyproject.toml:

[tool.cpitd]
suppress = [
    "*impl*Display*for*",
    "*impl*From*for*",
    "*#[derive(*",
]
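One detail worth knowing when writing patterns that contain square brackets: in fnmatch syntax, [ opens a character class whenever a closing ] follows later in the pattern. The sketch below uses Python's standard fnmatch module to show the effect. It assumes cpitd's matching behaves like that module, which the description of --suppress suggests but does not state outright.

```python
from fnmatch import fnmatchcase

line = "#[derive(Debug, Clone, PartialEq)]"

# No closing bracket in the pattern, so "[" is treated as a literal character.
open_only = fnmatchcase(line, "*#[derive(*")

# With a closing bracket, "[derive(...)]" becomes a character class that
# matches a single character, so the literal "#[" in the line is not matched.
closed_class = fnmatchcase(line, "*#[derive(Debug, Clone, PartialEq)]*")

# Writing the opening bracket as "[[]" forces a literal "[" again.
escaped = fnmatchcase(line, "*#[[]derive(Debug, Clone, PartialEq)]*")

print(open_only, closed_class, escaped)
```

The patterns shown above for Rust stop before any closing bracket, so they sidestep the issue. If you need to match a full attribute including its closing ], wrapping the opening bracket as [[] is the usual fnmatch way to do it.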

Pre-commit Hook

Add cpitd to .pre-commit-config.yaml as a local hook (cpitd must be installed in the environment where hooks run):

repos:
  - repo: local
    hooks:
      - id: cpitd-clone-detection
        name: cpitd (clone detection)
        entry: cpitd src/ --ignore "tests/fixtures/*"
        language: system
        pass_filenames: false
        always_run: true

Then install the hook:

pre-commit install

The hook runs cpitd on every commit. Tune entry with --suppress or any other flag, or lean on [tool.cpitd] in pyproject.toml so the hook entry stays short.

To run the hook manually without committing:

pre-commit run cpitd-clone-detection

CLI Options Reference

Flag                   Default   Description
--format human|json    human     Output format
--ignore PATTERN                 Glob patterns to exclude (repeatable)
--languages LANG                 Restrict to specific languages (repeatable)
--suppress PATTERN               Suppress clones whose source lines match (repeatable)
