Deterministic architectural drift detection for AI-accelerated Python repositories through cross-file coherence analysis

Drift — Deterministic architectural drift detection for AI-accelerated Python codebases

Repo: sauremilk/drift · Package: drift-analyzer · Command: drift · Requires: Python 3.11+

Start here

What is drift?

Drift is a deterministic static analyzer for architectural drift in AI-accelerated Python repositories. It detects architecture erosion through cross-file coherence problems such as pattern fragmentation, architecture violations, and structural hotspots before they become normal team habits.

Who is it for?

Python teams with fast-growing codebases where architecture matters
Tech leads who want fast structural feedback, not just style or type checks
Teams using AI coding tools and seeing more cross-file drift across modules

1-minute quickstart

pip install -q drift-analyzer
drift analyze --repo .

That gives you a drift score, the hottest modules, and actionable findings in one run.

Choose your path

Not sure where to start? Use the central docs routing page: Start Here.
Casual user: install drift, run drift analyze --repo ., and start with Quick Start and Configuration.
Evaluator: review Example Findings, Trust and Evidence, and Stability and Release Status before deciding on rollout.
Contributor: use CONTRIBUTING.md once you are ready to submit a fix, improve docs, or work on signal quality.
Core maintainer: use CONTRIBUTING.md, DEVELOPER.md, and POLICY.md for the full quality, architecture, and release guardrails.

Release status

The PyPI classifier remains Development Status :: 3 - Alpha intentionally.

That is not a claim that the whole tool is immature. It is a conservative release signal for a product whose core Python analysis is already usable, while some adjacent surfaces still have mixed maturity.

Area	Status	What that means today
Core Python analysis	Stable	Primary analysis path, CLI usage, and main signal set are the most production-ready parts of drift.
CI and SARIF workflow	Stable	Suitable for report-only rollout now, then selective gating once teams calibrate findings locally.
TypeScript support	Experimental	Optional support exists, but Python remains the primary target and the more validated path.
Embeddings-based parts	Optional / experimental	Not required for the core detector path and should be treated as exploratory add-ons.
Benchmark methodology	Evolving	Public and reproducible, but still conservative in its claims and not the final word on every repository shape.

Why keep Alpha for now: release signaling should reflect the least mature user-facing surfaces, not only the strongest path. Drift already has stable core workflows, but the overall product story still includes experimental and evolving areas.

See Stability and Release Status for the explicit matrix and the criteria for a future move toward Beta.

Example output

DRIFT SCORE  0.52
Top finding: PFS 0.85  Error handling split 4 ways  at src/api/routes.py:42
Next action: consolidate variants into one shared pattern

If you want CI, use this

- uses: sauremilk/drift@v1
  with:
    fail-on: none
    upload-sarif: "true"

Start report-only first. Tighten to fail-on: high once the team understands the signal quality in its own repo.

Try it on a demo project

git clone https://github.com/sauremilk/drift.git
cd drift/examples/demo-project
pip install -q drift-analyzer
drift analyze --repo .

The demo project contains intentional drift patterns, so you get useful findings immediately.

Why drift

When your team uses GitHub Copilot, Cursor, or other AI coding tools, code passes CI while the repository quietly accumulates architectural drift:

Pattern fragmentation: error handling is implemented 4 different ways across the same service
Boundary violations: the API layer imports directly from the database layer
Silent duplication: AI generates a new validator instead of finding the existing one
Churn hotspots: the same files change every sprint because the structure is unclear

Your linter, type checker, and test suite won't catch this. Drift does — deterministically, without any LLM in the pipeline. That makes drift useful for architectural drift detection in AI-accelerated Python codebases, with architecture erosion analysis and cross-file coherence findings that teams can act on.

What drift catches that other checks usually don't

Ruff / formatters / type checkers: local correctness and style signals, not cross-module coherence.
Semgrep / CodeQL / security scanners: risky flows and policy violations, not whether patterns fragment across a codebase.
Sonar / maintainability dashboards: broad quality heuristics, not a drift-specific score grounded in reproducible signal families.

Current public evidence: 15 real-world repositories in the study corpus, 15 scoring signals (all contributing to the composite score), and auto-calibration that rebalances weights at runtime. See the full study and Trust and limitations.

Use cases

Pattern fragmentation in a connector layer

Problem: A FastAPI service has 4 connectors, each implementing error handling differently — bare except, custom exceptions, retry decorators, and silent fallbacks.

Solution:

drift analyze --repo . --sort-by impact --max-findings 5

Output: PFS finding with score 0.96 — "26 error_handling variants in connectors/" — shows exactly which files diverge and suggests consolidation.
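
To make "error-handling variants" concrete, here is a minimal sketch that counts distinct exception-handler shapes with Python's standard `ast` module. It is an illustration only: the variant categories and the function name `handler_variants` are assumptions made for this example and do not describe how drift computes PFS internally.

```python
import ast
from collections import Counter


def handler_variants(source: str) -> Counter:
    """Count exception-handler shapes in `source` (illustrative categories)."""
    counts: Counter = Counter()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        if node.type is None:
            kind = "bare_except"
        elif isinstance(node.type, ast.Name) and node.type.id in {"Exception", "BaseException"}:
            kind = "broad_except"
        else:
            kind = "specific_except"
        # A handler whose body is only `pass` swallows the error silently.
        if all(isinstance(stmt, ast.Pass) for stmt in node.body):
            kind += "_silent"
        counts[kind] += 1
    return counts


# Hypothetical connector code; it is only parsed, never executed.
SAMPLE = '''
def a():
    try:
        fetch()
    except:
        pass

def b():
    try:
        fetch()
    except Exception:
        log()

def c():
    try:
        fetch()
    except TimeoutError:
        retry()
'''

variants = handler_variants(SAMPLE)
print(f"{len(variants)} handler variants:", dict(variants))
```

Three functions that each handle the same failure differently produce three variants here. A real fragmentation signal would also need to group handlers by module and compare them against the dominant pattern.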

Architecture boundary violation in a monorepo

Problem: A module in the API layer imports directly from the database layer, crossing a layer boundary and breaking test isolation.

Solution:

drift check --fail-on high

Output: AVS finding — "DB import in API layer at src/api/auth.py:18" — blocks the CI pipeline until the import direction is fixed.
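
The kind of check behind a finding like this can be sketched in a few lines with the standard `ast` module. This is a simplified illustration under stated assumptions, not drift's AVS implementation: the function name `forbidden_imports`, the package name `src.db`, and the sample module are hypothetical, and relative imports are ignored.

```python
import ast


def forbidden_imports(source: str, forbidden_prefix: str) -> list[tuple[int, str]]:
    """Return (line, module) for each absolute import of `forbidden_prefix` or a submodule."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]
        else:
            continue  # not an import, or a relative import (skipped in this sketch)
        for module in modules:
            if module == forbidden_prefix or module.startswith(forbidden_prefix + "."):
                hits.append((node.lineno, module))
    return sorted(hits)


# Hypothetical API-layer module that reaches into the database layer.
API_MODULE = """\
import json
from src.db.models import User
import src.db.session
"""

print(forbidden_imports(API_MODULE, "src.db"))
```

The prefix match uses a trailing dot so that a package such as `src.dbx` is not mistaken for `src.db`.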

Duplicate utility code from AI-generated scaffolding

Problem: AI code generation created 6 identical _run_async() helper functions across separate task files instead of finding the existing shared utility.

Solution:

drift analyze --repo . --format json | jq '.findings[] | select(.signal=="MDS")'

Output: MDS findings listing all 6 locations with similarity scores ≥ 0.95, enabling a single extract-to-shared-module refactoring.
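
If jq is not available, the same filter is a few lines of Python. The sketch below assumes only what the jq expression above implies: a top-level `findings` array whose entries carry a `signal` key. The `score` and `file` keys and the sample values are hypothetical and may not match the real JSON schema.

```python
import json


def findings_for_signal(report: dict, signal: str) -> list[dict]:
    """Return the findings whose `signal` field equals `signal`."""
    return [f for f in report.get("findings", []) if f.get("signal") == signal]


# Hypothetical report; only `findings` and `signal` are taken from the jq filter.
SAMPLE_REPORT = json.loads("""
{"findings": [
  {"signal": "MDS", "score": 0.97, "file": "tasks/sync.py"},
  {"signal": "PFS", "score": 0.85, "file": "src/api/routes.py"},
  {"signal": "MDS", "score": 0.95, "file": "tasks/export.py"}
]}
""")

for finding in findings_for_signal(SAMPLE_REPORT, "MDS"):
    print(finding["file"], finding["score"])
```

In practice the report would come from the output of `drift analyze --repo . --format json` saved to a file and loaded with `json.load`.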

Concrete example findings

If you are evaluating drift, the fastest way to understand the value is to look at concrete findings rather than abstract signal names.

See docs-site/product/example-findings.md for 5 short examples with code, the likely finding, why it matters, and how to fix it:

Pattern fragmentation: three incompatible error-handling patterns in one module
Mutant duplicate: two copied formatter functions that will drift apart later
Architecture violation: a db/ module importing from api/
Doc-implementation drift: README structure that no longer matches the repo
Temporal volatility: a small file that became a churn hotspot in git history

More setup options

Full GitHub Action (recommended: start report-only)

name: Drift

on: [push, pull_request]

jobs:
  drift:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: sauremilk/drift@v1
        with:
          fail-on: none           # report findings without blocking CI
          upload-sarif: "true"    # findings appear as PR annotations

Once the team has reviewed findings for a few sprints, tighten the gate:

      - uses: sauremilk/drift@v1
        with:
          fail-on: high           # block only high-severity findings
          upload-sarif: "true"

CI gate (local)

drift check --fail-on none    # report-only
drift check --fail-on high    # block on high-severity findings
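
The gating behaviour can be pictured as a simple threshold check. The sketch below is an assumption-laden illustration, not drift's code: only the values `none` and `high` appear in this README, so the `low` and `medium` levels, their ordering, and the function name `should_fail` are hypothetical.

```python
# Assumed severity ladder; only "high" is confirmed by the commands above.
SEVERITY_ORDER = ("low", "medium", "high")


def should_fail(finding_severities: list[str], fail_on: str) -> bool:
    """Return True if any finding is at or above the `fail_on` threshold."""
    if fail_on == "none":
        return False  # report-only: never block the pipeline
    threshold = SEVERITY_ORDER.index(fail_on)
    return any(SEVERITY_ORDER.index(s) >= threshold for s in finding_severities)


print(should_fail(["low", "high"], "none"))    # report-only run
print(should_fail(["low", "medium"], "high"))  # nothing reaches the gate
print(should_fail(["medium", "high"], "high")) # one finding trips the gate
```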

pre-commit hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: drift
        name: drift
        entry: drift check --fail-on high
        language: system
        pass_filenames: false
        always_run: true

What you get

╭─ drift analyze  myproject/ ──────────────────────────────────────────────────╮
│  DRIFT SCORE  0.52  │  87 files  │  412 functions  │  AI: 34%  │  2.1s      │
╰──────────────────────────────────────────────────────────────────────────────╯

                        Module Drift Ranking
  Module                           Score  Findings  Top Signal
  ─────────────────────────────────────────────────────────────
  src/api/routes/                   0.71       12   PFS 0.85
  src/services/auth/                0.58        7   AVS 0.72
  src/db/models/                    0.41        4   MDS 0.61

┌──┬────────┬───────┬──────────────────────────────────────┬──────────────────────┐
│  │ Signal │ Score │ Title                                │ Location             │
├──┼────────┼───────┼──────────────────────────────────────┼──────────────────────┤
│◉ │ PFS    │  0.85 │ Error handling split 4 ways          │ src/api/routes.py:42 │
│◉ │ AVS    │  0.72 │ DB import in API layer               │ src/api/auth.py:18   │
│○ │ MDS    │  0.61 │ 3 near-identical validators          │ src/utils/valid.py   │
└──┴────────┴───────┴──────────────────────────────────────┴──────────────────────┘

Drift scores all 15 signal families:

PFS Pattern Fragmentation (0.16)
AVS Architecture Violations (0.16)
MDS Mutant Duplicates (0.13)
TVS Temporal Volatility (0.13)
EDS Explainability Deficit (0.09)
SMS System Misalignment (0.08)
DIA Doc-Implementation Drift (0.04)
BEM Broad Exception Monoculture (0.04)
TPD Test Polarity Deficit (0.04)
NBV Naming Contract Violation (0.04)
GCD Guard Clause Deficit (0.03)
BAT Bypass Accumulation (0.03)
ECM Exception Contract Drift (0.03)
COD Cohesion Deficit (0.01)
CCC Co-Change Coupling (0.005)
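
One plausible way to read these weights is as a weighted mean over per-signal scores. The sketch below is a guess at the shape of such an aggregation, not the documented formula: the function name `composite_score` is hypothetical, and auto-calibration (which rebalances weights at runtime) is not modelled. The weights as listed add up to 1.015, so the sketch divides by their sum to keep the result in the 0 to 1 range.

```python
# Weights copied from the list above.
WEIGHTS = {
    "PFS": 0.16, "AVS": 0.16, "MDS": 0.13, "TVS": 0.13, "EDS": 0.09,
    "SMS": 0.08, "DIA": 0.04, "BEM": 0.04, "TPD": 0.04, "NBV": 0.04,
    "GCD": 0.03, "BAT": 0.03, "ECM": 0.03, "COD": 0.01, "CCC": 0.005,
}


def composite_score(signal_scores: dict[str, float]) -> float:
    """Weighted mean of per-signal scores; signals that are absent count as 0.0."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[name] * signal_scores.get(name, 0.0) for name in WEIGHTS)
    return weighted / total_weight


# Hypothetical per-signal scores for one repository.
example = {"PFS": 0.85, "AVS": 0.72, "MDS": 0.61}
print(round(composite_score(example), 3))
```

Because PFS and AVS carry the largest weights, a repository with strong fragmentation or boundary findings moves the composite far more than one with only co-change coupling.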

How drift compares

Data sourced from STUDY.md §9 and benchmark_results/.

Capability	drift	SonarQube	pylint / mypy	jscpd / CPD
Pattern Fragmentation (N variants per module)	Yes	No	No	No
Near-Duplicate Detection (AST structural)	Yes	Partial (text)	No	Yes (text)
Architecture Violation (layer + circular deps)	Yes	Partial	No	No
Temporal Volatility (churn anomalies)	Yes	No	No	No
System Misalignment (novel imports)	Yes	No	No	No
Composite Health Score	Yes	Yes (different)	No	No
Zero Config (no server needed)	Yes	No (server)	Partial	Yes
SARIF Output (GitHub Code Scanning)	Yes	Yes	No	No
TypeScript Support	Optional ¹	Yes	No	Yes

¹ Experimental via drift-analyzer[typescript]. Python is the primary target.

Drift is designed to complement linters and security scanners, not replace them. Recommended stack: linter (style) + type checker (types) + drift (coherence) + security scanner (SAST).

Full comparison: STUDY.md §9 — Tool Landscape Comparison

Ideal for

Python teams using AI coding tools (Copilot, Cursor, Cody) in existing codebases
Tech leads who want to catch structural erosion before it becomes team habit
CI pipelines that need a deterministic architecture check without LLM infrastructure

Teams often describe drift as an architectural linter for repositories where GitHub Copilot and similar assistants accelerate local delivery faster than shared design conventions can keep up.

Who should adopt now

teams with Python 3.11+ already available locally and in CI
repositories with 20+ files and recurring refactors across modules
teams using AI assistance enough that copy-modify drift and boundary erosion are real review problems

Who should wait

tiny repos where a few findings would dominate the score
teams looking for bug finding, security review, or strict pass/fail quality gates on day one
teams without Python 3.11+ in their execution path yet

Best first target

Drift works best on Python repositories with 20+ files and some history. If you see too many findings on the first run:

Start with drift check --fail-on none to just observe.
Focus on findings with score ≥ 0.7 — those have the strongest signal.
Ignore generated code or vendor directories (configure exclusions in drift.yaml).
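
For a first pass over a noisy report, the last two steps can also be applied after the fact. This is a sketch under assumptions: the finding keys `score` and `file`, the exclusion patterns, and the function name `triage` are hypothetical, and post-filtering is a stand-in for, not a description of, the exclusions configured in drift.yaml.

```python
from fnmatch import fnmatchcase


def triage(findings, min_score=0.7, exclude=("vendor/*", "*/generated/*")):
    """Keep findings at or above `min_score` outside excluded paths, strongest first."""
    kept = [
        f for f in findings
        if f["score"] >= min_score
        and not any(fnmatchcase(f["file"], pattern) for pattern in exclude)
    ]
    return sorted(kept, key=lambda f: f["score"], reverse=True)


# Hypothetical findings.
FINDINGS = [
    {"signal": "AVS", "score": 0.72, "file": "src/api/auth.py"},
    {"signal": "MDS", "score": 0.61, "file": "src/utils/valid.py"},
    {"signal": "MDS", "score": 0.90, "file": "vendor/lib/helpers.py"},
    {"signal": "PFS", "score": 0.85, "file": "src/api/routes.py"},
]

for f in triage(FINDINGS):
    print(f["signal"], f["score"], f["file"])
```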

Don't use drift if...

you expect bug finding, security scanning, or type safety enforcement
you need zero false positives on a tiny repository from day one
you want one absolute score to replace code review judgment

Drift is most useful when teams treat the score as orientation and the findings as investigation prompts.

Small-team rollout

The safest adoption path is progressive:

Start with drift analyze locally and review the top findings.
Add drift check to CI in report-only mode for a short period.
Gate only on high-severity findings once the team understands the output.
Tune config and policies only after reviewing real findings in your repo.

Trust and limitations

Public claims safe to repeat for v0.8.2: Drift is deterministic, benchmarked on 15 real-world repositories in the current study corpus, and uses 15 scoring signals with auto-calibration for runtime weight rebalancing and small-repo noise suppression.

What's limited: Benchmark validation is single-rater; not yet independently replicated. Small repos can be noisy. Temporal signals depend on clone depth. The composite score is orientation, not a verdict.

What's next: Independent external validation, multi-rater ground truth, signal-specific confidence intervals.

Drift is designed to earn trust through determinism and reproducibility:

no LLMs in the detection pipeline
reproducible CLI and CI output
signal-specific interpretation instead of score-only messaging
explicit benchmarking and known-limitations documentation

Interpreting the score

The drift score measures structural entropy, not code quality. Keep these principles in mind:

Interpret deltas, not snapshots. Use drift trend to track changes over time. A single score in isolation has limited meaning.
Temporary increases are expected during migrations. Two coexisting patterns (old and new) will raise PFS/MDS signals. That reflects a migration in progress, not a defect.
Deliberate polymorphism is not erosion. Strategy, Adapter, and Plugin patterns produce structural similarity that MDS flags as duplication. Findings include a deliberate_pattern_risk hint — verify intent before acting.
The score rewards reduction, not correctness. Deleting code lowers the score just like refactoring does. Do not optimize for a low score — optimize for understood, intentional structure.
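
Reading deltas rather than snapshots amounts to comparing each run with the one before it. The sketch below uses hand-recorded scores to show the idea; the labels, the values, and the function name `score_deltas` are hypothetical, and it does not reproduce the output of drift trend.

```python
def score_deltas(history: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Return (label, change versus the previous run) for every run after the first."""
    return [
        (label, round(score - prev_score, 3))
        for (_, prev_score), (label, score) in zip(history, history[1:])
    ]


# Hypothetical drift scores recorded once per sprint.
HISTORY = [("sprint-1", 0.48), ("sprint-2", 0.52), ("sprint-3", 0.49)]

for label, delta in score_deltas(HISTORY):
    direction = "up" if delta > 0 else "down" if delta < 0 else "flat"
    print(f"{label}: {delta:+.2f} ({direction})")
```

A rise during a migration followed by a fall once the old pattern is removed is the expected shape; a steady climb with no migration under way is the case worth investigating.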

For a detailed discussion of epistemological boundaries (what drift can and cannot see), see STUDY.md §14.

Drift vs. erosion: Without layer_boundaries in drift.yaml, drift detects emergent drift — structural patterns that diverge without explicit prohibition. With configured layer_boundaries, drift additionally performs conformance checking against a defined architecture. Both modes are complementary: drift does not replace dedicated architecture conformance frameworks (e.g. PyTestArch for executable layer rules in pytest), but catches cross-file coherence issues those tools do not model.

Start with the strongest, most actionable findings first. If a signal is noisy for your repository shape, tune or de-emphasize it instead of forcing an early hard gate.

Contributing

Drift seeks contributions that increase the credibility of static architecture findings: reproducible cases, better explainability, fewer false alarms, and clearer next actions.

If you run drift on your codebase and get surprising results — good or bad — please open an issue or start a discussion.

New here? Start contributing

Pick an issue labelled good first issue
git clone https://github.com/sauremilk/drift.git && cd drift && make install
make test-fast — confirm everything passes
Make your change, then open a PR

Typical first contributions:

Add a ground-truth fixture for a false positive or false negative
Improve a finding's explanation text to be more actionable
Write a test for an untested edge case
Fix or extend signal documentation with a concrete example

What we value most: reproducibility, explainability, false-alarm reduction.
What we deprioritize: new output formats without insight value, comfort features, complexity without analysis improvement.

See CONTRIBUTING.md for the full guide and ROADMAP.md for current priorities.

Status

Drift has a working CLI, GitHub Action, configuration, JSON/SARIF output, benchmark material, and active tests.

Current release posture:

PyPI classifier remains Alpha intentionally
core Python analysis: stable
CI and SARIF workflow: stable
TypeScript support: experimental
embeddings-based parts: optional / experimental
benchmark methodology: evolving

Rationale and matrix: Stability and Release Status

License

MIT. See LICENSE.

Download files

Source Distribution

drift_analyzer-0.9.0.tar.gz (816.6 kB)

Uploaded Mar 28, 2026 Source

Built Distribution

drift_analyzer-0.9.0-py3-none-any.whl (181.5 kB)

Uploaded Mar 28, 2026 Python 3

File details

Details for the file drift_analyzer-0.9.0.tar.gz.

File metadata

Download URL: drift_analyzer-0.9.0.tar.gz
Upload date: Mar 28, 2026
Size: 816.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for drift_analyzer-0.9.0.tar.gz
Algorithm	Hash digest
SHA256	`1c7549d3428a29e4494adbd5605bb2b2129a499218721c814d18240487a0473b`
MD5	`daab6bcdd43dc81919f9411277f59d0b`
BLAKE2b-256	`55f7c3df5b8ee433cb773bf6f8a2201a735cc4b8644d48989418e7eeb2e79538`

File details

Details for the file drift_analyzer-0.9.0-py3-none-any.whl.

File metadata

Download URL: drift_analyzer-0.9.0-py3-none-any.whl
Upload date: Mar 28, 2026
Size: 181.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for drift_analyzer-0.9.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8ab079f869395828a821d217d38928e7afbb627056d027f97a0d1e110e3f5c30`
MD5	`cb19727fa7348c8a2766857b9099e3a4`
BLAKE2b-256	`dfc976801748f5df0c6decd8ccdf6a7a2c411e43ec2abdb3f556457a71af51f1`
