Differential execution tracer that finds the exact file, line, and root cause of any flaky test.

Project description

FLAKEMARK

pytest-flakemark — Find the exact line where your flaky test breaks.

Not "your test is flaky." The actual file. The actual line. The actual fix.

Built by Khushdeep Sharma.


The Problem

FAILED tests/test_login.py::test_user_session
[Flaky — rerunning]
PASSED tests/test_login.py::test_user_session

Every existing tool gives you this. It tells you nothing new.

What FLAKEMARK Gives You

--------------------------------------------------------
FLAKEMARK - Flaky Test Root Cause Found
--------------------------------------------------------
File:       tests/test_login.py
Line:       47
Function:   test_user_session
Type:       timing_delta
Cause:      Race condition or timing dependency

Detail:     Line 47: 1.2ms (pass) vs 148.3ms (fail) — 124x timing difference.

Fix:        Replace time.sleep(N) with threading.Event().wait()
            or asyncio.wait_for(). Never hardcode sleep durations in tests.

Confidence: 85%  |  Total divergences: 1
--------------------------------------------------------
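The fix suggested in the sample report can be illustrated with a small sketch. It assumes a test that waits for a background worker; the names `start_worker` and `wait_for_session` are hypothetical and not part of FLAKEMARK. Instead of sleeping for a guessed duration, the test blocks on a `threading.Event` with an upper bound.

```python
import threading


def start_worker(ready: threading.Event, result: dict) -> threading.Thread:
    """Start a background worker that signals `ready` once its result is stored."""

    def work() -> None:
        result["session"] = "active"  # the state the test wants to observe
        ready.set()                   # signal completion explicitly

    thread = threading.Thread(target=work)
    thread.start()
    return thread


def wait_for_session(timeout: float = 2.0) -> str:
    """Wait for the worker without a hardcoded sleep."""
    ready = threading.Event()
    result: dict = {}
    thread = start_worker(ready, result)
    # Block until the worker signals, or fail loudly after `timeout` seconds.
    if not ready.wait(timeout=timeout):
        raise TimeoutError("worker did not signal readiness in time")
    thread.join()
    return result["session"]


print(wait_for_session())
```

The wait returns as soon as the worker finishes, so a fast run is not slowed down and a slow run fails with a clear timeout rather than a stale assertion.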

How FLAKEMARK Works

FLAKEMARK instruments your test at the AST level, runs it twice simultaneously, records every operation in both runs, then finds the exact line where the two executions diverged. That divergence is your bug.

Your test
  ├── Run 1 (instrumented) → ExecutionTrace A  [op, op, op ...]
  └── Run 2 (instrumented) → ExecutionTrace B  [op, op, op ...]
                                                     ↓
                                       DifferentialAnalyser
                                       Two-pointer trace walk
                                                     ↓
                                   "Line 47: 124x timing difference"

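To make "instruments your test at the AST level" concrete, here is a minimal sketch using the standard `ast` module. It is an illustration under assumptions, not FLAKEMARK's actual `TraceInserter`: the class `LineRecorder`, the hook name `_record`, and the helper `trace_lines` are hypothetical, and only line numbers are recorded (no values, threads, or timings).

```python
import ast


class LineRecorder(ast.NodeTransformer):
    """Insert a `_record(<lineno>)` call before each top-level statement
    of every function body (nested blocks are left uninstrumented)."""

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        self.generic_visit(node)  # also handle nested function definitions
        new_body = []
        for stmt in node.body:
            hook = ast.Expr(
                value=ast.Call(
                    func=ast.Name(id="_record", ctx=ast.Load()),
                    args=[ast.Constant(value=stmt.lineno)],
                    keywords=[],
                )
            )
            new_body.extend([hook, stmt])
        node.body = new_body
        return node


def trace_lines(source: str, func_name: str) -> list[int]:
    """Instrument `source`, call `func_name`, and return the executed line numbers."""
    tree = LineRecorder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)  # give the inserted nodes line/column info
    trace: list[int] = []
    namespace = {"_record": trace.append}
    exec(compile(tree, "<instrumented>", "exec"), namespace)
    namespace[func_name]()
    return trace


SOURCE = """
def test_sum():
    a = 1
    b = 2
    assert a + b == 3
"""

print(trace_lines(SOURCE, "test_sum"))  # → [3, 4, 5]
```

Two such traces, one from a passing run and one from a failing run, are what a differential analyser would then compare.
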
Install

pip install pytest-flakemark

Requires Python 3.10+. The only external dependency is pytest.


Usage — 4 Ways

1. Source string

from flakemark import FlakeMark

source = """
import random
def test_flaky():
    result = random.randint(0, 1)
    assert result == 1
"""

report = FlakeMark.diagnose_source(source, "test_flaky", runs=10)
print(report)
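
The sample test above fails roughly half the time because `random.randint(0, 1)` is unseeded. Across 10 independent runs, the chance that every run gives the same outcome is 2 × 0.5^10, about 0.2%, so a mix of passes and failures is very likely. One common remedy (a general testing practice, not something this README prescribes) is to inject a seeded generator. The names `pick_flag` and `test_deterministic` below are hypothetical.

```python
import random


def pick_flag(rng: random.Random) -> int:
    """Return 0 or 1 using the generator supplied by the caller."""
    return rng.randint(0, 1)


def test_deterministic() -> None:
    # Two generators with the same seed produce the same sequence,
    # so the outcome no longer varies between runs.
    first = pick_flag(random.Random(1234))
    second = pick_flag(random.Random(1234))
    assert first == second
    assert first in (0, 1)


test_deterministic()
```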

2. Real test file

from flakemark import FlakeMark

report = FlakeMark.diagnose_file(
    filepath       = "tests/test_api.py",
    test_func_name = "test_user_session",
    runs           = 6,
    project_root   = "/path/to/your/project",
)
print(report)

3. Batch scan entire folder

from flakemark import FlakeMark

results = FlakeMark.diagnose_batch("tests/", runs=4)

flaky = {k: v for k, v in results.items() if v.is_found()}
print(f"FLAKEMARK found {len(flaky)} flaky tests:\n")
for name, report in flaky.items():
    print(f"  {name}")
    print(f"  Line {report.primary.line}: {report.primary.divergence_type.value}")
    print(f"  Fix: {report.primary.fix[:60]}")

4. pytest CLI (after pip install)

pytest --flakemark-diagnose tests/
pytest --flakemark-diagnose --flakemark-runs=8 tests/test_api.py
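
For readers curious how flags like these reach pytest, the sketch below shows the standard `pytest_addoption` hook registering two options with the same names. This is an assumption about how such a plugin could be wired, not the contents of `pytest_plugin.py`; the group description and help strings are invented for illustration, and the default of 4 runs is taken from the Parameters table.

```python
def pytest_addoption(parser):
    """Register FLAKEMARK-style command-line options with pytest (illustrative)."""
    group = parser.getgroup("flakemark", "flaky test diagnosis")
    group.addoption(
        "--flakemark-diagnose",
        action="store_true",
        default=False,
        help="Diagnose flaky tests instead of only reporting pass/fail.",
    )
    group.addoption(
        "--flakemark-runs",
        type=int,
        default=4,  # default taken from the Parameters table in this README
        help="Number of times each test is run during diagnosis.",
    )
```

Inside other hooks, the parsed values would be available through `config.getoption("--flakemark-runs")`.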

What FLAKEMARK Detects

Type               What it means                  Root cause
value_mismatch     Same line, different value     random, shared state
timing_delta       Same op, 3x+ slower            time.sleep(), race condition
thread_race        Same op, different thread      Missing Lock()
sequence_break     Different execution path       Test order dependency
missing_event      One run skipped an operation   Conditional on external state
early_termination  One run ended much sooner      Timeout, unhandled exception
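
The categories above can be illustrated with a simplified two-pointer walk over two recorded traces. This is a sketch under assumptions, not the package's `DifferentialAnalyser`: the `Op` record, the `walk` function, and the one-step resynchronisation rule are hypothetical, while the 3x timing threshold comes from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    line: int      # source line that produced the operation
    value: str     # repr of the observed value
    thread: str    # name of the executing thread
    ms: float      # duration in milliseconds


def walk(a: list[Op], b: list[Op], slow_factor: float = 3.0) -> list[tuple[str, int]]:
    """Walk both traces with one pointer each and collect (type, line) divergences."""
    found: list[tuple[str, int]] = []
    i = j = 0
    while i < len(a) and j < len(b):
        x, y = a[i], b[j]
        if x.line != y.line:
            # If skipping one op in either trace realigns them, one run missed an event.
            if i + 1 < len(a) and a[i + 1].line == y.line:
                found.append(("missing_event", x.line))
                i += 1
                continue
            if j + 1 < len(b) and b[j + 1].line == x.line:
                found.append(("missing_event", y.line))
                j += 1
                continue
            found.append(("sequence_break", x.line))
            break
        if x.value != y.value:
            found.append(("value_mismatch", x.line))
        elif x.thread != y.thread:
            found.append(("thread_race", x.line))
        else:
            fast, slow = sorted((x.ms, y.ms))
            if fast > 0 and slow / fast >= slow_factor:
                found.append(("timing_delta", x.line))
        i += 1
        j += 1
    else:
        # Loop ended without a break: leftover ops mean one run stopped early.
        if i < len(a):
            found.append(("early_termination", a[i].line))
        elif j < len(b):
            found.append(("early_termination", b[j].line))
    return found


passing = [Op(46, "ok", "MainThread", 1.0), Op(47, "ok", "MainThread", 1.2)]
failing = [Op(46, "ok", "MainThread", 1.1), Op(47, "ok", "MainThread", 148.3)]
print(walk(passing, failing))  # → [('timing_delta', 47)]
```

With the timings from the sample report (1.2 ms vs 148.3 ms), line 47 exceeds the 3x threshold and is reported as a `timing_delta`.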

Parameters

Parameter     Default      Meaning
runs          4            Times to run. Use 10+ for low-frequency flakes
timeout       30           Seconds before a run is killed
project_root  os.getcwd()  Project root so imports work

Accuracy

python tests/test_accuracy.py
Score: 15/15 = 100%
  ✓  VALUE_MISMATCH detected   (78% confidence)
  ✓  TIMING_DELTA detected     (85% confidence)
  ✓  THREAD_RACE detected      (93% confidence)
  ✓  EARLY_TERMINATION detected (55% confidence)
  ✓  No false positive on stable test (0%)
  ✓  Async test handled correctly

Comparison to Other Tools

Feature                      FLAKEMARK  FlakyGuard  pytest-randomly  CANNIER  Divergent
Finds exact root cause line  Yes        No          No               No       Yes (JS only)
Python / pytest              Yes        No (Java)   Yes              Yes      No (JS)
AST instrumentation          Yes        No          No               No       Partial
Subprocess isolation         Yes        No          No               No       No
Async test support           Yes        No          Yes              No       No
Zero dependencies            Yes        No          No               No       No

Project Structure

flakemark/
├── flakemark/
│   ├── __init__.py              FlakeMark, DivergenceReport, DivergenceType
│   ├── engine.py                FlakeMark class — diagnose_file/source/batch
│   ├── pytest_plugin.py         pytest plugin — --flakemark-diagnose flag
│   ├── tracer/
│   │   └── instrument.py        TraceInserter + run_instrumented()   ← CORE IP
│   └── differ/
│       └── divergence.py        DifferentialAnalyser + DivergenceType ← CORE IP
├── tests/
│   └── test_accuracy.py         Accuracy test suite
├── pyproject.toml               PyPI config
├── setup.py                     PyPI config
└── README.md

What to Copyright

File                  Symbol                        What is original
tracer/instrument.py  TraceInserter                 AST NodeTransformer for execution recording
tracer/instrument.py  run_instrumented()            Subprocess isolation runner with JSON protocol
differ/divergence.py  DifferentialAnalyser._walk()  Two-pointer differential trace walk
differ/divergence.py  DivergenceType                Six-category flakiness classification schema
differ/divergence.py  _score()                      Confidence scoring formula
engine.py             FlakeMark._collect_traces()   Concurrent dual-run orchestration

License

MIT License — Copyright (c) 2026 Khushdeep Sharma. All rights reserved.

See LICENSE for details.


FLAKEMARK — Find the line. Fix the test.

Download files

Source Distribution

pytest_flakemark-1.1.0.tar.gz (17.2 kB view details)

Uploaded Source

Built Distribution

pytest_flakemark-1.1.0-py3-none-any.whl (15.9 kB view details)

Uploaded Python 3

File details

Details for the file pytest_flakemark-1.1.0.tar.gz.

File metadata

  • Download URL: pytest_flakemark-1.1.0.tar.gz
  • Size: 17.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for pytest_flakemark-1.1.0.tar.gz
Algorithm Hash digest
SHA256 d5e7efb26f6382f0ecef388c557093266516b77d7832e873b0f840f0218ce5cd
MD5 0a0e8c1a37ce0b397c8dc212c3d095e7
BLAKE2b-256 5447b9a37c1a04f6bb2713346be7b1474ec36352a070f6aa9e512128ed805948

File details

Details for the file pytest_flakemark-1.1.0-py3-none-any.whl.

File hashes

Hashes for pytest_flakemark-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 697ebfcf6b99d7251da4027a05f913f62d6c9af1b48fb333f4eb7ae094611875
MD5 a0b8dc52ff71f1ca1395c69da8fe8bce
BLAKE2b-256 5f9cfdbe9d10ea04d4a00b3a47926f02cd781422253169e4a10fed220cc60718
