A heuristic-based zero-overhead thread race condition detector for Python.

Raceguard

Detect real data races in your code before they become production bugs.

Raceguard is a runtime concurrency safety tool that observes your program execution and flags unsafe memory access patterns across threads and async tasks, without requiring compiler support or complex setup.


The Problem

Concurrency bugs are some of the hardest issues to detect and fix.

They are:

  • Non-deterministic: Bugs appear randomly and are hard to pin down.
  • Invisible: They often hide until high-traffic production environments.
  • Corrupting: They cause silent data corruption that is painful to debug.

Most developers only discover race conditions after something breaks. Existing tools are often too complex, slow, or invasive for everyday workflows.


What Raceguard Does

Raceguard watches your shared objects as they are accessed and detects:

  • Concurrent writes to the same memory space.
  • Read/Write conflicts across threads or async flows.
  • Unsafe shared state access without proper synchronization.

It surfaces these issues immediately with clear, actionable output.


Quick Example

Problematic code

import threading

# A shared list that multiple threads will update
counter = []

def increment():
    for _ in range(1000):
        counter.append(1)

threads = [threading.Thread(target=increment) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

Protected with Raceguard

from raceguard import protect, locked

# Just wrap your shared object
counter = protect([])

def increment():
    for _ in range(1000):
        # Access safely via context manager
        with locked(counter):
            counter.append(1)

# ... rest of the code ...

If you forget the with locked(counter): block, Raceguard raises a RaceConditionError with a full report as soon as it detects conflicting access (in the default raise mode).


Why Raceguard Is Different

Raceguard is designed for real developer workflows, not just theory.

  • High Performance: Uses lazy frame capture, avoiding expensive stack inspection overhead until absolutely necessary.
  • Flexible Detection: Native support for raise, warn, and log modes to fit your testing strategy.
  • Zero Production Overhead: Set RACEGUARD_ENABLED=0 to completely bypass the proxy in live environments.
  • Async-Aware: Seamlessly tracks races between mixed asyncio tasks and standard threads.
  • Transactional Consistency: Uses AtomicGroup to enforce logic invariants across multiple objects, preventing "Semantic Races."
  • Deep Protection: Automatically proxies nested mutable structures, including full interception of Python's dunder methods and context managers.
  • Rich Reports: Tells you exactly which threads accessed the object, at what time, and where to fix it.

How It Works (Simple Mental Model)

Think of Raceguard as a Synchronization Observer.

  1. Wrap: You wrap a shared object with protect().
  2. Track: It records the identity of every thread or task that touches the object.
  3. Validate: It checks if a lock is held when the same memory is accessed.
  4. Report: If two threads touch the same data without a lock within the race window (10 ms by default; strict mode ignores timing), it flags the conflict.
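
The four steps above can be sketched with the standard library alone. This is a minimal illustration, not Raceguard's implementation: the names TinyGuard and RaceDetected are hypothetical, only append() is intercepted, and the bookkeeping itself is not made thread-safe.

```python
import threading
import time


class RaceDetected(Exception):
    """Raised by this sketch when two threads write without a lock."""


class TinyGuard:
    """Hypothetical stand-in for a protected list (not Raceguard's code)."""

    def __init__(self, target, window=0.01):
        self._target = target            # 1. Wrap
        self._window = window
        self._lock = threading.Lock()
        self._lock_owner = None          # thread id that currently holds the lock
        self._last = None                # (thread id, timestamp) of the last write

    def __enter__(self):
        self._lock.acquire()
        self._lock_owner = threading.get_ident()
        return self

    def __exit__(self, *exc):
        self._lock_owner = None
        self._lock.release()

    def append(self, item):
        me, now = threading.get_ident(), time.monotonic()   # 2. Track
        holds_lock = self._lock_owner == me                 # 3. Validate
        prev, self._last = self._last, (me, now)
        if (
            not holds_lock
            and prev is not None
            and prev[0] != me
            and now - prev[1] < self._window
        ):
            # 4. Report
            raise RaceDetected(f"thread {me} wrote right after thread {prev[0]}")
        self._target.append(item)
```

Writes made inside `with guard:` pass the check because the writing thread is the recorded lock owner; a lockless write from a second thread inside the window raises.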

Installation

pip install raceguard

Deployment & Usage

Typical usage patterns:

  • Development — Run with configure(mode="raise") (or "warn", "log") to catch the obvious cases fast with immediate feedback during local testing.
  • Continuous Integration — Use configure(strict=True) in CI for correctness assertions. Heuristic mode (race_window) depends on timing, which varies under CPU load. Strict mode is the right tool for CI: it flags any lockless write from a different thread, regardless of elapsed time.
  • Production — Set RACEGUARD_ENABLED=0 for a true zero-cost passthrough of your original objects.

Heuristic vs. Strict — the key distinction: The default race_window of 10ms catches overlapping accesses quickly, but in a highly loaded system two logically racy writes could be far apart in wall time and slip through. Strict mode removes this ambiguity entirely — if no lock was used, it's a race.
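
As a rough sketch of that distinction, the decision for two lockless writes from different threads can be written as a single function. The helper name is_flagged is hypothetical and not part of Raceguard's API; it only mirrors the behavior described above.

```python
def is_flagged(elapsed_seconds, *, strict=False, race_window=0.01):
    """Decide whether two lockless writes from different threads count as a race.

    Hypothetical helper for illustration; not part of Raceguard's API.
    """
    if strict:
        return True                          # no lock was used, so timing is irrelevant
    return elapsed_seconds < race_window     # heuristic: only near-simultaneous writes


print(is_flagged(0.003))               # True: 3 ms apart, inside the 10 ms window
print(is_flagged(0.5))                 # False: slips through the heuristic
print(is_flagged(0.5, strict=True))    # True: strict mode ignores elapsed time
```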


Usage Patterns

import threading
from raceguard import protect, with_lock, locked

# 1. Protect a shared mutable object
shared_list = protect([])

# 2. Access unsafely (raises RaceConditionError if a race is detected)
def unsafe_worker():
    shared_list.append(1)

# 3. Access Safely via Context Manager
def safe_worker_ctx():
    with locked(shared_list):
        shared_list.append(1)

# 4. Access Safely via Decorator
@with_lock(shared_list)
def safe_worker_dec():
    shared_list.append(1)

# 5. Lock multiple proxies atomically (consistent ordering prevents deadlocks)
a = protect([])
b = protect({})
with locked(a, b):
    a.append(1)
    b["x"] = 1

# 6. Group objects for transactional safety (Automatic semantic race detection)
from raceguard import AtomicGroup
group = AtomicGroup(a, b)

# This is safe
with locked(group):
    a.append(2)

# This triggers a RaceConditionError if another thread holds the group lock
# even if the individual lock for 'a' is free!
_ = a[0]
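
Pattern 5 mentions that consistent ordering prevents deadlocks. The general technique can be sketched with plain threading locks: always acquire in one global order, whatever order the caller names them in. The helper locked_all and the choice of id() as the sort key are assumptions for illustration, not a description of how Raceguard orders its locks.

```python
import threading
from contextlib import ExitStack, contextmanager


@contextmanager
def locked_all(*locks):
    """Acquire several locks in one global order (here: by id) and release them together."""
    with ExitStack() as stack:
        for lock in sorted(set(locks), key=id):
            stack.enter_context(lock)
        yield


lock_a, lock_b = threading.Lock(), threading.Lock()
results = []


def worker(first, second):
    for _ in range(1000):
        with locked_all(first, second):
            results.append(1)


t1 = threading.Thread(target=worker, args=(lock_a, lock_b))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a))  # opposite order on purpose
t1.start()
t2.start()
t1.join()
t2.join()
print(len(results))  # 2000
```

Without the sort, the two workers would take the locks in opposite orders and could each end up holding one lock while waiting for the other.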

Supported Object Types

Raceguard can wrap any mutable Python object:

from raceguard import protect, Value

protect([])            # list
protect({})            # dict
protect(set())         # set
protect(bytearray())   # bytearray
protect(MyClass())     # any custom object (MyClass is a placeholder for your own class)
protect(Value(0))      # scalar via Value wrapper

protect() is idempotent

Wrapping an already-protected object returns the same proxy — no double-wrapping:

p1 = protect(my_list)
p2 = protect(p1)   # same proxy as p1
assert p1 is p2    # True
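
One common way to get this behavior is an isinstance check at the top of the wrapping function. The sketch below uses hypothetical names (Proxy, wrap) and makes no claim about Raceguard's internals.

```python
class Proxy:
    """Hypothetical minimal proxy used only for this illustration."""

    def __init__(self, target):
        self._target = target


def wrap(obj):
    """Return obj unchanged if it is already a Proxy, otherwise wrap it once."""
    if isinstance(obj, Proxy):
        return obj
    return Proxy(obj)


p1 = wrap([1, 2, 3])
p2 = wrap(p1)
assert p1 is p2   # the second call did not add another layer
```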

Concurrent Reads Are Safe

Two threads reading simultaneously do not trigger a race. Only write/write or read/write conflicts are flagged:

shared = protect({"val": 42})

# Both threads reading at the same time — no RaceConditionError
def reader():
    _ = shared["val"]
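
The rule reduces to a small predicate over the two access kinds. The function name conflicts is hypothetical; the sketch assumes both accesses come from different threads and that no lock was held.

```python
def conflicts(first, second):
    """Two lockless accesses from different threads conflict only if one is a write."""
    return "write" in (first, second)


print(conflicts("read", "read"))    # False: concurrent reads are safe
print(conflicts("read", "write"))   # True
print(conflicts("write", "write"))  # True
```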

Advanced Features

Automatic Nested Protection

Raceguard automatically protects child objects. You don't need to manually wrap every nested dictionary or list in your state tree.

from raceguard import protect

# Wrap the parent object once
state = protect({"users": ["Alice", "Bob"]})

# The child list is automatically protected when accessed!
state["users"].append("Charlie")
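
To make the idea concrete, here is a minimal sketch of lazy child wrapping: the parent proxy wraps a mutable child the first time it is accessed and reuses that wrapper afterwards. NestedProxy is a hypothetical class, it assumes hashable keys, and it is not Raceguard's implementation.

```python
class NestedProxy:
    """Hypothetical proxy that wraps mutable children when they are accessed."""

    _MUTABLE = (list, dict, set, bytearray)

    def __init__(self, target):
        self._target = target
        self._children = {}   # key -> proxy for the child stored under that key

    def __getitem__(self, key):
        child = self._target[key]
        if not isinstance(child, self._MUTABLE):
            return child                      # immutable values pass through as-is
        cached = self._children.get(key)
        if cached is None or cached._target is not child:
            cached = self._children[key] = NestedProxy(child)
        return cached

    def append(self, item):
        self._target.append(item)


state = NestedProxy({"users": ["Alice", "Bob"]})
state["users"].append("Charlie")              # goes through the child proxy
print(state["users"] is state["users"])       # True: the same child proxy is reused
```

Reusing one proxy per child matters because any access history would live on the proxy; a fresh wrapper on every access would forget earlier accesses.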

Iterator Race Detection

Raceguard catches writes that happen while another thread is mid-iteration:

import time

from raceguard import protect

shared = protect([1, 2, 3])

def slow_reader():
    for item in shared:
        time.sleep(0.05)   # still iterating...

def writer():
    time.sleep(0.02)
    shared.append(4)       # RaceConditionError: write during iteration!
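
One way such a check could work is to remember which threads currently have a live iterator and reject writes from any other thread. The sketch below is an assumption-laden illustration with hypothetical names (IterGuardedList, IterationRace), not Raceguard's code.

```python
import threading


class IterationRace(Exception):
    """Raised by this sketch when a write arrives while another thread iterates."""


class IterGuardedList:
    """Hypothetical list wrapper that remembers which threads are mid-iteration."""

    def __init__(self, items):
        self._items = list(items)
        self._meta = threading.Lock()   # protects _iterating
        self._iterating = {}            # thread id -> number of live iterations

    def __iter__(self):
        me = threading.get_ident()
        with self._meta:
            self._iterating[me] = self._iterating.get(me, 0) + 1
        try:
            yield from list(self._items)    # iterate over a snapshot
        finally:
            # Runs when the loop finishes, breaks, or the generator is closed.
            with self._meta:
                self._iterating[me] -= 1
                if self._iterating[me] == 0:
                    del self._iterating[me]

    def append(self, item):
        me = threading.get_ident()
        with self._meta:
            others = [tid for tid in self._iterating if tid != me]
        if others:
            raise IterationRace(f"write while thread(s) {others} are iterating")
        self._items.append(item)
```

Because __iter__ is a generator, a thread is registered on its first next() call and unregistered when the loop ends or the generator is closed.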

Actionable Error Reports

When a race condition occurs, Raceguard tells you exactly what went wrong, including the specific Thread IDs and Async Task names involved.

RaceConditionError: Concurrent access detected on object <list> at 0x...
Thread-1 (ID: 12345) wrote to object at 10:05:01.001
Thread-2 (ID: 67890) accessed object at 10:05:01.003
Location: mymodule.py:42 in worker()
Missing synchronization lock during access.

Asyncio & Threading Support

Raceguard safely tracks state even in hybrid architectures where standard threads and asyncio event loops are running simultaneously and modifying the same objects.

Strict Mode — Catching Temporally Distant Unsynchronized Writes

By default, Raceguard flags accesses within a time window. With strict=True, any lockless write from a different thread is flagged, even if it happens much later:

import time

from raceguard import protect, configure, Value

configure(strict=True)
shared = protect(Value("initial"))

def thread1():
    shared.value = "written by T1"  # First write

def thread2():
    time.sleep(0.5)                 # Waits well beyond the race window...
    shared.value = "written by T2"  # Still caught! No lock was used.

Tip: In strict mode, use reset(shared) to manually clear access history when threads coordinate via a non-lock mechanism like a queue.Queue.

from raceguard import reset

def stage2():
    result = my_queue.get()   # synchronized via Queue
    reset(shared)             # tell Raceguard this is a fresh access point
    shared.value = result     # safe — no false positive

AtomicGroups — Enforcing Logical Transactions

When multiple objects must stay in sync (e.g., Account A and B), individual locks are not enough. If Thread 1 is moving money from A to B, Thread 2 should not be allowed to read either A or B until the transaction is complete.

AtomicGroup creates a shared safety boundary:

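from raceguard import protect, AtomicGroup, locked

class Account:  # stand-in class for illustration
    def __init__(self, balance):
        self.balance = balance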

acc_a = protect(Account(100))
acc_b = protect(Account(0))
bank = AtomicGroup(acc_a, acc_b)

def transfer(amount):
    with locked(bank):
        acc_a.balance -= amount
        acc_b.balance += amount

def audit():
    # Attempting to read acc_a while transfer() is running 
    # will trigger a RaceConditionError!
    total = acc_a.balance + acc_b.balance
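
The shared boundary can be pictured as one lock with a recorded owner that every member consults before each read or write. The following is a minimal sketch under that assumption; Group, Member, and GroupBusy are hypothetical names, and Raceguard's actual mechanism may differ.

```python
import threading


class GroupBusy(Exception):
    """Raised by this sketch when a member is touched during someone else's transaction."""


class Group:
    """Hypothetical group: one lock shared by every member."""

    def __init__(self):
        self._lock = threading.Lock()
        self.owner = None   # id of the thread currently inside a transaction

    def __enter__(self):
        self._lock.acquire()
        self.owner = threading.get_ident()
        return self

    def __exit__(self, *exc):
        self.owner = None
        self._lock.release()


class Member:
    """Hypothetical guarded account whose reads and writes consult the group."""

    def __init__(self, group, balance):
        self._group = group
        self._balance = balance

    def _check(self):
        owner = self._group.owner
        if owner is not None and owner != threading.get_ident():
            raise GroupBusy("another thread is inside a group transaction")

    @property
    def balance(self):
        self._check()
        return self._balance

    @balance.setter
    def balance(self, value):
        self._check()
        self._balance = value
```

With this layout, a thread inside `with group:` can update several members freely, while a read from any other thread fails fast instead of observing a half-finished transfer.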

Cross-Platform Verified

Fully supported and tested across:

  • Windows
  • Linux
  • macOS

Environment Variables

Configure Raceguard without modifying code. Useful for CI/CD pipelines and deployment scripts.

Variable          | Default | Description
RACEGUARD_ENABLED | 1       | Set to 0 to completely disable detection (zero overhead).
RACEGUARD_MODE    | raise   | Detection mode: raise, warn, or log.
RACEGUARD_STRICT  | 0       | Set to 1 to flag any unsynchronized access regardless of timing.
RACEGUARD_WINDOW  | 0.01    | Time window (seconds) within which concurrent accesses are flagged.
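
For a sense of how these variables map onto settings, the sketch below reads the four documented names with their documented defaults. The function load_settings and its parsing rules (for example, treating only "0" as disabled) are assumptions for illustration; Raceguard's own parsing may differ.

```python
import os


def load_settings(environ=os.environ):
    """Read the four documented variables, falling back to the documented defaults."""
    mode = environ.get("RACEGUARD_MODE", "raise")
    if mode not in ("raise", "warn", "log"):
        raise ValueError(f"unsupported RACEGUARD_MODE: {mode!r}")
    return {
        "enabled": environ.get("RACEGUARD_ENABLED", "1") != "0",
        "mode": mode,
        "strict": environ.get("RACEGUARD_STRICT", "0") == "1",
        "race_window": float(environ.get("RACEGUARD_WINDOW", "0.01")),
    }


# A CI job might export RACEGUARD_STRICT=1 and RACEGUARD_MODE=warn:
print(load_settings({"RACEGUARD_STRICT": "1", "RACEGUARD_MODE": "warn"}))
```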

Full configure() Reference

from raceguard import configure

configure(
    enabled=True,        # Toggle detection on/off at runtime
    mode="raise",        # "raise" | "warn" | "log"
    strict=False,        # Bypass timing heuristic, flag all unsynchronized access
    race_window=0.01,    # Seconds — the sensitivity window for detecting races
    max_warnings=1000,   # Cap collected warnings in "warn" mode to prevent flooding
)

Protecting Scalar Values

Use Value() to protect simple types like int, float, or str that cannot be proxied directly.

Value exposes three access patterns — use whichever fits your style:

from raceguard import protect, Value, locked

counter = protect(Value(0))

def worker():
    with locked(counter):
        counter.value += 1   # attribute access
        counter.set(5)       # setter method
        x = counter.get()    # getter method
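
The reason scalars need a holder is that int, float, and str are immutable: `n += 1` rebinds the name to a new object, so there is no shared object whose mutations a proxy could observe. The sketch below shows the difference with a hypothetical Box class that mimics the three access patterns; it is not Raceguard's Value.

```python
class Box:
    """Hypothetical mutable holder for an immutable scalar."""

    def __init__(self, value):
        self.value = value

    def get(self):
        return self.value

    def set(self, value):
        self.value = value


n = 0
alias = n
n += 1                      # rebinds n to a new int; alias still refers to 0
print(n, alias)             # 1 0

box = Box(0)
alias_box = box
box.value += 1              # mutates the one shared holder
print(alias_box.get())      # 1
```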

Utility Functions

from raceguard import (
    get_config,       # Returns the current configuration dict
    clear_warnings,   # Returns and clears all collected RaceConditionWarning objects
    warnings,         # Direct access to the list of collected warnings
    reset,            # Resets library state (useful between test runs)
    unbind,           # Unwraps a proxy to retrieve the raw underlying object
)

# Example: Inspect warnings after a test run
from raceguard import configure, clear_warnings

configure(mode="warn")
# ... run concurrent code ...
collected = clear_warnings()
for w in collected:
    print(w)

# Example: Get the raw object for identity checks or serialization
from raceguard import protect, unbind

data = protect({"key": "value"})
raw = unbind(data)  # Returns the original dict

Dev-Mode Overhead

In production, there are two ways to disable Raceguard. Both act as a kill switch: proxy creation is skipped and your raw object is returned directly, so no proxy overhead remains at runtime:

  1. Outside your code (Recommended): Run your app with the environment variable RACEGUARD_ENABLED=0.
  2. Inside your code: Call configure(enabled=False) at the very start of your application. (Note: This must be called before any objects are wrapped. It does not retroactively remove the proxy from objects that are already protected.)

In development mode, every attribute access on a protected object passes through the proxy layer, which performs a thread-identity check and a timestamp comparison. This is intentionally lightweight, but it is not free.

As a rough guide:

Access frequency                        | Expected impact
Occasional (locks, shared status flags) | Negligible; use freely
Moderate (per-request shared state)     | Minimal; order of microseconds per access
Tight hot loop (millions/sec)           | Measurable; consider wrapping only during test runs, not benchmarks

Lazy frame capture means stack traces are only resolved when a race is actually detected, keeping the common (no-race) path as fast as possible. If you are profiling performance of concurrent code, run with RACEGUARD_ENABLED=0 to eliminate all proxy overhead.
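
The general idea behind lazy capture can be sketched as follows: on every access, store only a code object and a line number, and build the human-readable location when a report is needed. The helper names are hypothetical, sys._getframe is CPython-specific, and this is not necessarily how Raceguard does it.

```python
import linecache
import sys


def capture_site():
    """Cheap path: remember the caller's code object and current line number."""
    frame = sys._getframe(1)            # the caller's frame (CPython-specific)
    return frame.f_code, frame.f_lineno


def describe_site(site):
    """Expensive path, run only when a race is being reported."""
    code, lineno = site
    source = linecache.getline(code.co_filename, lineno).strip()
    return f"{code.co_filename}:{lineno} in {code.co_name}() {source}".rstrip()


def worker():
    return capture_site()               # what an intercepted access would record


site = worker()
print(describe_site(site))
```

Storing the code object and line number rather than the frame itself also avoids keeping the caller's local variables alive.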


Known Limitations & Blindspots

While Raceguard is effective at hunting in-memory thread races, proxying at the Python level leaves some fundamental blind spots:

  1. Direct Memory Manipulation (Ghost Writes): Raceguard relies on Python's __setattr__ and __getattribute__ hooks. It cannot see memory changes made via:
    • C-Extensions: Libraries like numpy or lxml that write directly to C-level pointers.
    • Buffer Access: Using ctypes or mmap to modify memory addresses directly.
  2. Unprotected Semantic Invariants: While AtomicGroup helps detect races on multi-object transactions, it only works if the developer correctly groups the relevant objects. Logic races on hidden or non-proxied state (like a global internal C-level counter) remain invisible.
  3. OS External State (TOCTOU): It cannot detect races between the Python process and the Operating System. For example, a "Time-of-Check to Time-of-Use" race on the file system (checking a file exists before opening it) is outside Raceguard's scope.
  4. Inter-Process Contention: Raceguard's tracking is local to the current process. It cannot detect races between two completely separate program instances (e.g., two different scripts racing for a database record).
  5. Per-Interpreter Shared Memory (Python 3.12+): With PEP 684, multiple interpreters can have their own GIL. If they share a raw memory buffer, they can have true parallel data races that bypass the interpreter-local proxy.
  6. Intentional Observer Blindspots: To prevent recursion and "Heisenbugs," the library intentionally ignores metadata calls like repr(), str(), id(), and type().
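
To illustrate item 3, here is what a file-system TOCTOU race looks like in plain Python, together with the usual remedy of attempting the operation and handling the failure. Both function names are hypothetical; nothing here involves Raceguard, which is the point.

```python
import os


def read_checked(path):
    """Racy: another process can delete the file between the check and the open."""
    if os.path.exists(path):            # time of check
        with open(path) as fh:          # time of use
            return fh.read()
    return None


def read_direct(path):
    """Safer: skip the check, attempt the open, and handle the failure."""
    try:
        with open(path) as fh:
            return fh.read()
    except FileNotFoundError:
        return None
```

In read_checked, the file can disappear after exists() returns True, in which case open() raises despite the check.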

We recommend using Raceguard as a Heuristic Safety Net for application logic. For hardware-level or kernel-level verification, consider low-level tools like ThreadSanitizer, Helgrind, or eBPF.


Author

Developed by Chukwunwike Obodo.


License

This project is licensed under the MIT License.
