Tracks changes in nested Python structures (dicts, lists, tuples, and objects with __dict__).

Struct Changelog

What is Struct Changelog?

Struct Changelog is a Python library that tracks and records changes made to nested data structures as they happen. It provides an audit trail for modifications to dictionaries, lists, tuples, and custom objects, which is useful for debugging, data validation, and maintaining data integrity.

What does it do?

  • 🔍 Automatic Change Detection: Captures every modification (additions, edits, deletions) in your data structures
  • 📊 Detailed Audit Trail: Records what changed, where it changed, and what the old/new values were
  • 🌐 Nested Structure Support: Works seamlessly with complex nested data (dicts, lists, objects)
  • 📝 JSON Serializable: All change records can be exported to JSON for logging or persistence
  • 🔄 Multiple Usage Patterns: Choose from simple context managers to full object-oriented approaches

Why is it useful?

For Debugging & Development:

  • Track exactly what changes during complex data transformations
  • Identify unexpected modifications in your data pipeline
  • Debug data corruption issues by seeing the sequence of changes

For Data Validation & Integrity:

  • Ensure data modifications follow expected patterns
  • Validate business rules by analyzing change patterns
  • Maintain data consistency across complex operations

For Auditing & Compliance:

  • Create detailed logs of all data modifications
  • Track user actions and system changes
  • Meet regulatory requirements for data change tracking

For Testing & Quality Assurance:

  • Verify that your code modifies data as expected
  • Create comprehensive test assertions about data changes
  • Debug test failures by seeing exactly what changed

Real-world Use Cases:

  • API Development: Track changes to request/response data for debugging
  • Data Processing: Monitor transformations in ETL pipelines
  • Configuration Management: Track changes to application settings
  • User Interface: Monitor state changes in complex UI components
  • Database Operations: Track changes before committing to database
  • Machine Learning: Monitor data preprocessing and feature engineering steps

How it works

Struct Changelog uses Python's context manager protocol and object introspection to automatically detect changes:

  1. Context Manager: When you write with changelog.capture(data) as d, the manager creates a proxy object (d) that wraps your original data

  2. Change Detection: Every modification (assignment, deletion, list operations) is intercepted and recorded

  3. Deep Tracking: The system recursively tracks changes in nested structures (dicts, lists, objects)

  4. Change Recording: Each change is recorded with:

    • Action: ADDED, EDITED, or REMOVED
    • Key Path: The location of the change (e.g., "user.profile.email")
    • Old Value: The original value before the change
    • New Value: The new value after the change
    • Timestamp: When the change occurred

  5. Circular Reference Protection: Automatically handles circular references to prevent infinite loops

  6. Thread Safety: Safe to use in multi-threaded environments
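The interception idea in the steps above can be sketched in plain Python. This is not the library's implementation: RecordingDict is a hypothetical class, it handles only dicts, the old_value and timestamp field names are assumptions, and skipping assignments that leave a value unchanged is a design choice of this sketch.

```python
import time


class RecordingDict:
    """Hypothetical proxy: wraps a dict and records every mutation."""

    def __init__(self, target, log, path=""):
        self._target = target
        self._log = log
        self._path = path

    def _key_path(self, key):
        # Build a dotted path such as "user.name".
        return f"{self._path}.{key}" if self._path else str(key)

    def _record(self, action, key, old=None, new=None):
        self._log.append({
            "action": action,
            "key_path": self._key_path(key),
            "old_value": old,
            "new_value": new,
            "timestamp": time.time(),
        })

    def __getitem__(self, key):
        value = self._target[key]
        # Wrap nested dicts so deeper writes are intercepted too.
        if isinstance(value, dict):
            return RecordingDict(value, self._log, self._key_path(key))
        return value

    def __setitem__(self, key, value):
        if key in self._target:
            old = self._target[key]
            if old != value:  # this sketch ignores no-op assignments
                self._record("EDITED", key, old, value)
        else:
            self._record("ADDED", key, new=value)
        self._target[key] = value

    def __delitem__(self, key):
        self._record("REMOVED", key, old=self._target[key])
        del self._target[key]


log = []
data = {"user": {"name": "John", "age": 30}}
proxy = RecordingDict(data, log)

proxy["user"]["name"] = "Jane"               # EDITED user.name
proxy["user"]["email"] = "jane@example.com"  # ADDED user.email
del proxy["user"]["age"]                     # REMOVED user.age

for entry in log:
    print(entry["action"], entry["key_path"])
```

Writes go through to the original dict, so data reflects every change once the proxy has been used.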

Installation

pip install struct-changelog

Quick Start

Basic Usage

from struct_changelog import ChangeLogManager

# Create a changelog manager
changelog = ChangeLogManager()

# Your data
data = {"user": {"name": "John", "age": 30}}

# Track changes
with changelog.capture(data) as d:
    d["user"]["name"] = "Jane"
    d["user"]["age"] = 31
    d["user"]["email"] = "jane@example.com"

# View changes
for entry in changelog.get_entries():
    print(f"{entry['action']}: {entry['key_path']} = {entry['new_value']}")

Helper Approaches

To avoid manually creating ChangeLogManager instances, you can use these helper approaches:

1. Global Context Manager (Recommended for simple use)

from struct_changelog import track_changes

data = {"config": {"debug": False}}

# Most concise approach
with track_changes(data) as (changelog, tracked_data):
    tracked_data["config"]["debug"] = True
    tracked_data["config"]["version"] = "2.0"

print(changelog.get_entries())

2. Factory Function

from struct_changelog import create_changelog

# Factory function that returns a new changelog manager
changelog = create_changelog()
data = {"settings": {"theme": "light"}}

with changelog.capture(data) as d:
    d["settings"]["theme"] = "dark"

3. ChangeTracker Class (For stateful tracking)

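# ChangeActions is used for the manual entry below; importing it from the
# package root is an assumption
from struct_changelog import ChangeActions, ChangeTracker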

# Object-oriented approach - useful for maintaining state
tracker = ChangeTracker()

data = {"session": {"user_id": 123}}

# Track changes
with tracker.track(data) as d:
    d["session"]["user_id"] = 456
    d["session"]["active"] = True

# Access entries
print(tracker.entries)

# Add manual entries
tracker.add(ChangeActions.ADDED, "session.notes", new_value="User logged in")

# Reset when needed
tracker.reset()

Features

  • 🔍 Automatic Change Detection: Captures ADDED, EDITED, and REMOVED changes
  • 🌐 Nested Structure Support: Works with dicts, lists, tuples, and custom objects
  • 📝 JSON Serializable: All entries can be serialized to JSON
  • 🔄 Multiple Usage Patterns: Choose the approach that fits your needs
  • 🧵 Thread Safe: Safe to use in multi-threaded environments
  • 📦 Zero Dependencies: Pure Python implementation
  • 🛡️ Circular Reference Protection: Handles complex data structures safely
  • ⚡ High Performance: Minimal overhead, optimized for production use
  • 🔧 Flexible API: Multiple ways to use the library based on your needs
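As an illustration of the JSON export mentioned above, here is a minimal sketch using only the standard library. The entries list is hand-written sample data, not output from the library, and the old_value and timestamp field names are assumptions. The default=str fallback is one way to handle values that json cannot encode natively, such as datetime objects.

```python
import json
from datetime import datetime, timezone

# Hand-written sample entries shaped like the change records described above.
entries = [
    {
        "action": "EDITED",
        "key_path": "user.name",
        "old_value": "John",
        "new_value": "Jane",
        "timestamp": datetime(2024, 1, 16, 10, 30, tzinfo=timezone.utc),
    },
    {
        "action": "ADDED",
        "key_path": "user.email",
        "old_value": None,
        "new_value": "jane@example.com",
        "timestamp": datetime(2024, 1, 16, 10, 31, tzinfo=timezone.utc),
    },
]


def entries_to_json(records, **kwargs):
    """Serialize change records; non-JSON values fall back to str()."""
    return json.dumps(records, default=str, **kwargs)


payload = entries_to_json(entries, indent=2)
print(payload)

# Round trip: the payload parses back into plain dicts and strings.
restored = json.loads(payload)
```

Note that the fallback is lossy: timestamps come back as strings, so a consumer that needs datetime objects has to parse them again.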

Change Types

  • ADDED: New items added to the structure
  • EDITED: Existing items modified
  • REMOVED: Items removed from the structure

Examples

Example 1: API Request/Response Tracking

from struct_changelog import track_changes

# Track changes to API request data
request_data = {
    "user": {"id": 123, "name": "John"},
    "settings": {"theme": "light", "notifications": True}
}

with track_changes(request_data) as (changelog, data):
    # Simulate API processing
    data["user"]["name"] = "Jane"
    data["user"]["email"] = "jane@example.com"
    data["settings"]["theme"] = "dark"
    data["settings"]["language"] = "fr"
    data["timestamp"] = "2024-01-16T10:30:00Z"

# Log all changes for debugging
for entry in changelog.get_entries():
    print(f"API Change: {entry['action']} {entry['key_path']} = {entry['new_value']}")

Example 2: Data Pipeline Monitoring

from struct_changelog import ChangeTracker

# Track data transformations in ETL pipeline
tracker = ChangeTracker()
raw_data = {"users": [], "metadata": {"source": "csv"}}

with tracker.track(raw_data) as data:
    # Data cleaning
    data["users"] = [
        {"id": 1, "name": "John", "email": "john@example.com"},
        {"id": 2, "name": "Jane", "email": "jane@example.com"}
    ]
    
    # Data enrichment
    for user in data["users"]:
        user["status"] = "active"
        user["created_at"] = "2024-01-16"
    
    # Metadata updates
    data["metadata"]["processed_at"] = "2024-01-16T10:30:00Z"
    data["metadata"]["record_count"] = len(data["users"])

# Export changes for audit
print(tracker.to_json(indent=2))

Example 3: Configuration Management

from struct_changelog import create_changelog

# Track configuration changes
config = {
    "database": {"host": "localhost", "port": 5432},
    "cache": {"enabled": True, "ttl": 3600},
    "features": {"new_ui": False}
}

changelog = create_changelog()

with changelog.capture(config) as cfg:
    # Environment-specific changes
    cfg["database"]["host"] = "prod-db.example.com"
    cfg["database"]["port"] = 5432
    cfg["cache"]["ttl"] = 7200
    cfg["features"]["new_ui"] = True
    cfg["features"]["beta_features"] = True

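# Validate changes
# Five assignments are made above, but the port is set to its existing value,
# so four entries are expected (assuming no-op assignments are not recorded)
changes = changelog.get_entries()
assert len(changes) == 4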
assert any(entry["key_path"] == "features.new_ui" for entry in changes)

Example 4: Complex Object Tracking

from struct_changelog import track_changes

class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age
        self.preferences = {}
        self.tags = []

# Track changes to custom objects
user = User("John", 30)

with track_changes(user) as (changelog, tracked_user):
    tracked_user.name = "Jane"
    tracked_user.age = 31
    tracked_user.preferences["theme"] = "dark"
    tracked_user.preferences["language"] = "fr"
    tracked_user.tags.append("premium")
    tracked_user.tags.append("verified")

# All changes are tracked
for entry in changelog.get_entries():
    print(f"User change: {entry['action']} {entry['key_path']}")

See the examples/ directory for comprehensive usage examples:

  • basic_usage.py - Basic dictionary tracking
  • nested_structures.py - Complex nested structures
  • lists_arrays.py - List and array modifications
  • objects.py - Custom object tracking
  • manual_tracking.py - Manual entry addition
  • helper_approaches.py - All helper approaches compared

API Reference

ChangeLogManager

The core class for tracking changes.

changelog = ChangeLogManager()
with changelog.capture(data) as tracked_data:
    # Modify tracked_data
    pass

Helper Functions

  • create_changelog() - Factory function for creating managers
  • track_changes(data) - Context manager that creates and manages a changelog
  • ChangeTracker - Wrapper class for object-oriented usage
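How the helpers relate to the core manager can be sketched with contextlib. Everything here is hypothetical: MiniChangelog is a simplified stand-in that only compares top-level keys on exit, and the _sketch functions merely mirror the shape of create_changelog() and track_changes(data). The library's actual internals may differ.

```python
from contextlib import contextmanager


class MiniChangelog:
    """Hypothetical stand-in for a changelog manager (top-level keys only)."""

    def __init__(self):
        self._entries = []

    def get_entries(self):
        return list(self._entries)

    @contextmanager
    def capture(self, data):
        before = dict(data)  # shallow snapshot taken on entry
        try:
            yield data
        finally:
            # Diff on exit; a real proxy would record as changes happen.
            for key in sorted(before.keys() | data.keys()):
                if key not in before:
                    action = "ADDED"
                elif key not in data:
                    action = "REMOVED"
                elif before[key] != data[key]:
                    action = "EDITED"
                else:
                    continue
                self._entries.append({
                    "action": action,
                    "key_path": key,
                    "new_value": data.get(key),
                })


def create_changelog_sketch():
    """Factory: returns a fresh, independent manager."""
    return MiniChangelog()


@contextmanager
def track_changes_sketch(data):
    """Create a manager, capture, and hand both to the caller."""
    changelog = create_changelog_sketch()
    with changelog.capture(data) as tracked:
        yield changelog, tracked


settings = {"theme": "light", "debug": False}
with track_changes_sketch(settings) as (changelog, tracked):
    tracked["theme"] = "dark"
    tracked["version"] = "2.0"

entries = changelog.get_entries()
print(entries)
```

Each call builds its own manager, so the convenience wrapper does not introduce any shared state.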

Why Choose Struct Changelog?

Compared to Manual Logging

  • Automatic: No need to manually log every change
  • Comprehensive: Captures all changes, including nested modifications
  • Consistent: Standardized format for all change records
  • Less error-prone: Removes the risk of forgetting to log a change

Compared to Database Triggers

  • Database Agnostic: Works on in-memory Python data structures, independent of any database engine
  • No Database Required: Works in memory, perfect for testing
  • Flexible: Can track changes before they reach the database
  • Lightweight: No external dependencies or setup required

Compared to Version Control Systems

  • Granular: Tracks individual field changes, not just file changes
  • Real-time: Captures changes as they happen
  • In-memory: Works with runtime data, not just files
  • Structured: Provides structured data about changes

Why Not Use a Global Singleton?

While a global singleton might seem convenient, it has several drawbacks:

  • Shared State: All users share the same changelog state
  • Testing Issues: Tests can interfere with each other
  • Thread Safety: Requires careful synchronization
  • Coupling: Makes code harder to maintain and test

The helper approaches provide convenience without these issues.
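The shared-state problem is easy to reproduce. In this sketch, SketchTracker and both handler functions are hypothetical and stand in for any change-recording object. The first handler writes to a module-level singleton, the second creates its own tracker per call.

```python
class SketchTracker:
    """Hypothetical minimal tracker: just a list of recorded changes."""

    def __init__(self):
        self.entries = []

    def add(self, action, key_path):
        self.entries.append((action, key_path))


_shared = SketchTracker()  # module-level singleton


def handle_request_shared(user):
    # Every caller appends to the same list.
    _shared.add("EDITED", f"{user}.name")
    return list(_shared.entries)


def handle_request_isolated(user):
    # Each call owns its tracker, so entries never leak between callers.
    tracker = SketchTracker()
    tracker.add("EDITED", f"{user}.name")
    return tracker.entries


first_shared = handle_request_shared("alice")
second_shared = handle_request_shared("bob")
first_isolated = handle_request_isolated("alice")
second_isolated = handle_request_isolated("bob")

print(second_shared)    # contains alice's change as well as bob's
print(second_isolated)  # contains only bob's change
```

The same leakage is what makes tests interfere with each other: a test that runs second inherits whatever the first one recorded unless the singleton is reset.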

License

MIT License - see LICENCE file for details.
