Imitator

PyPI version Python 3.8+ License: MIT

A lightweight Python framework for monitoring and imitating function behavior with automatic I/O tracking and pattern learning. It is suited to collecting training data for machine learning models, debugging, performance analysis, and understanding how functions behave in production systems. Behavior imitation is a planned future capability.

✨ Features

  • 🎯 Simple Decorator: Just add @monitor_function to any function
  • 📊 Type Validation: Uses Pydantic models for robust type handling
  • 💾 Flexible Storage: Local JSON/JSONL files with configurable backends
  • Performance Monitoring: Tracks execution times and performance metrics
  • 🚨 Error Handling: Captures and logs exceptions with full context
  • 🔄 Async Support: Full support for asynchronous functions
  • 📈 Sampling & Rate Limiting: Control overhead with smart sampling
  • 🏗️ Class Method Support: Monitor class methods with proper handling
  • 🔍 Modification Detection: Detect in-place parameter modifications
  • 🪶 Minimal Dependencies: Only requires Pydantic (≥2.0.0)

🚀 Installation

pip install imitator

Requirements: Python 3.8+, Pydantic ≥2.0.0

⚡ Quick Start

from imitator import monitor_function

@monitor_function
def add_numbers(a: int, b: int) -> int:
    return a + b

# Use the function normally
result = add_numbers(5, 3)  # Automatically logged!

That's it! Your function calls are now being monitored and logged automatically.
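
For readers curious about what a monitoring decorator does conceptually, the following is a minimal standard-library sketch of the general technique. It is not Imitator's implementation: `record_calls` and `CALL_LOG` are hypothetical names used only for illustration, and records are kept in memory rather than written to storage.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory log; a real monitor would write to a storage backend.
CALL_LOG: List[Dict[str, Any]] = []


def record_calls(func: Callable) -> Callable:
    """Illustrative decorator: record inputs, output, and duration of each call."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000
        CALL_LOG.append({
            "function": func.__name__,
            "args": args,
            "kwargs": kwargs,
            "output": result,
            "execution_time_ms": elapsed_ms,
        })
        return result

    return wrapper


@record_calls
def add_numbers(a: int, b: int) -> int:
    return a + b


add_numbers(5, 3)
print(CALL_LOG[0]["output"])  # prints 8
```

The caller sees the same return value as before; the only visible difference is the extra record appended on each call.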

📖 Usage Examples

Basic Function Monitoring

from imitator import monitor_function
from typing import List, Dict

@monitor_function
def process_data(data: List[float], multiplier: float = 1.0) -> Dict[str, float]:
    """Process a list of numbers and return statistics"""
    if not data:
        return {"mean": 0.0, "sum": 0.0, "count": 0}
    
    total = sum(x * multiplier for x in data)
    mean = total / len(data)
    
    return {
        "mean": mean,
        "sum": total,
        "count": len(data)
    }

# Function calls are automatically logged
result = process_data([1.0, 2.0, 3.0], 2.0)

Advanced Configuration

from imitator import monitor_function, LocalStorage, FunctionMonitor

# Custom storage location
custom_storage = LocalStorage(log_dir="my_logs", format="json")

@monitor_function(storage=custom_storage)
def my_function(x: int) -> int:
    return x * 2

# Rate limiting and sampling for high-frequency functions
monitor = FunctionMonitor(
    sampling_rate=0.1,  # Log 10% of calls
    max_calls_per_minute=100  # Max 100 calls per minute
)

@monitor.monitor
def high_frequency_function(x: int) -> int:
    return x ** 2
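
How sampling and a per-minute cap can combine is easier to see in code. The sketch below is an assumption about the general approach, not a description of how `FunctionMonitor` works internally; `CallGate` and its `clock` and `rng` parameters are hypothetical and exist only to make the logic testable.

```python
import random
import time
from collections import deque
from typing import Callable, Deque


class CallGate:
    """Hypothetical helper: decide whether a given call should be logged."""

    def __init__(
        self,
        sampling_rate: float = 1.0,
        max_calls_per_minute: int = 100,
        clock: Callable[[], float] = time.monotonic,
        rng: Callable[[], float] = random.random,
    ) -> None:
        self.sampling_rate = sampling_rate
        self.max_calls_per_minute = max_calls_per_minute
        self._clock = clock
        self._rng = rng
        self._recent: Deque[float] = deque()  # timestamps of logged calls

    def should_log(self) -> bool:
        # Sampling: keep roughly `sampling_rate` of all calls.
        if self._rng() >= self.sampling_rate:
            return False
        # Rate limiting: forget timestamps older than 60 seconds, then check the cap.
        now = self._clock()
        while self._recent and now - self._recent[0] >= 60.0:
            self._recent.popleft()
        if len(self._recent) >= self.max_calls_per_minute:
            return False
        self._recent.append(now)
        return True


gate = CallGate(sampling_rate=0.1, max_calls_per_minute=100)
logged = sum(gate.should_log() for _ in range(1000))
print(f"Logged {logged} of 1000 calls")
```

Checking the sampling rate first keeps the common path cheap: most calls are rejected before the timestamp window is touched.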

Async Function Support

import asyncio
from imitator import monitor_function

@monitor_function
async def fetch_data(url: str) -> dict:
    """Simulate async data fetching"""
    await asyncio.sleep(0.1)
    return {"data": f"Response from {url}"}

# Async functions work seamlessly
async def main():
    result = await fetch_data("https://api.example.com")
    print(result)

asyncio.run(main())
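
One common way for a single decorator to serve both synchronous and asynchronous functions is to branch on `inspect.iscoroutinefunction`. The sketch below illustrates that pattern with the standard library only; it is an assumption about the general technique rather than Imitator's own code, and `timed` and `TIMINGS` are hypothetical names.

```python
import asyncio
import functools
import inspect
import time
from typing import Any, Callable, Dict, List

TIMINGS: List[Dict[str, Any]] = []  # hypothetical in-memory record store


def timed(func: Callable) -> Callable:
    """Illustrative decorator that times both sync and async functions."""

    def _record(start: float) -> None:
        TIMINGS.append({
            "function": func.__name__,
            "execution_time_ms": (time.perf_counter() - start) * 1000,
        })

    if inspect.iscoroutinefunction(func):
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return await func(*args, **kwargs)  # timing includes awaited work
            finally:
                _record(start)
        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            _record(start)
    return sync_wrapper


@timed
async def fetch_data(url: str) -> dict:
    await asyncio.sleep(0.01)
    return {"data": f"Response from {url}"}


result = asyncio.run(fetch_data("https://api.example.com"))
print(result, TIMINGS[0]["function"])
```

Because the async wrapper is itself a coroutine function, callers can keep using `await` exactly as they would with the undecorated function.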

Class Method Monitoring

from typing import List

from imitator import monitor_function

class DataProcessor:
    def __init__(self, name: str):
        self.name = name
    
    @monitor_function
    def process_batch(self, items: List[dict]) -> dict:
        """Process a batch of items"""
        processed = []
        for item in items:
            processed.append(self.process_item(item))
        return {"processed": len(processed), "results": processed}
    
    def process_item(self, item: dict) -> dict:
        # Helper method (not monitored)
        return {"id": item.get("id"), "processed_by": self.name}

processor = DataProcessor("BatchProcessor")
result = processor.process_batch([{"id": 1}, {"id": 2}])

Examining Logged Data

from imitator import LocalStorage

storage = LocalStorage()

# Get all monitored functions
functions = storage.get_all_functions()
print(f"Monitored functions: {functions}")

# Load calls for a specific function
calls = storage.load_calls("add_numbers")

for call in calls:
    print(f"Input: {call.io_record.inputs}")
    print(f"Output: {call.io_record.output}")
    print(f"Execution time: {call.io_record.execution_time_ms}ms")
    print(f"Timestamp: {call.io_record.timestamp}")

📊 Data Structure

The framework captures comprehensive information about each function call:

class FunctionCall(BaseModel):
    function_signature: FunctionSignature  # Function name, parameters, return type
    io_record: IORecord                    # Inputs, output, timestamp, execution time
    call_id: str                          # Unique identifier

class IORecord(BaseModel):
    inputs: Dict[str, Any]                # Function input parameters
    output: Any                           # Function return value
    timestamp: str                        # ISO format timestamp
    execution_time_ms: float              # Execution time in milliseconds
    input_modifications: Optional[Dict]    # Detected in-place modifications
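
The `input_modifications` field implies comparing arguments before and after the call. One plausible way to do that, sketched here with the standard library, is to deep-copy the bound arguments first and diff them afterwards. This is an assumption about the technique, not the library's actual logic; `detect_modifications` and `last_modifications` are hypothetical names.

```python
import copy
import functools
import inspect
from typing import Any, Callable, Dict


def detect_modifications(func: Callable) -> Callable:
    """Illustrative decorator: report arguments that were mutated in place."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        before = copy.deepcopy(dict(bound.arguments))  # snapshot of the inputs
        result = func(*args, **kwargs)
        # bound.arguments still references the caller's objects, so any
        # in-place change shows up as a difference from the snapshot.
        modifications: Dict[str, Dict[str, Any]] = {
            name: {"before": before[name], "after": value}
            for name, value in bound.arguments.items()
            if before[name] != value
        }
        wrapper.last_modifications = modifications or None
        return result

    wrapper.last_modifications = None
    return wrapper


@detect_modifications
def append_total(values: list) -> int:
    total = sum(values)
    values.append(total)  # mutates the caller's list
    return total


append_total([1, 2, 3])
print(append_total.last_modifications)
# prints {'values': {'before': [1, 2, 3], 'after': [1, 2, 3, 6]}}
```

Deep-copying every argument has a cost that grows with input size, which is one reason a monitor might make this check optional.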

💾 Storage Format

Logs are stored as JSON/JSONL files in the logs/ directory:

logs/
├── add_numbers_20241201.jsonl
├── process_data_20241201.jsonl
└── ...

Each log entry contains:

  • Function signature with type annotations
  • Input parameters with actual values
  • Output value or exception details
  • Execution time in milliseconds
  • Timestamp in ISO format
  • Error information (if applicable)
  • Input modifications (if detected)
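
If you prefer to read the log files directly rather than through `LocalStorage`, a sketch like the following may help. It assumes the `<function>_<YYYYMMDD>.jsonl` naming shown in the directory listing above and one JSON object per line; the helper names are hypothetical, and the demo writes made-up records to a temporary directory.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Iterator, List, Optional


def function_name_of(path: Path) -> Optional[str]:
    """Return the function name if the file matches '<name>_<YYYYMMDD>.jsonl'."""
    name, _, date = path.stem.rpartition("_")
    if path.suffix == ".jsonl" and name and len(date) == 8 and date.isdigit():
        return name
    return None


def list_logged_functions(log_dir: Path) -> List[str]:
    """List the distinct function names that have at least one log file."""
    names = {function_name_of(p) for p in log_dir.iterdir()}
    return sorted(n for n in names if n is not None)


def iter_records(log_dir: Path, function_name: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line, oldest file first."""
    for path in sorted(log_dir.iterdir()):
        if function_name_of(path) != function_name:
            continue
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                if line.strip():
                    yield json.loads(line)


# Demo against a throwaway directory with made-up records.
with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    (log_dir / "add_numbers_20241201.jsonl").write_text(
        '{"call_id": "1"}\n{"call_id": "2"}\n', encoding="utf-8")
    (log_dir / "add_numbers_20241202.jsonl").write_text(
        '{"call_id": "3"}\n', encoding="utf-8")
    (log_dir / "process_data_20241201.jsonl").write_text(
        '{"call_id": "4"}\n', encoding="utf-8")
    functions = list_logged_functions(log_dir)
    call_ids = [r["call_id"] for r in iter_records(log_dir, "add_numbers")]

print(functions, call_ids)
# prints ['add_numbers', 'process_data'] ['1', '2', '3']
```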

🚨 Error Handling

The framework gracefully handles and logs exceptions:

from imitator import monitor_function

@monitor_function
def divide_numbers(a: float, b: float) -> float:
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

try:
    result = divide_numbers(10, 0)
except ValueError:
    pass  # Exception is logged with input parameters and full traceback

Exception logs include:

  • Input parameters that caused the error
  • Exception type and message
  • Full traceback for debugging
  • Execution time until the exception occurred
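
The four items above can be captured with a `try`/`except` around the wrapped call. The following is a minimal sketch of that idea using only the standard library; `log_exceptions` and `ERROR_LOG` are hypothetical names, and the record layout is an assumption rather than Imitator's actual log schema.

```python
import functools
import time
import traceback
from typing import Any, Callable, Dict, List

ERROR_LOG: List[Dict[str, Any]] = []  # hypothetical in-memory error store


def log_exceptions(func: Callable) -> Callable:
    """Illustrative decorator: record failure details, then re-raise."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            ERROR_LOG.append({
                "function": func.__name__,
                "inputs": {"args": args, "kwargs": kwargs},
                "exception_type": type(exc).__name__,
                "exception_message": str(exc),
                "traceback": traceback.format_exc(),
                "execution_time_ms": (time.perf_counter() - start) * 1000,
            })
            raise  # callers still see the original exception

    return wrapper


@log_exceptions
def divide_numbers(a: float, b: float) -> float:
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b


try:
    divide_numbers(10, 0)
except ValueError:
    pass

print(ERROR_LOG[0]["exception_type"], ERROR_LOG[0]["exception_message"])
# prints ValueError Cannot divide by zero
```

Re-raising with a bare `raise` preserves the original traceback, so monitoring does not change what the caller has to handle.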

🏗️ Framework Components

Core Components

  • monitor_function: Main decorator for function monitoring
  • FunctionMonitor: Advanced monitoring with configuration options
  • FunctionCall: Pydantic model for complete function call records
  • IORecord: Pydantic model for input/output pairs
  • LocalStorage: Local file-based storage backend

Type System

The framework uses Pydantic for:

  • Runtime type validation and serialization
  • Automatic schema generation from function signatures
  • Type-safe data structures with validation
  • JSON serialization with complex type support
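
To illustrate what deriving a schema from a function signature can look like, here is a small sketch built on `inspect.signature` and `typing.get_type_hints`. It does not use Pydantic and is not the library's implementation; `describe_signature` is a hypothetical helper, and the output shape is modeled on the `function_signature` block in the example log.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints


def describe_signature(func: Callable) -> Dict[str, Any]:
    """Build a JSON-friendly description of a function's annotations."""
    hints = get_type_hints(func)
    parameters = {
        # Unannotated parameters fall back to typing.Any.
        name: str(hints.get(name, Any))
        for name in inspect.signature(func).parameters
    }
    return {
        "name": func.__name__,
        "parameters": parameters,
        "return_type": str(hints.get("return", Any)),
    }


def add_numbers(a: int, b: int) -> int:
    return a + b


print(describe_signature(add_numbers))
# prints {'name': 'add_numbers', 'parameters': {'a': "<class 'int'>", 'b': "<class 'int'>"}, 'return_type': "<class 'int'>"}
```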

📋 Example Output

Running monitored functions generates structured logs like:

{
  "function_signature": {
    "name": "add_numbers",
    "parameters": {
      "a": "<class 'int'>",
      "b": "<class 'int'>"
    },
    "return_type": "<class 'int'>"
  },
  "io_record": {
    "inputs": {
      "a": 5,
      "b": 3
    },
    "output": 8,
    "timestamp": "2024-01-15T10:30:45.123456",
    "execution_time_ms": 0.05,
    "input_modifications": null
  },
  "call_id": "1705312245.123456"
}
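
Records of this shape lend themselves to the training-data use case: each one already pairs inputs with an output. The sketch below shows one way to extract such pairs from JSONL lines. `to_training_pairs` is a hypothetical helper, and the sample lines are illustrative (the first mirrors the example above, the second is made up).

```python
import json
from typing import Any, Dict, Iterable, List, Tuple

# Illustrative sample lines in the shape of the example log entry.
SAMPLE_LINES = [
    json.dumps({"io_record": {"inputs": {"a": 5, "b": 3}, "output": 8,
                              "execution_time_ms": 0.05}}),
    json.dumps({"io_record": {"inputs": {"a": 2, "b": 2}, "output": 4,
                              "execution_time_ms": 0.15}}),
]


def to_training_pairs(lines: Iterable[str]) -> List[Tuple[Dict[str, Any], Any]]:
    """Turn JSONL log lines into (inputs, output) pairs."""
    pairs = []
    for line in lines:
        record = json.loads(line)["io_record"]
        pairs.append((record["inputs"], record["output"]))
    return pairs


def mean_execution_time_ms(lines: Iterable[str]) -> float:
    """Average the execution_time_ms field across all records."""
    times = [json.loads(line)["io_record"]["execution_time_ms"] for line in lines]
    return sum(times) / len(times) if times else 0.0


pairs = to_training_pairs(SAMPLE_LINES)
print(pairs)
# prints [({'a': 5, 'b': 3}, 8), ({'a': 2, 'b': 2}, 4)]
```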

🎯 Use Cases

🤖 Machine Learning

  • Training Data Collection: Gather input-output pairs for model training
  • Model Inference Monitoring: Track model performance and behavior
  • Feature Engineering: Monitor data preprocessing pipelines
  • A/B Testing: Compare different model versions

🔧 Development & Debugging

  • Function Profiling: Analyze performance bottlenecks
  • Debugging: Track function calls and parameter values
  • Integration Testing: Monitor system component interactions
  • Behavior Analysis: Understand function usage patterns

📊 Production Monitoring

  • System Health: Monitor critical business functions
  • Performance Tracking: Track execution times and error rates
  • User Behavior: Analyze how functions are used in production
  • Compliance: Maintain audit trails for regulatory requirements

🔬 Research & Analysis

  • Algorithm Analysis: Study algorithm behavior with real data
  • Performance Optimization: Identify optimization opportunities
  • Data Quality: Monitor data processing pipelines
  • Experimentation: Support research and development workflows

📚 Examples

The package includes comprehensive examples demonstrating various use cases:

Available Examples

  • basic_usage.py: Getting started with core features
  • advanced_monitoring.py: Advanced configuration and async support
  • real_world_simulation.py: Practical applications and systems

Run Examples

# Install the package
pip install imitator

# Clone repository for examples (if needed)
git clone https://github.com/yourusername/imitator.git
cd imitator/examples

# Run examples
python basic_usage.py
python advanced_monitoring.py
python real_world_simulation.py

Each example demonstrates:

  • Different monitoring strategies
  • Error handling scenarios
  • Performance analysis
  • Log inspection and analysis

🚀 Getting Started

  1. Install: pip install imitator
  2. Import: from imitator import monitor_function
  3. Decorate: Add @monitor_function to your functions
  4. Run: Use your functions normally
  5. Analyze: Check the generated logs in the logs/ directory

📖 Documentation

  • Examples: Comprehensive examples in the examples/ directory
  • API Reference: Detailed docstrings in all modules
  • Type Hints: Full type annotation support
  • Error Handling: Graceful handling of edge cases

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

📄 License

Apache License 2.0 - see LICENSE file for details.

Database Integration

Imitator supports streaming function call logs to various databases. The database connectors use non-blocking background operations for optimal performance.

Setting Up Local Databases

Use the provided docker-compose.db.yml file to start local database servers:

# Start all database servers
make db-start

# Stop all database servers  
make db-stop

# Test database connections
make db-test

# Clean database data
make db-clean

This will start:

  • PostgreSQL on port 5432 (user: postgres, password: password)
  • MongoDB on port 27017
  • Couchbase on port 8091 (user: admin, password: password)

Using Database Connectors

from imitator import (
    monitor_function,
    DatabaseStorage,
    PostgreSQLConnector,
    MongoDBConnector,
    CouchbaseConnector,
)

# PostgreSQL Example
postgres_connector = PostgreSQLConnector(
    connection_string="postgresql://postgres:password@localhost:5432/postgres",
    table_name="function_calls"
)
postgres_storage = DatabaseStorage(postgres_connector)

# MongoDB Example
mongo_connector = MongoDBConnector(
    connection_string="mongodb://localhost:27017/",
    database_name="function_monitor", 
    collection_name="calls"
)
mongo_storage = DatabaseStorage(mongo_connector)

# Couchbase Example
couchbase_connector = CouchbaseConnector(
    connection_string="couchbase://localhost?username=admin&password=password",
    bucket_name="function_bucket"
)
couchbase_storage = DatabaseStorage(couchbase_connector)

# Monitor function with database storage
@monitor_function(storage=postgres_storage)
def my_function(data):
    return process_data(data)

Database Dependencies

Install optional database dependencies as needed:

# Install all database dependencies
make db-install

# Or install individually
pip install psycopg2-binary    # PostgreSQL
pip install pymongo            # MongoDB  
pip install couchbase          # Couchbase

Connection Details

Each connector takes a connection string that tells it how to reach the database:

  • PostgreSQL: postgresql://user:password@host:port/database
  • MongoDB: mongodb://host:port/
  • Couchbase: couchbase://host1,host2?username=user&password=pass

All database operations are performed in background threads to avoid blocking your application.
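
A typical way to keep writes off the calling thread is a queue drained by a worker thread. The sketch below shows that pattern in isolation; it is an assumption about the general approach, not a description of `DatabaseStorage` internals, and `BackgroundWriter` is a hypothetical class. A plain list stands in for the database.

```python
import queue
import threading
from typing import Any, Callable, Dict, List


class BackgroundWriter:
    """Hypothetical helper: hand records to a worker thread instead of writing inline."""

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self, write: Callable[[Dict[str, Any]], None]) -> None:
        self._write = write
        self._queue: "queue.Queue[Any]" = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, record: Dict[str, Any]) -> None:
        """Enqueue a record; returns immediately without waiting for the write."""
        self._queue.put(record)

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            try:
                self._write(item)
            except Exception:
                # A failed write should not kill the worker; real code would log this.
                pass

    def close(self) -> None:
        """Flush everything already queued, then stop the worker."""
        self._queue.put(self._STOP)
        self._thread.join()


saved: List[Dict[str, Any]] = []  # stands in for a database table
writer = BackgroundWriter(saved.append)
for i in range(3):
    writer.submit({"call_id": str(i)})
writer.close()
print(saved)
# prints [{'call_id': '0'}, {'call_id': '1'}, {'call_id': '2'}]
```

With a single worker and a FIFO queue, records are written in the order they were submitted, and `close()` gives the application a point at which all pending writes are known to have been attempted.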
