A lightweight Python framework for monitoring and imitating function behavior with automatic I/O tracking and pattern learning
Project description
Imitator
A lightweight Python framework for monitoring and imitating function behavior with automatic I/O tracking and pattern learning. Perfect for collecting training data for machine learning models, debugging, performance analysis, and understanding function behavior in production systems with future capabilities for behavior imitation.
✨ Features
- 🎯 Simple Decorator: Just add
@monitor_functionto any function - 📊 Type Validation: Uses Pydantic models for robust type handling
- 💾 Flexible Storage: Local JSON/JSONL files with configurable backends
- ⚡ Performance Monitoring: Tracks execution times and performance metrics
- 🚨 Error Handling: Captures and logs exceptions with full context
- 🔄 Async Support: Full support for asynchronous functions
- 📈 Sampling & Rate Limiting: Control overhead with smart sampling
- 🏗️ Class Method Support: Monitor class methods with proper handling
- 🔍 Modification Detection: Detect in-place parameter modifications
- 🪶 Minimal Dependencies: Only requires Pydantic (≥2.0.0)
🚀 Installation
pip install imitator
Requirements: Python 3.8+, Pydantic ≥2.0.0
⚡ Quick Start
from imitator import monitor_function
@monitor_function
def add_numbers(a: int, b: int) -> int:
return a + b
# Use the function normally
result = add_numbers(5, 3) # Automatically logged!
That's it! Your function calls are now being monitored and logged automatically.
📖 Usage Examples
Basic Function Monitoring
from imitator import monitor_function
from typing import List, Dict
@monitor_function
def process_data(data: List[float], multiplier: float = 1.0) -> Dict[str, float]:
"""Process a list of numbers and return statistics"""
if not data:
return {"mean": 0.0, "sum": 0.0, "count": 0}
total = sum(x * multiplier for x in data)
mean = total / len(data)
return {
"mean": mean,
"sum": total,
"count": len(data)
}
# Function calls are automatically logged
result = process_data([1.0, 2.0, 3.0], 2.0)
Advanced Configuration
from imitator import monitor_function, LocalStorage, FunctionMonitor
# Custom storage location
custom_storage = LocalStorage(log_dir="my_logs", format="json")
@monitor_function(storage=custom_storage)
def my_function(x: int) -> int:
return x * 2
# Rate limiting and sampling for high-frequency functions
monitor = FunctionMonitor(
sampling_rate=0.1, # Log 10% of calls
max_calls_per_minute=100 # Max 100 calls per minute
)
@monitor.monitor
def high_frequency_function(x: int) -> int:
return x ** 2
Async Function Support
import asyncio
from imitator import monitor_function
@monitor_function
async def fetch_data(url: str) -> dict:
"""Simulate async data fetching"""
await asyncio.sleep(0.1)
return {"data": f"Response from {url}"}
# Async functions work seamlessly
async def main():
result = await fetch_data("https://api.example.com")
print(result)
asyncio.run(main())
Class Method Monitoring
from imitator import monitor_function
class DataProcessor:
def __init__(self, name: str):
self.name = name
@monitor_function
def process_batch(self, items: List[dict]) -> dict:
"""Process a batch of items"""
processed = []
for item in items:
processed.append(self.process_item(item))
return {"processed": len(processed), "results": processed}
def process_item(self, item: dict) -> dict:
# Helper method (not monitored)
return {"id": item.get("id"), "processed_by": self.name}
processor = DataProcessor("BatchProcessor")
result = processor.process_batch([{"id": 1}, {"id": 2}])
Examining Logged Data
from imitator import LocalStorage
storage = LocalStorage()
# Get all monitored functions
functions = storage.get_all_functions()
print(f"Monitored functions: {functions}")
# Load calls for a specific function
calls = storage.load_calls("add_numbers")
for call in calls:
print(f"Input: {call.io_record.inputs}")
print(f"Output: {call.io_record.output}")
print(f"Execution time: {call.io_record.execution_time_ms}ms")
print(f"Timestamp: {call.io_record.timestamp}")
📊 Data Structure
The framework captures comprehensive information about each function call:
class FunctionCall(BaseModel):
function_signature: FunctionSignature # Function name, parameters, return type
io_record: IORecord # Inputs, output, timestamp, execution time
call_id: str # Unique identifier
class IORecord(BaseModel):
inputs: Dict[str, Any] # Function input parameters
output: Any # Function return value
timestamp: str # ISO format timestamp
execution_time_ms: float # Execution time in milliseconds
input_modifications: Optional[Dict] # Detected in-place modifications
💾 Storage Format
Logs are stored as JSON/JSONL files in the logs/ directory:
logs/
├── add_numbers_20241201.jsonl
├── process_data_20241201.jsonl
└── ...
Each log entry contains:
- Function signature with type annotations
- Input parameters with actual values
- Output value or exception details
- Execution time in milliseconds
- Timestamp in ISO format
- Error information (if applicable)
- Input modifications (if detected)
🚨 Error Handling
The framework gracefully handles and logs exceptions:
@monitor_function
def divide_numbers(a: float, b: float) -> float:
if b == 0:
raise ValueError("Cannot divide by zero")
return a / b
try:
result = divide_numbers(10, 0)
except ValueError:
pass # Exception is logged with input parameters and full traceback
Exception logs include:
- Input parameters that caused the error
- Exception type and message
- Full traceback for debugging
- Execution time until the exception occurred
🏗️ Framework Components
Core Components
monitor_function: Main decorator for function monitoringFunctionMonitor: Advanced monitoring with configuration optionsFunctionCall: Pydantic model for complete function call recordsIORecord: Pydantic model for input/output pairsLocalStorage: Local file-based storage backend
Type System
The framework uses Pydantic for:
- Runtime type validation and serialization
- Automatic schema generation from function signatures
- Type-safe data structures with validation
- JSON serialization with complex type support
📋 Example Output
Running monitored functions generates structured logs like:
{
"function_signature": {
"name": "add_numbers",
"parameters": {
"a": "<class 'int'>",
"b": "<class 'int'>"
},
"return_type": "<class 'int'>"
},
"io_record": {
"inputs": {
"a": 5,
"b": 3
},
"output": 8,
"timestamp": "2024-01-15T10:30:45.123456",
"execution_time_ms": 0.05,
"input_modifications": null
},
"call_id": "1705312245.123456"
}
🎯 Use Cases
🤖 Machine Learning
- Training Data Collection: Gather input-output pairs for model training
- Model Inference Monitoring: Track model performance and behavior
- Feature Engineering: Monitor data preprocessing pipelines
- A/B Testing: Compare different model versions
🔧 Development & Debugging
- Function Profiling: Analyze performance bottlenecks
- Debugging: Track function calls and parameter values
- Integration Testing: Monitor system component interactions
- Behavior Analysis: Understand function usage patterns
📊 Production Monitoring
- System Health: Monitor critical business functions
- Performance Tracking: Track execution times and error rates
- User Behavior: Analyze how functions are used in production
- Compliance: Maintain audit trails for regulatory requirements
🔬 Research & Analysis
- Algorithm Analysis: Study algorithm behavior with real data
- Performance Optimization: Identify optimization opportunities
- Data Quality: Monitor data processing pipelines
- Experimentation: Support research and development workflows
📚 Examples
The package includes comprehensive examples demonstrating various use cases:
Available Examples
basic_usage.py: Getting started with core featuresadvanced_monitoring.py: Advanced configuration and async supportreal_world_simulation.py: Practical applications and systems
Run Examples
# Install the package
pip install imitator
# Clone repository for examples (if needed)
git clone https://github.com/yourusername/imitator.git
cd imitator/examples
# Run examples
python basic_usage.py
python advanced_monitoring.py
python real_world_simulation.py
Each example demonstrates:
- Different monitoring strategies
- Error handling scenarios
- Performance analysis
- Log inspection and analysis
🚀 Getting Started
- Install:
pip install imitator - Import:
from imitator import monitor_function - Decorate: Add
@monitor_functionto your functions - Run: Use your functions normally
- Analyze: Check the generated logs in the
logs/directory
📖 Documentation
- Examples: Comprehensive examples in the
examples/directory - API Reference: Detailed docstrings in all modules
- Type Hints: Full type annotation support
- Error Handling: Graceful handling of edge cases
🤝 Contributing
We welcome contributions! Please see our Contributing Guidelines for details.
📄 License
Apache License 2.0 - see LICENSE file for details.
Database Integration
Imitator supports streaming function call logs to various databases. The database connectors use non-blocking background operations for optimal performance.
Setting Up Local Databases
Use the provided docker-compose.db.yml file to start local database servers:
# Start all database servers
make db-start
# Stop all database servers
make db-stop
# Test database connections
make db-test
# Clean database data
make db-clean
This will start:
- PostgreSQL on port 5432 (user: postgres, password: password)
- MongoDB on port 27017
- Couchbase on port 8091 (user: admin, password: password)
Using Database Connectors
from imitator import monitor_function, DatabaseStorage, PostgreSQLConnector, MongoDBConnector, CouchbaseConnector
# PostgreSQL Example
postgres_connector = PostgreSQLConnector(
connection_string="postgresql://postgres:password@localhost:5432/postgres",
table_name="function_calls"
)
postgres_storage = DatabaseStorage(postgres_connector)
# MongoDB Example
mongo_connector = MongoDBConnector(
connection_string="mongodb://localhost:27017/",
database_name="function_monitor",
collection_name="calls"
)
mongo_storage = DatabaseStorage(mongo_connector)
# Couchbase Example
couchbase_connector = CouchbaseConnector(
connection_string="couchbase://localhost?username=admin&password=password",
bucket_name="function_bucket"
)
couchbase_storage = DatabaseStorage(couchbase_connector)
# Monitor function with database storage
@monitor_function(storage=postgres_storage)
def my_function(data):
return process_data(data)
Database Dependencies
Install optional database dependencies as needed:
# Install all database dependencies
make db-install
# Or install individually
pip install psycopg2-binary # PostgreSQL
pip install pymongo # MongoDB
pip install couchbase # Couchbase
Connection Details
The storage connection knows how to connect through the connection strings provided to each connector:
- PostgreSQL:
postgresql://user:password@host:port/database - MongoDB:
mongodb://host:port/ - Couchbase:
couchbase://host1,host2?username=user&password=pass
All database operations are performed in background threads to avoid blocking your application.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file imitator-0.2.0.tar.gz.
File metadata
- Download URL: imitator-0.2.0.tar.gz
- Upload date:
- Size: 62.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
28607ab46bb178be2ce27be3575bc67847c4f7a2e8f03c485b9e827b281f4ea1
|
|
| MD5 |
188e4a7f6c164f69968c1860473eeaa9
|
|
| BLAKE2b-256 |
df124993bdd5ec64f6a8150d556bb4113079984942d9646079c6bf1524f40e50
|
File details
Details for the file imitator-0.2.0-py3-none-any.whl.
File metadata
- Download URL: imitator-0.2.0-py3-none-any.whl
- Upload date:
- Size: 22.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
7ed5bf26fe46d270b50e123753d16203ec57088b4c6b11a6e7b2c09d1c150b0b
|
|
| MD5 |
f077bad7b4126351b9d8fb7ce72376f6
|
|
| BLAKE2b-256 |
6f6fc6c077b918e1a9abdbabe77afb8aab666ad0d031c9d0111ec039ec04b352
|