Advanced Python cache system with multi-tenant support, scope hierarchies, and configurable eviction policies

Hierarchical Multi-Tenant Cache System

An advanced Python cache system with support for scope hierarchies, configurable eviction policies, thread-safe operations, and cache stampede protection.

🎯 Project Concept

This project implements an enterprise cache system that solves common problems in multi-tenant applications:

  • Data isolation between different organizations/users
  • Intelligent memory management with eviction policies
  • Cache stampede protection in concurrent environments
  • Storage flexibility through pluggable interfaces
  • Observability with detailed metrics

Key Features

✅ Multi-Tenant: Automatic isolation by organization, user, session, etc.
✅ Thread-Safe: Atomic operations with granular locks
✅ Async/Sync: Complete support for synchronous and asynchronous code
✅ Stampede Protection: Prevents unnecessary recalculations in high concurrency
✅ Eviction Policies: LRU implemented, extensible to LFU, FIFO, etc.
✅ Flexible TTL: Automatic expiration with background cleanup
✅ Smart Decorator: Transparent caching for functions
✅ Observability: Metrics for hits, misses, evictions

🏗️ Architecture

The system follows the Dependency Injection pattern with well-defined interfaces:

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│     Cache       │────│  PolicyManager   │────│ EvictionPolicy  │
│  (Orchestrator) │    │   (Lifecycle)    │    │     (LRU)       │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │
         ▼
┌─────────────────┐    ┌──────────────────┐
│ StorageProvider │    │   ScopeConfig    │
│   (InMemory)    │    │  (Hierarchies)   │
└─────────────────┘    └──────────────────┘

Main Components

1. Cache (Central Orchestrator)

  • Coordinates all operations
  • Manages stampede protection locks
  • Maintains statistics (hits/misses/evictions)
  • Delegates storage and policies

2. Storage Providers

  • InMemory: In-memory storage with per-key locks
  • Extensible interface for Redis, Memcached, etc.

3. Eviction Policies

  • LRU: Least Recently Used implemented
  • Interface for LFU, FIFO, TTL-based, etc.

4. Scope Configuration

  • Flexible hierarchies: global → org → user → session
  • Multiple independent trees
  • Automatic parameter validation

5. Policy Manager

  • Manages background cleanup
  • Coordinates global and per-namespace limits
  • Daemon thread for automatic expiration
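
The Policy Manager's background cleanup can be pictured with a small standalone sketch. It does not use this package's classes: `ExpiringStore`, `sweep`, and `close` are hypothetical names chosen for illustration, and the real `CachePolicyManager` may be organized differently. The idea shown is a daemon thread that wakes at a fixed interval and removes expired entries while holding a lock.

```python
import threading
import time
from typing import Any, Dict, Tuple


class ExpiringStore:
    """Hypothetical store whose daemon thread periodically drops expired entries."""

    def __init__(self, cleanup_interval: float) -> None:
        self._items: Dict[str, Tuple[Any, float]] = {}
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._interval = cleanup_interval
        # daemon=True: the thread never keeps the interpreter alive on exit
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        with self._lock:
            self._items[key] = (value, time.monotonic() + ttl_seconds)

    def __len__(self) -> int:
        with self._lock:
            return len(self._items)

    def sweep(self) -> int:
        """Delete every expired entry and return how many were removed."""
        now = time.monotonic()
        with self._lock:
            expired = [k for k, (_, expiry) in self._items.items() if expiry <= now]
            for key in expired:
                del self._items[key]
        return len(expired)

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once close() sets the event
        while not self._stop.wait(self._interval):
            self.sweep()

    def close(self) -> None:
        self._stop.set()
        self._thread.join()


store = ExpiringStore(cleanup_interval=0.01)
store.set("short", 1, ttl_seconds=0.02)
store.set("long", 2, ttl_seconds=60)

# Poll briefly until the background thread has removed the short-lived entry
deadline = time.monotonic() + 2.0
while len(store) > 1 and time.monotonic() < deadline:
    time.sleep(0.01)
remaining = len(store)
store.close()
```

Waiting on an `Event` rather than calling `time.sleep` lets `close()` stop the thread promptly instead of waiting out a full interval.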

📊 Data Structures

Fundamental Types

from typing import Any, Tuple, Union

# Internal cache key: (scope_path, namespace, args_hash)
_CacheKey = Tuple[str, str, Tuple[Any, ...]]

# Stored value: (data, expiry_timestamp)
_CacheValue = Tuple[Any, float]

# Scope: string or tuple of levels
_CacheScope = Union[str, Tuple[str, ...]]
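
As a rough illustration of how a `(data, expiry_timestamp)` pair can drive TTL expiration, here is a minimal sketch. The helpers `make_value` and `read_value` are hypothetical and not part of this package; the optional `now` parameter exists only so the example is deterministic.

```python
import time
from typing import Any, Dict, Optional, Tuple

CacheValue = Tuple[Any, float]  # (data, expiry_timestamp), mirroring _CacheValue


def make_value(data: Any, ttl_seconds: float, now: Optional[float] = None) -> CacheValue:
    """Pair the data with the absolute time at which it expires."""
    current = time.time() if now is None else now
    return (data, current + ttl_seconds)


def read_value(store: Dict[str, CacheValue], key: str,
               now: Optional[float] = None) -> Optional[Any]:
    """Return the data if present and fresh; drop an expired entry lazily."""
    entry = store.get(key)
    if entry is None:
        return None
    data, expiry = entry
    current = time.time() if now is None else now
    if current >= expiry:
        del store[key]
        return None
    return data


# Entry written at t=1000 with a 300 second TTL, so it expires at t=1300
store: Dict[str, CacheValue] = {"greeting": make_value("hello", ttl_seconds=300, now=1000.0)}
fresh = read_value(store, "greeting", now=1200.0)  # before the expiry time
stale = read_value(store, "greeting", now=1300.0)  # at the expiry time
```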

Internal Key Examples

# Global cache
("global", "expensive_function", (b"pickled_args...",))

# Organization cache
("organization:org_123", "user_data", (b"pickled_args...",))

# User cache
("organization:org_123/user:user_456", "preferences", (b"pickled_args...",))

# Programmatic cache
("organization:org_123", "__programmatic__", ("user_settings",))
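
A sketch of how keys shaped like the examples above could be assembled. This is an assumption about the approach, not the package's actual `_make_args_key`: the function names are hypothetical, and sorting the keyword arguments is a choice made here so that argument order does not change the key.

```python
import pickle
from typing import Any, Dict, Tuple

CacheKey = Tuple[str, str, Tuple[Any, ...]]


def make_decorator_key(scope_path: str, func_name: str,
                       args: Tuple[Any, ...], kwargs: Dict[str, Any]) -> CacheKey:
    """Key for a decorated call: pickling makes unhashable arguments usable in a key."""
    # Sort kwargs so f(a=1, b=2) and f(b=2, a=1) map to the same key
    payload = pickle.dumps((args, tuple(sorted(kwargs.items()))))
    return (scope_path, func_name, (payload,))


def make_programmatic_key(scope_path: str, name: str) -> CacheKey:
    """Key for set/get style calls, which share one fixed namespace."""
    return (scope_path, "__programmatic__", (name,))


k1 = make_decorator_key("global", "expensive_function", (3,), {"a": 1, "b": 2})
k2 = make_decorator_key("global", "expensive_function", (3,), {"b": 2, "a": 1})
k3 = make_decorator_key("global", "expensive_function", (4,), {"a": 1, "b": 2})
settings_key = make_programmatic_key("organization:org_123", "user_settings")
```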

Scope Hierarchy

# Example configuration
scope_config = ScopeConfig([
    ScopeLevel("organization", "org_id", [
        ScopeLevel("user", "user_id", [
            ScopeLevel("session", "session_id")
        ])
    ]),
    ScopeLevel("tenant", "tenant_id", [
        ScopeLevel("project", "project_id")
    ])
])

# Results in paths like:
# "global"
# "organization:123"
# "organization:123/user:456"
# "organization:123/user:456/session:789"
# "tenant:abc/project:xyz"
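
To make the path format concrete, the sketch below resolves a scope name to a path by walking a tree of levels. `Level`, `find_chain`, and `build_scope_path` are hypothetical stand-ins written for this example; the package's `ScopeConfig` and `ScopeLevel` may resolve and validate paths differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Level:
    """Hypothetical stand-in for ScopeLevel: level name, parameter name, children."""
    name: str
    param: str
    children: List["Level"] = field(default_factory=list)


def find_chain(levels: List[Level], target: str) -> Optional[List[Level]]:
    """Depth-first search for the root-to-target chain of levels."""
    for level in levels:
        if level.name == target:
            return [level]
        below = find_chain(level.children, target)
        if below is not None:
            return [level] + below
    return None


def build_scope_path(levels: List[Level], scope: str, params: Dict[str, str]) -> str:
    """Join 'name:value' segments from the root down to the requested scope."""
    if scope == "global":
        return "global"
    chain = find_chain(levels, scope)
    if chain is None:
        raise ValueError(f"unknown scope: {scope!r}")
    missing = [lvl.param for lvl in chain if lvl.param not in params]
    if missing:
        raise ValueError(f"missing scope parameters: {missing}")
    return "/".join(f"{lvl.name}:{params[lvl.param]}" for lvl in chain)


tree = [
    Level("organization", "org_id", [Level("user", "user_id", [Level("session", "session_id")])]),
    Level("tenant", "tenant_id", [Level("project", "project_id")]),
]
user_path = build_scope_path(tree, "user", {"org_id": "123", "user_id": "456"})
project_path = build_scope_path(tree, "project", {"tenant_id": "abc", "project_id": "xyz"})
```

Requiring every parameter along the chain is what keeps a user-level entry from being stored without its organization.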

🚀 Basic Usage

Initial Setup

from cache import create_cache
from storages.in_memory import InMemory
from eviction_policies.lre_eviction import LRUEvictionPolicy
from cache_policies.cache_policy_manager import CachePolicyManager
from cache_scopes.scope_config import ScopeConfig, ScopeLevel

# Configure scope hierarchy
scope_config = ScopeConfig([
    ScopeLevel("organization", "org_id", [
        ScopeLevel("user", "user_id")
    ])
])

# Create components
storage = InMemory()
eviction_policy = LRUEvictionPolicy()
policy_manager = CachePolicyManager(
    cache_instance=None,  # Will be set automatically
    cleanup_interval=60,  # Cleanup every 60 seconds
    policy=eviction_policy,
    max_size=10000  # Maximum 10k items globally
)

# Create and configure cache
cache = create_cache(
    backend=storage,
    policy_manager=policy_manager,
    scope_config=scope_config
)

Cache with Decorator

import random
import time

import httpx

# Simple global cache
@cache.cache(ttl_seconds=300, scope="global")
def expensive_calculation(x, y):
    time.sleep(2)  # Simulate expensive operation
    return x * y + random.random()

# User-scoped cache
@cache.cache(ttl_seconds=600, scope="user", max_items=100)
def get_user_preferences(user_data, organization_id=None, user_id=None):
    # organization_id and user_id are automatically extracted for scope
    return fetch_preferences_from_db(user_data)  # placeholder for your own data access

# Async cache
@cache.cache(ttl_seconds=180, scope="organization")
async def fetch_org_data(org_slug, organization_id=None):
    # Relative path: set base_url on the client or use a full URL
    async with httpx.AsyncClient() as client:
        response = await client.get(f"/api/orgs/{org_slug}")
        return response.json()

Programmatic Cache

# Synchronous operations
cache.set("user_settings", {"theme": "dark"}, 3600, 
          scope="user", organization_id="org_123", user_id="user_456")

settings = cache.get("user_settings", 
                    scope="user", organization_id="org_123", user_id="user_456")

# Asynchronous operations
await cache.aset("session_data", {"cart": []}, 1800,
                scope="session", organization_id="org_123", 
                user_id="user_456", session_id="sess_789")

data = await cache.aget("session_data",
                       scope="session", organization_id="org_123",
                       user_id="user_456", session_id="sess_789")

Scope-based Eviction

# Remove all data from an organization
evicted_count = cache.evict_by_scope("organization", organization_id="org_123")

# Remove data from a specific user
cache.evict_by_scope("user", organization_id="org_123", user_id="user_456")

# Clear everything
cache.clear()
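
One way scope-based eviction could work on keys of the form `(scope_path, namespace, args)` is a prefix match on the scope path. The sketch below is an assumption, not the package's implementation: `evict_by_scope_path` is a hypothetical helper, and it assumes that evicting a scope also removes every scope nested under it.

```python
from typing import Any, Dict, Tuple

Key = Tuple[str, str, Tuple[Any, ...]]


def evict_by_scope_path(store: Dict[Key, Any], scope_path: str) -> int:
    """Remove entries stored at scope_path or at any scope nested below it."""
    # Match on "path/" so "organization:org_123" does not hit "organization:org_1234"
    prefix = scope_path + "/"
    doomed = [k for k in store if k[0] == scope_path or k[0].startswith(prefix)]
    for key in doomed:
        del store[key]
    return len(doomed)


store: Dict[Key, Any] = {
    ("organization:org_123", "user_data", ("a",)): 1,
    ("organization:org_123/user:user_456", "preferences", ("b",)): 2,
    ("organization:org_1234", "user_data", ("c",)): 3,  # different organization
    ("global", "expensive_function", ("d",)): 4,
}
removed = evict_by_scope_path(store, "organization:org_123")
```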

📈 Observability

Available Metrics

stats = cache.stats()
print(stats)
# {
#     "hits": 1250,
#     "misses": 180,
#     "evictions": 45,
#     "current_size": 8934,
#     "tracked_namespaces": 23,
#     "total_calc_locks": 12
# }

# Hit rate (guarded against division by zero before any lookups happen)
total = stats["hits"] + stats["misses"]
hit_rate = stats["hits"] / total if total else 0.0
print(f"Hit Rate: {hit_rate:.2%}")

Structured Logs

import logging
logging.basicConfig(level=logging.INFO)

# The system generates logs for:
# - Cache hits/misses
# - Automatic evictions
# - Background cleanup
# - Serialization errors
# - Configuration operations

🔧 Advanced Configuration

Custom Limits

# Global limit
policy_manager = CachePolicyManager(
    cache_instance=cache,
    cleanup_interval=30,
    policy=LRUEvictionPolicy(),
    max_size=50000  # Maximum 50k items
)

# Per-function/namespace limit
@cache.cache(ttl_seconds=300, scope="global", max_items=1000)
def limited_function():
    pass
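
To show what a size limit combined with an LRU policy does in practice, here is a minimal standalone sketch built on `collections.OrderedDict`. `LRUStore` is a hypothetical class for illustration only; it is not the package's `LRUEvictionPolicy`, and it applies a single limit rather than separate global and per-namespace limits.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUStore:
    """Tiny LRU map: evicts the least recently used key once max_size is exceeded."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self.evictions = 0
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)  # drop the least recently used entry
            self.evictions += 1

    def __contains__(self, key: Hashable) -> bool:
        return key in self._items


store = LRUStore(max_size=2)
store.set("a", 1)
store.set("b", 2)
store.get("a")     # "a" becomes the most recently used key
store.set("c", 3)  # exceeds max_size, so "b" is evicted
```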

Multiple Hierarchies

# Support for multiple independent scope trees
scope_config = ScopeConfig([
    # User tree
    ScopeLevel("organization", "org_id", [
        ScopeLevel("user", "user_id", [
            ScopeLevel("session", "session_id")
        ])
    ]),
    # Resource tree
    ScopeLevel("tenant", "tenant_id", [
        ScopeLevel("project", "project_id", [
            ScopeLevel("environment", "env_id")
        ])
    ])
])

Custom Serialization

import pickle

# The system uses pickle by default, but can be extended
class CustomCache(Cache):
    def _make_args_key(self, *args, **kwargs):
        # Custom implementation for specific types
        try:
            return super()._make_args_key(*args, **kwargs)
        except (TypeError, AttributeError, pickle.PicklingError):
            # Fallback for non-pickleable types (pickle can raise any of these)
            return (str(args), str(sorted(kwargs.items())))

🧪 Tests

The project includes a complete test suite:

# Run all tests
make test

# Specific tests
make test-unit          # Unit tests only
make test-integration   # Integration tests
make test-performance   # Performance tests

# With coverage
make test-coverage

Test Structure

  • Unit Tests: Each component in isolation
  • Integration Tests: Complete system working together
  • Performance Tests: Benchmarks and stress tests
  • Concurrency Tests: Thread safety and race conditions

🔒 Thread Safety

Synchronization Strategies

  1. Per-Key Locks: Each key has its own lock
  2. Instance Lock: For global operations (clear, stats)
  3. Calculation Locks: Stampede protection per function
  4. Atomic Operations: Thread-safe storage providers

Stampede Protection

# Multiple threads calling the same function simultaneously
# Only one executes, others wait for the result
from concurrent.futures import ThreadPoolExecutor

import requests

@cache.cache(ttl_seconds=300)
def expensive_api_call(endpoint):
    # Only one thread will execute this at a time for the same endpoint
    return requests.get(endpoint).json()

# 100 simultaneous threads = 1 API call
# ("/api/data" is a placeholder; requests needs a full URL)
with ThreadPoolExecutor(max_workers=100) as pool:
    results = list(pool.map(expensive_api_call, ["/api/data"] * 100))
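
The mechanism behind stampede protection can be sketched without the library: one lock per key plus a second lookup after the lock is acquired. `SingleFlightCache` and `get_or_compute` are hypothetical names, TTL handling is left out for brevity, and the package's own calculation locks may be organized differently.

```python
import threading
import time
from typing import Any, Callable, Dict, Hashable, List


class SingleFlightCache:
    """Hypothetical cache in which each missing key is computed by one thread only."""

    def __init__(self) -> None:
        self._values: Dict[Hashable, Any] = {}
        self._locks: Dict[Hashable, threading.Lock] = {}
        self._guard = threading.Lock()  # protects the two dicts above

    def _lock_for(self, key: Hashable) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        with self._guard:
            if key in self._values:
                return self._values[key]  # fast path: cache hit
        with self._lock_for(key):
            # Re-check: another thread may have filled the key while we waited
            with self._guard:
                if key in self._values:
                    return self._values[key]
            value = compute()  # runs outside _guard, so other keys are not blocked
            with self._guard:
                self._values[key] = value
            return value


calls: List[int] = []


def slow_load() -> str:
    calls.append(1)   # record each real computation
    time.sleep(0.05)  # simulate a slow backend
    return "payload"


flight_cache = SingleFlightCache()
results: List[str] = []


def worker() -> None:
    results.append(flight_cache.get_or_compute("/api/data", slow_load))


threads = [threading.Thread(target=worker) for _ in range(20)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The second lookup inside the per-key lock is what stops threads that queued behind the first caller from recomputing the value.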

🚀 Performance

Typical Benchmarks

  • Basic operations: >10,000 ops/sec
  • Concurrent operations: >5,000 ops/sec
  • Memory usage: <500MB for 50k items
  • Latency: <1ms for hits, <10ms for misses

Optimizations

  • __slots__ for memory efficiency
  • Granular locks for high concurrency
  • Optimized serialization with pickle
  • Non-blocking background cleanup

🔌 Extensibility

New Storage Providers

class RedisStorage(IStorageProvider):
    def __init__(self, redis_client):
        self.redis = redis_client
    
    def get(self, key: _CacheKey) -> Optional[_CacheValue]:
        # Redis implementation
        pass

New Eviction Policies

class LFUEvictionPolicy(IEvictionPolicy):
    def notify_set(self, key, namespace, max_items, global_max_size):
        # Least Frequently Used implementation
        pass
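
For a sense of what a Least Frequently Used policy has to track, here is a standalone sketch. It deliberately does not implement `IEvictionPolicy`, whose full method set is not shown here; `LFUStore` is a hypothetical class, and the linear scan for the victim is a simplification that would be replaced by a frequency-bucket structure in a large cache.

```python
from collections import Counter
from typing import Any, Dict, Hashable, Optional


class LFUStore:
    """Tiny LFU map: when full, evicts the key with the fewest recorded uses."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self._items: Dict[Hashable, Any] = {}
        self._uses: Counter = Counter()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._uses[key] += 1
        return self._items[key]

    def set(self, key: Hashable, value: Any) -> None:
        if key not in self._items and len(self._items) >= self.max_size:
            # O(n) scan; ties go to the key that was inserted first
            victim = min(self._items, key=lambda k: self._uses[k])
            del self._items[victim]
            del self._uses[victim]
        self._items[key] = value
        self._uses[key] += 1

    def __contains__(self, key: Hashable) -> bool:
        return key in self._items


store = LFUStore(max_size=2)
store.set("a", 1)
store.set("b", 2)
store.get("a")
store.get("a")     # "a" now has three recorded uses, "b" has one
store.set("c", 3)  # store is full, so the least used key "b" is evicted
```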

📝 License

This project is open source and available under the MIT license.

🤝 Contributing

  1. Fork the project
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Guidelines

  • Maintain test coverage >95%
  • Follow existing code conventions
  • Add documentation for new features
  • Run make check-all before submitting


Developed with ❤️ for high-performance Python applications
