Unified httpx cache (TTL/ETag) + DuckDB mirror (raw+normalized) with SQL/LLM helpers

⧊where (awhere)*: cachedx

Unified HTTP caching with DuckDB mirroring and LLM helpers.

* ⧊where (awhere) is pronounced aware (uh-wehr).

Python 3.12+ · MIT License

cachedx 🚀

cachedx provides intelligent HTTP caching with automatic database mirroring, making it easy to cache API responses and query them with SQL.

Why cachedx?

Most apps repeatedly hit REST APIs and lose visibility into response data:

# Traditional approach ❌
response = await client.get("/api/users")
users = response.json()  # Data lost after processing

# With cachedx ✅
response = await cached_client.get("/api/users")  # Automatically cached
users_df = client.query("SELECT * FROM users WHERE active = true")  # Query with SQL!

Key Features

  • 🚄 Zero-config caching - Works out of the box with sensible defaults
  • 🔄 Dual storage - HTTP cache + normalized tables for fast queries
  • 🧠 Auto-inference - Automatically creates schemas from JSON responses
  • 🛡️ LLM-safe - Built-in SQL safety for LLM-generated queries
  • ⚡ High performance - Cache hits < 1ms, powered by DuckDB
  • 🏗️ Production ready - Comprehensive Pydantic validation throughout

Installation

# With pip
pip install cachedx

# With uv (recommended)
uv add cachedx

# With optional dependencies
pip install 'cachedx[pandas]'  # For DataFrame support
pip install 'cachedx[dev]'     # For development

Requires: Python 3.12+, DuckDB 1.0+, httpx 0.27+

Quick Start

Basic HTTP Caching

from cachedx.httpcache import CachedClient

async with CachedClient(base_url="https://api.github.com") as client:
    # First call hits API and caches response
    response = await client.get("/users/octocat")

    # Second call returns cached data (< 1ms)
    response = await client.get("/users/octocat")

    # Query cached data with SQL!
    users = client.query("SELECT * FROM users_octocat LIMIT 10")
    print(users)

Advanced Configuration

from datetime import timedelta
from cachedx.httpcache import CachedClient, CacheConfig, CacheStrategy, EndpointConfig

config = CacheConfig(
    default_ttl=timedelta(minutes=5),
    enable_logging=True,
    endpoints={
        "/api/users": EndpointConfig(
            strategy=CacheStrategy.CACHED,
            ttl=timedelta(minutes=10),
            table_name="users"
        ),
        "/api/metadata": EndpointConfig(
            strategy=CacheStrategy.STATIC  # Cache forever
        ),
        "/api/realtime/*": EndpointConfig(
            strategy=CacheStrategy.REALTIME  # Always fetch, but store
        ),
    }
)

async with CachedClient(base_url="https://api.example.com", cache_config=config) as client:
    response = await client.get("/api/users")  # Cached for 10 minutes
    df = client.query("SELECT name, email FROM users WHERE active = true")

Resource Mirroring with Auto-Inference

from cachedx.mirror import hybrid_cache, register, Mapping

# Option 1: Let cachedx infer the schema automatically
@hybrid_cache(resource="users", auto_register=True)
async def get_users(client):
    return await client.get("/api/users")

# Option 2: Define explicit schema mapping
register("forecasts", Mapping(
    table="forecasts",
    columns={
        "id": "$.id",
        "sku": "$.sku",
        "method": "$.method",
        "status": "$.status",
        "updated_at": "CAST(j->>'updated_at' AS TIMESTAMP)",
    },
    ddl="""
    CREATE TABLE forecasts (
        id TEXT PRIMARY KEY,
        sku TEXT NOT NULL,
        method TEXT,
        status TEXT,
        updated_at TIMESTAMP
    )
    """
))

@hybrid_cache(resource="forecasts")
async def get_forecasts(client):
    return await client.get("/api/forecasts")

# Use the decorated functions
await get_users(client)      # Data automatically mirrored
await get_forecasts(client)  # Uses explicit schema

# Query the mirrored data
from cachedx import safe_select
results = safe_select("""
    SELECT sku, status, updated_at
    FROM forecasts
    WHERE status = 'failed'
      AND updated_at > now() - INTERVAL 1 DAY
    ORDER BY updated_at DESC
""")
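
A Mapping pairs column names with JSONPath-style selectors (or raw SQL expressions). As a rough illustration of the idea (a hypothetical sketch, not cachedx's internals), simple `$.field` selectors can be resolved against a JSON record like this:

```python
def apply_mapping(record: dict, columns: dict[str, str]) -> dict:
    """Resolve simple '$.field' selectors against one JSON record.

    Column specs that are SQL expressions (anything not starting with '$.')
    are left for the database layer and skipped here.
    """
    row = {}
    for col, spec in columns.items():
        if spec.startswith("$."):
            row[col] = record.get(spec[2:])  # '$.sku' -> record['sku']
    return row

record = {"id": "f1", "sku": "A-100", "method": "ets", "status": "failed"}
columns = {"id": "$.id", "sku": "$.sku", "status": "$.status"}
print(apply_mapping(record, columns))
# {'id': 'f1', 'sku': 'A-100', 'status': 'failed'}
```

Real JSONPath supports nesting and filters; the point here is only how a declarative column map turns API JSON into table rows.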

LLM Integration

from cachedx import build_llm_context, safe_llm_query

# Build context for LLM
context = build_llm_context(include_samples=True)
print(context)
# Output:
# # Database Schema and Context
# You have access to a DuckDB database with cached API responses.
# ## Available Tables (3 tables)
# ### Table: `users`
# **Columns:**
# - `id` (BIGINT, NOT NULL)
# - `name` (TEXT, NULL)
# - `email` (TEXT, NULL)
# **Sample data:**
# | id | name     | email           |
# |----|----------|-----------------|
# | 1  | Alice    | alice@example.com |

# Use with your favorite LLM
prompt = f"""
Generate a SQL query to find the top 10 most active users.

{context}
"""

# Execute LLM-generated queries safely
llm_sql = "SELECT name, COUNT(*) as activity FROM users GROUP BY name ORDER BY activity DESC LIMIT 10"
result = safe_llm_query(llm_sql)

if result["success"]:
    print(f"Found {result['row_count']} users")
    print(result["data"])  # pandas DataFrame or list of dicts
else:
    print(f"Query failed: {result['error']}")

Architecture

cachedx uses a dual storage architecture:

graph LR
    API[REST API] -->|JSON| CLIENT[CachedClient]

    CLIENT -->|Store| CACHE[(HTTP Cache<br/>TTL + ETag)]
    CLIENT -->|Mirror| TABLES[(Normalized Tables<br/>users, forecasts)]

    APP[Your App] -->|SQL| QUERY[Query Engine]
    QUERY --> CACHE
    QUERY --> TABLES

    LLM[LLM] -->|Safe SQL| QUERY

Benefits:

  • HTTP Cache: Fast response serving with TTL/ETag support
  • Normalized Tables: Structured data for complex queries and analytics
  • LLM Safety: Prevents dangerous operations, adds automatic LIMIT
  • Auto-Inference: Zero-config schema creation from JSON responses

Cache Strategies

| Strategy | Behavior | Use Case |
|----------|----------|----------|
| CACHED   | Cache with TTL, supports ETag revalidation | Most API endpoints |
| STATIC   | Cache forever, never expires | Metadata, configuration |
| REALTIME | Always fetch, but store for querying | Live data, real-time feeds |
| DISABLED | No caching | Debug, testing |
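
The strategies boil down to a per-request freshness decision. A speculative sketch of that decision (not the library's actual implementation):

```python
from datetime import datetime, timedelta, timezone
from enum import Enum

class CacheStrategy(Enum):
    CACHED = "cached"
    STATIC = "static"
    REALTIME = "realtime"
    DISABLED = "disabled"

def should_fetch(strategy: CacheStrategy, cached_at, ttl: timedelta) -> bool:
    """Decide whether a request must hit the network."""
    if strategy in (CacheStrategy.REALTIME, CacheStrategy.DISABLED):
        return True                       # never served from cache
    if cached_at is None:
        return True                       # nothing cached yet
    if strategy is CacheStrategy.STATIC:
        return False                      # cached forever
    # CACHED: fetch only once the entry is older than its TTL
    return datetime.now(timezone.utc) - cached_at > ttl

now = datetime.now(timezone.utc)
print(should_fetch(CacheStrategy.CACHED, now - timedelta(minutes=11), timedelta(minutes=10)))  # True
print(should_fetch(CacheStrategy.STATIC, now - timedelta(days=365), timedelta(minutes=10)))    # False
```

For an expired CACHED entry the library can additionally revalidate with an ETag (standard If-None-Match semantics) instead of re-downloading the body.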

Performance

| Operation | Latency | Notes |
|-----------|---------|-------|
| Cache Hit | < 1 ms | Served from DuckDB |
| Cache Miss | Network + 2 ms | Store + mirror overhead |
| SQL Query (1K rows) | 5-10 ms | DuckDB performance |
| Auto-inference | 2-5 ms | Schema creation |

Examples

The examples/ directory contains comprehensive demonstrations of cachedx functionality.

Running Examples

# Clone the repository
git clone https://github.com/awhereai/cachedx
cd cachedx

# Install dependencies
uv sync  # or pip install -e '.[dev]'

# Run individual examples
uv run python -m examples.simple_cache
uv run python -m examples.quickstart
uv run python -m examples.advanced_mirroring
uv run python -m examples.llm_safety_demo
uv run python -m examples.basic_demo

Example Descriptions

🚀 basic_demo.py - Core Features Walkthrough

What it does: Demonstrates all core cachedx features in one comprehensive example.

Features shown:

  • Automatic HTTP caching with GitHub API
  • View generation from cached JSON responses
  • SQL querying of cached data
  • LLM context generation for query assistance
  • Cache statistics and monitoring

Key takeaways: Perfect introduction to cachedx - shows HTTP caching, SQL queries, and LLM integration working together seamlessly.


simple_cache.py - Basic HTTP Caching

What it does: Minimal example showing basic HTTP caching functionality.

Features shown:

  • Drop-in replacement for httpx.AsyncClient
  • Automatic response caching and cache hits
  • SQL querying of cached data
  • Cache statistics

Key takeaways: Start here if you just need HTTP caching. Shows how cachedx works as a simple httpx wrapper.


📚 quickstart.py - Three-Part Comprehensive Demo

What it does: Structured walkthrough of HTTP caching, resource mirroring, and LLM helpers.

Features shown:

  • Part 1 - HTTP Cache: Basic caching with custom configurations
  • Part 2 - Mirror Demo: Automatic schema inference and data mirroring
  • Part 3 - LLM Helper: Safe query execution and context generation

Key takeaways: Best overview of all three layers working together. Great for understanding the full cachedx workflow.


🔧 advanced_mirroring.py - Schema Inference & Complex Mapping

What it does: Advanced resource mirroring with custom schemas and auto-inference.

Features shown:

  • Custom schema registration for GitHub repositories
  • Automatic mirroring with @hybrid_cache decorator
  • Auto-inference handling complex JSON with nested arrays
  • Advanced SQL queries on mirrored data
  • LLM context generation from multiple data sources

Key takeaways: For production usage with complex APIs. Shows both manual schema definition and auto-inference working with challenging data structures.


🛡️ llm_safety_demo.py - LLM Security Features

What it does: Comprehensive demonstration of SQL safety features for LLM integration.

Features shown:

  • Safe query execution (SELECT-only enforcement)
  • Dangerous keyword blocking (prevents DROP, DELETE, etc.)
  • Query validation and error handling
  • Automatic LIMIT injection for unbounded queries
  • Execution timing and metadata collection

Key takeaways: Essential for LLM applications. Shows how cachedx prevents SQL injection and dangerous operations while enabling powerful query capabilities.
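
The checks this demo exercises can be approximated in a few lines: accept only SELECT statements, reject mutating keywords, and inject a LIMIT when the query is unbounded. A deliberately simplified sketch (cachedx's real validator is presumably more thorough, e.g. it may parse the SQL rather than scan tokens):

```python
import re

BLOCKED = {"drop", "delete", "insert", "update", "alter", "create", "attach", "copy"}

def check_llm_sql(sql: str, default_limit: int = 100) -> str:
    """Validate an LLM-generated query; raise on unsafe SQL, inject a LIMIT."""
    stripped = sql.strip().rstrip(";")
    if not stripped.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    tokens = set(re.findall(r"[a-z_]+", stripped.lower()))
    bad = tokens & BLOCKED
    if bad:
        raise ValueError(f"blocked keywords: {sorted(bad)}")
    if "limit" not in tokens:
        stripped += f" LIMIT {default_limit}"  # bound unbounded result sets
    return stripped

print(check_llm_sql("SELECT name FROM users"))
# SELECT name FROM users LIMIT 100
```

Token scanning like this can false-positive on, say, a column literally named `delete`; that is the price of a fifteen-line sketch, and a reason real validators parse the statement.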

Real-World Use Cases

We've created two complete, runnable example applications that demonstrate cachedx in production-ready scenarios. Each app includes both backend (FastAPI + cachedx) and frontend (React) with full setup instructions.

🌐 Use Case 1: Data Dashboard UI App

Complete Example App: examples/dashboard-ui/

Scenario: Building a React dashboard that displays user analytics from your company's REST API, with intelligent caching, offline capability, and custom SQL queries.

Key Features Demonstrated:

  • ⚡ 50x faster loading (100ms vs 5+ seconds)
  • 🔄 Offline capability with cached data
  • 📊 Custom SQL queries from the frontend
  • 🛡️ SQL injection protection with safety layers
  • 🚀 Real-time updates with intelligent caching

Quick Start:

# Backend
cd examples/dashboard-ui/backend
uv sync
uv run python main.py

# Frontend (new terminal)
cd examples/dashboard-ui/frontend
npm install && npm start

Architecture Highlights:

  • Different caching strategies for different data types (30min for users, 10min for analytics, realtime for live metrics)
  • FastAPI endpoints with cachedx integration
  • React dashboard with SQL query builder
  • Automatic schema inference and data mirroring

🤖 Use Case 2: PydanticAI Support Agent

Complete Example App: examples/support-agent/

Scenario: Intelligent customer support agent using PydanticAI that accesses live company data through cachedx for accurate, context-aware responses.

Key Features Demonstrated:

  • 🧠 AI agent with real-time data access
  • ⚡ Sub-second responses with cached data
  • 🛡️ Safe operations (query-only, no data modification)
  • 📊 Rich context from multiple data sources
  • 🔄 Critical data updates every 30 seconds
  • 📈 Scales to thousands of concurrent users

Quick Start:

# Backend
cd examples/support-agent/backend
uv sync
export OPENAI_API_KEY="your-api-key"
uv run python main.py

# Frontend (new terminal)
cd examples/support-agent/frontend
npm install && npm start

Architecture Highlights:

  • PydanticAI agent with cachedx data access tools
  • Multi-API integration with smart caching (15min users, 2min orders, 30sec inventory)
  • Chat interface with confidence scoring and suggested actions
  • Automatic data mirroring and SQL context generation

Example Agent Conversations:

  • "What's the status of my recent orders?" → Agent queries orders table with user context
  • "Is the iPhone 15 Pro in stock?" → Agent checks real-time inventory with 30-second cache
  • "Show me my account information" → Agent retrieves user data with appropriate caching

Development

# Clone repository
git clone https://github.com/awhereai/cachedx
cd cachedx

# Install with uv (recommended)
uv sync
uv run python examples/quickstart.py

# Or with pip
pip install -e '.[dev]'
python examples/quickstart.py

# Run tests
uv run pytest
# or
pytest

# Type checking
uv run mypy cachedx
# or
mypy cachedx

# Linting
uv run ruff check cachedx
# or
ruff check cachedx

API Reference

Core Functions

  • safe_select(sql, params, limit) - Execute SELECT-only queries safely
  • build_llm_context() - Generate LLM context from available data
  • safe_llm_query(sql) - Execute LLM queries with validation and formatting

HTTP Cache Layer

  • CachedClient - Drop-in replacement for httpx.AsyncClient with caching
  • CacheConfig - Global cache configuration
  • EndpointConfig - Per-endpoint cache settings
  • CacheStrategy - Caching strategies (CACHED, STATIC, REALTIME, DISABLED)

Mirror Layer

  • @hybrid_cache(resource) - Decorator for automatic response mirroring
  • register(name, mapping) - Register explicit resource mapping
  • Mapping - Schema definition for JSON -> SQL transformation
  • infer_from_response(data, table) - Auto-infer mapping from JSON data
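
infer_from_response-style auto-inference amounts to mapping JSON value types onto SQL column types. A hypothetical sketch of that core idea (the actual function's behavior and signature may differ):

```python
def infer_columns(rows: list[dict]) -> dict[str, str]:
    """Guess a SQL type per key from a sample of JSON records."""
    type_map = {bool: "BOOLEAN", int: "BIGINT", float: "DOUBLE", str: "TEXT"}
    cols: dict[str, str] = {}
    for row in rows:
        for key, value in row.items():
            if value is None:
                cols.setdefault(key, "TEXT")   # unknown until a value shows up
            else:
                # nested lists/objects fall back to a JSON column
                cols[key] = type_map.get(type(value), "JSON")
    return cols

rows = [{"id": 1, "name": "Alice", "active": True, "score": 9.5, "tags": ["a"]}]
print(infer_columns(rows))
# {'id': 'BIGINT', 'name': 'TEXT', 'active': 'BOOLEAN', 'score': 'DOUBLE', 'tags': 'JSON'}
```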

FAQ

Q: Why Python 3.12+? A: Modern type hints, better performance, and improved error messages.

Q: Do I need to define schemas? A: No! Auto-inference works great for most cases. Use explicit schemas for fine control.

Q: How does this compare to Redis? A: cachedx stores structured, queryable data; Redis is a key-value store. They serve different use cases.

Q: Is it production ready? A: Yes! Comprehensive validation, type safety, and battle-tested architecture.

Q: Can I use it with my existing httpx code? A: Yes! CachedClient is a drop-in replacement for httpx.AsyncClient.

License

MIT License - see LICENSE file.

Contributing

Contributions welcome! Please read our contributing guidelines and submit pull requests.


Copyright © 2025 Weavers @ Eternal Loom. All rights reserved.

Download files


Source Distribution

cachedx-0.2.1.tar.gz (430.3 kB)

Built Distribution

cachedx-0.2.1-py3-none-any.whl (41.5 kB)

File details

Details for the file cachedx-0.2.1.tar.gz.

File metadata

  • Download URL: cachedx-0.2.1.tar.gz
  • Size: 430.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for cachedx-0.2.1.tar.gz

| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | 77397a082e3947a2f8b2cdbd3485280aac12a8bfe76415d5adb31a2094783747 |
| MD5 | f0be89ffe2cf35c1a879ab498a6d892a |
| BLAKE2b-256 | 9e2ef8a69cf3162ff0e76a8a280123e68ea25899fc936c21c10fcb0f8530c186 |

File details

Details for the file cachedx-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: cachedx-0.2.1-py3-none-any.whl
  • Size: 41.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for cachedx-0.2.1-py3-none-any.whl

| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | 76b08b92f7349e76f1f97c6eb83d0a807074ed22e434626e2a269a1108dee0d2 |
| MD5 | b5cea75e5068247273f6a854a17bf628 |
| BLAKE2b-256 | db24ee6aceea341ecff1cf3e40252ba17fe6ac546333f9dcea94172044a724ec |
