Chuk Artifacts

Asynchronous, multi-backend artifact storage with mandatory session-based security and grid architecture

Chuk Artifacts provides a production-ready, modular artifact storage system that works seamlessly across multiple storage backends (memory, filesystem, AWS S3, IBM Cloud Object Storage) with Redis or memory-based metadata caching and strict session-based security.

✨ Key Features

  • 🏗️ Modular Architecture: 5 specialized operation modules for clean separation of concerns
  • 🔒 Mandatory Session Security: Strict isolation with no anonymous artifacts or cross-session operations
  • 🌐 Grid Architecture: grid/{sandbox_id}/{session_id}/{artifact_id} paths for federation-ready organization
  • 🔄 Multi-Backend Support: Memory, filesystem, S3, IBM COS with seamless switching
  • ⚡ High Performance: Built with async/await for high throughput (3,000+ ops/sec)
  • 🔗 Presigned URLs: Secure, time-limited access without credential exposure
  • 📊 Batch Operations: Efficient multi-file uploads and processing
  • 🗃️ Metadata Caching: Fast lookups with Redis or memory-based sessions
  • 📁 Directory-Like Operations: Organize files with path-based prefixes
  • 🔧 Zero Configuration: Works out of the box with sensible defaults
  • 🌍 Production Ready: Battle-tested with comprehensive error handling

🚀 Quick Start

Installation

pip install chuk-artifacts
# or with uv
uv add chuk-artifacts

Basic Usage

import asyncio

from chuk_artifacts import ArtifactStore


async def main():
    # Zero-config setup (uses memory provider)
    async with ArtifactStore() as store:
        # Store an artifact (session auto-allocated)
        artifact_id = await store.store(
            data=b"Hello, world!",
            mime="text/plain",
            summary="A simple greeting",
            filename="hello.txt"
            # session_id auto-allocated if not provided
        )

        # Retrieve it
        data = await store.retrieve(artifact_id)
        print(data.decode())  # "Hello, world!"

        # Generate a presigned URL
        download_url = await store.presign_medium(artifact_id)  # 1 hour


# The snippets in this README use await, so run them inside an event loop
asyncio.run(main())

Session-Based File Management

from chuk_artifacts import ArtifactStore, ArtifactStoreError

async with ArtifactStore() as store:
    # Create files in user sessions
    doc_id = await store.write_file(
        content="# User's Document\n\nPrivate content here.",
        filename="docs/private.md",
        mime="text/markdown",
        session_id="user_alice"
    )
    
    # List files in a session
    files = await store.list_by_session("user_alice")
    print(f"Alice has {len(files)} files")
    
    # List directory-like contents
    docs = await store.get_directory_contents("user_alice", "docs/")
    print(f"Alice's docs: {len(docs)} files")
    
    # Copy within same session (allowed)
    backup_id = await store.copy_file(
        doc_id,
        new_filename="docs/private_backup.md"
    )
    
    # Cross-session operations are BLOCKED for security
    try:
        await store.copy_file(
            doc_id, 
            target_session_id="user_bob"  # This will fail
        )
    except ArtifactStoreError:
        print("✅ Cross-session operations blocked!")

Configuration

# Production setup with S3 and Redis
store = ArtifactStore(
    storage_provider="s3",
    session_provider="redis",
    bucket="my-artifacts"
)

# Or use environment variables
# ARTIFACT_PROVIDER=s3
# SESSION_PROVIDER=redis
# AWS_ACCESS_KEY_ID=your_key
# AWS_SECRET_ACCESS_KEY=your_secret
# ARTIFACT_BUCKET=my-artifacts

store = ArtifactStore()  # Auto-loads configuration

🏗️ Modular Architecture

Chuk Artifacts uses a clean modular architecture with specialized operation modules:

ArtifactStore (Main Coordinator)
├── CoreStorageOperations     # store() and retrieve()
├── MetadataOperations        # metadata, exists, delete, update, list operations
├── PresignedURLOperations    # URL generation and upload workflows
├── BatchOperations           # store_batch() for multiple files
└── AdminOperations           # validate_configuration, get_stats

Grid Architecture

All artifacts are organized using a consistent grid structure:

grid/{sandbox_id}/{session_id}/{artifact_id}

Benefits:

  • Federation Ready: Cross-sandbox discovery and routing
  • Session Isolation: Clear boundaries for security
  • Predictable Paths: Easy to understand and manage
  • Scalable: Handles multi-tenant applications
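
To make the grid layout concrete, here is a minimal sketch of building and parsing keys in the grid/{sandbox_id}/{session_id}/{artifact_id} shape. The helper names (build_grid_key, parse_grid_key, session_prefix) are hypothetical and written only for illustration; the library exposes its own get_canonical_prefix and generate_artifact_key methods, whose exact behavior may differ.

```python
from typing import NamedTuple

GRID_ROOT = "grid"


class GridKey(NamedTuple):
    """The three components encoded in a grid path."""
    sandbox_id: str
    session_id: str
    artifact_id: str


def build_grid_key(sandbox_id: str, session_id: str, artifact_id: str) -> str:
    """Join the components into grid/{sandbox_id}/{session_id}/{artifact_id}."""
    parts = (sandbox_id, session_id, artifact_id)
    for part in parts:
        # Reject empty components and embedded slashes so a key always has four segments.
        if not part or "/" in part:
            raise ValueError(f"invalid grid component: {part!r}")
    return "/".join((GRID_ROOT, *parts))


def parse_grid_key(key: str) -> GridKey:
    """Split a grid key back into its components."""
    parts = key.split("/")
    if len(parts) != 4 or parts[0] != GRID_ROOT or not all(parts):
        raise ValueError(f"not a grid key: {key!r}")
    return GridKey(*parts[1:])


def session_prefix(sandbox_id: str, session_id: str) -> str:
    """Prefix shared by every artifact key in one session (hypothetical helper)."""
    return f"{GRID_ROOT}/{sandbox_id}/{session_id}/"


key = build_grid_key("my-app", "user_alice", "abc123")
print(key)
print(parse_grid_key(key))
```

Because every key in a session shares one prefix, listing a session reduces to a prefix query on the storage backend, which is what makes the paths predictable.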

This design provides:

  • Better testability: Each module can be tested independently
  • Enhanced maintainability: Clear separation of concerns
  • Easy extensibility: Add new operation types without touching core
  • Improved debugging: Isolated functionality for easier troubleshooting

🔒 Session-Based Security

Mandatory Sessions

# Every artifact belongs to a session - no anonymous artifacts
artifact_id = await store.store(
    data=b"content",
    mime="text/plain",
    summary="description"
    # session_id auto-allocated if not provided
)

# Get the session it was allocated to
metadata = await store.metadata(artifact_id)
session_id = metadata["session_id"]

Strict Session Isolation

# Users can only access their own files
alice_files = await store.list_by_session("user_alice")
bob_files = await store.list_by_session("user_bob")

# Cross-session operations are blocked
await store.copy_file(alice_file_id, target_session_id="user_bob")  # ❌ Blocked
await store.move_file(alice_file_id, new_session_id="user_bob")     # ❌ Blocked
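
The isolation rule can be illustrated with a toy, synchronous, in-memory store. This is only a sketch of the idea under stated assumptions: ToySessionStore and CrossSessionError are hypothetical names, and the real library is asynchronous and raises ArtifactStoreError instead.

```python
import uuid


class CrossSessionError(Exception):
    """Hypothetical error raised when an operation would cross a session boundary."""


class ToySessionStore:
    """Toy in-memory store where every record is owned by exactly one session."""

    def __init__(self):
        # artifact_id -> {"session_id": ..., "filename": ..., "data": ...}
        self._records = {}

    def store(self, data: bytes, filename: str, session_id: str) -> str:
        artifact_id = uuid.uuid4().hex
        self._records[artifact_id] = {
            "session_id": session_id,
            "filename": filename,
            "data": data,
        }
        return artifact_id

    def copy_file(self, artifact_id: str, new_filename: str, target_session_id=None) -> str:
        source = self._records[artifact_id]
        # A copy may only land in the session that already owns the source.
        if target_session_id is not None and target_session_id != source["session_id"]:
            raise CrossSessionError(
                f"cannot copy from {source['session_id']!r} to {target_session_id!r}"
            )
        return self.store(source["data"], new_filename, source["session_id"])

    def list_by_session(self, session_id: str) -> list:
        return [
            record["filename"]
            for record in self._records.values()
            if record["session_id"] == session_id
        ]


toy = ToySessionStore()
doc_id = toy.store(b"private", "docs/private.md", "user_alice")
toy.copy_file(doc_id, "docs/private_backup.md")  # same session: allowed
try:
    toy.copy_file(doc_id, "docs/stolen.md", target_session_id="user_bob")
except CrossSessionError as exc:
    print(f"Blocked: {exc}")
```

The check compares the requested target against the owning session recorded at store time, so a caller cannot move data across the boundary by naming a different session.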

Multi-Tenant Safe

# Perfect for SaaS applications
company_a_files = await store.list_by_session("company_a")
company_b_files = await store.list_by_session("company_b")

# Companies cannot access each other's data
# Compliance-ready: GDPR, SOX, HIPAA

📦 Storage Providers

Memory Provider

store = ArtifactStore(storage_provider="memory")
  • Perfect for development and testing
  • Zero configuration required
  • Non-persistent (data lost on restart)
  • Note: Provider isolation limitations for testing

Filesystem Provider

import os

# Set the root directory before creating the store so the setting is in place
os.environ["ARTIFACT_FS_ROOT"] = "./my-artifacts"
store = ArtifactStore(storage_provider="filesystem")
  • Local disk storage
  • Persistent across restarts
  • file:// URLs for local access
  • Full session listing support
  • Great for development and staging

AWS S3 Provider

import os

# Configure via environment before creating the store
os.environ.update({
    "AWS_ACCESS_KEY_ID": "your_key",
    "AWS_SECRET_ACCESS_KEY": "your_secret",
    "AWS_REGION": "us-east-1",
    "ARTIFACT_BUCKET": "my-bucket"
})
store = ArtifactStore(storage_provider="s3")
  • Industry-standard cloud storage
  • Native presigned URL support
  • Highly scalable and durable
  • Full session listing support
  • Perfect for production workloads

IBM Cloud Object Storage

import os

# HMAC authentication (set the environment before creating the store)
os.environ.update({
    "AWS_ACCESS_KEY_ID": "your_hmac_key",
    "AWS_SECRET_ACCESS_KEY": "your_hmac_secret",
    "IBM_COS_ENDPOINT": "https://s3.us-south.cloud-object-storage.appdomain.cloud"
})
store = ArtifactStore(storage_provider="ibm_cos")

# IAM authentication
os.environ.update({
    "IBM_COS_APIKEY": "your_api_key",
    "IBM_COS_INSTANCE_CRN": "crn:v1:bluemix:public:cloud-object-storage:..."
})
store = ArtifactStore(storage_provider="ibm_cos_iam")

🗃️ Session Providers

Memory Sessions

store = ArtifactStore(session_provider="memory")
  • In-memory metadata storage
  • Fast but non-persistent
  • Perfect for testing

Redis Sessions

import os

# Set the Redis URL before creating the store
os.environ["SESSION_REDIS_URL"] = "redis://localhost:6379/0"
store = ArtifactStore(session_provider="redis")
  • Persistent metadata storage
  • Shared across multiple instances
  • Production-ready caching

🎯 Common Use Cases

MCP Server Integration

import base64

from chuk_artifacts import ArtifactStore

# Initialize for MCP server
store = ArtifactStore(
    storage_provider="filesystem",  # or "s3" for production
    session_provider="redis"
)

# MCP tool: Upload file
async def upload_file(data_base64: str, filename: str, mime: str, session_id: str):
    data = base64.b64decode(data_base64)
    artifact_id = await store.store(
        data=data,
        mime=mime,
        summary=f"Uploaded: {filename}",
        filename=filename,
        session_id=session_id  # Session isolation
    )
    return {"artifact_id": artifact_id}

# MCP tool: List session files
async def list_session_files(session_id: str, prefix: str = ""):
    files = await store.get_directory_contents(session_id, prefix)
    return {"files": files}

# MCP tool: Copy file (within session only)
async def copy_file(artifact_id: str, new_filename: str):
    new_id = await store.copy_file(artifact_id, new_filename=new_filename)
    return {"new_artifact_id": new_id}
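
The upload_file tool above expects its payload as base64 text. As a minimal sketch of the caller's side, the snippet below encodes raw bytes and builds an argument dictionary that mirrors the tool's parameters; encode_for_upload is a hypothetical helper and the sample values are illustrative only.

```python
import base64


def encode_for_upload(raw: bytes) -> str:
    """Encode raw bytes as ASCII base64 text suitable for a JSON payload."""
    return base64.b64encode(raw).decode("ascii")


# Arguments shaped like the upload_file tool's parameters
payload = {
    "data_base64": encode_for_upload(b"Hello, world!"),
    "filename": "hello.txt",
    "mime": "text/plain",
    "session_id": "user_alice",
}

# The tool reverses the encoding with base64.b64decode before storing the bytes
decoded = base64.b64decode(payload["data_base64"])
print(payload["data_base64"])
print(decoded)
```

Base64 adds roughly a third to the payload size, so very large files may be better served by the presigned upload workflow listed in the API reference.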

Web Framework Integration

from chuk_artifacts import ArtifactStore

# Initialize once at startup
store = ArtifactStore(
    storage_provider="s3",
    session_provider="redis"
)

async def upload_file(file_content: bytes, filename: str, content_type: str, user_id: str):
    """Handle file upload in FastAPI/Flask with user isolation"""
    artifact_id = await store.store(
        data=file_content,
        mime=content_type,
        summary=f"Uploaded: {filename}",
        filename=filename,
        session_id=f"user_{user_id}"  # User-specific session
    )
    
    # Return download URL
    download_url = await store.presign_medium(artifact_id)
    return {
        "artifact_id": artifact_id,
        "download_url": download_url
    }

async def list_user_files(user_id: str, directory: str = ""):
    """List files for a specific user"""
    return await store.get_directory_contents(f"user_{user_id}", directory)

Advanced File Operations

# Read file content directly
content = await store.read_file(artifact_id, as_text=True)
print(f"File content: {content}")

# Write file with content
new_id = await store.write_file(
    content="# New Document\n\nThis is a new file.",
    filename="documents/new_doc.md",
    mime="text/markdown",
    session_id="user_123"
)

# Move/rename file within session
await store.move_file(
    artifact_id,
    new_filename="documents/renamed_doc.md"
)

# Update metadata
await store.update_metadata(
    artifact_id,
    summary="Updated summary",
    meta={"version": 2, "updated_by": "user_123"}
)

# Extend TTL
await store.extend_ttl(artifact_id, additional_seconds=3600)

Batch Processing

# Prepare multiple files
items = [
    {
        "data": file1_content,
        "mime": "image/png",
        "summary": "Product image 1",
        "filename": "images/product1.png"
    },
    {
        "data": file2_content,
        "mime": "image/png", 
        "summary": "Product image 2",
        "filename": "images/product2.png"
    }
]

# Store all at once with session isolation
artifact_ids = await store.store_batch(items, session_id="product-catalog")
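
The items list does not have to be written by hand. The following sketch builds it from a local directory using only the standard library; build_batch_items is a hypothetical helper, and the summary text and the fallback MIME type are assumptions chosen for illustration.

```python
import mimetypes
from pathlib import Path


def build_batch_items(root: Path, pattern: str = "*") -> list[dict]:
    """Collect files under root into dicts shaped like the items shown above."""
    items = []
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue  # skip directories
        mime, _ = mimetypes.guess_type(path.name)
        items.append({
            "data": path.read_bytes(),
            # Fall back to a generic type when the extension is unknown
            "mime": mime or "application/octet-stream",
            "summary": f"Imported from {path.name}",
            # Keep the relative path so directory-like prefixes survive
            "filename": path.relative_to(root).as_posix(),
        })
    return items
```

The result can then be passed to store_batch as in the example above, for instance `await store.store_batch(build_batch_items(Path("./catalog")), session_id="product-catalog")`, where ./catalog is a placeholder directory.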

🧪 Testing

Run All Tests

# MCP server scenarios (recommended)
uv run examples/mcp_test_demo.py

# Session security testing
uv run examples/session_operations_demo.py

# Grid architecture demo
uv run examples/grid_demo.py

# Complete verification
uv run examples/complete_verification.py

Test Results

Recent test results show excellent performance:

📤 Test 1: Rapid file creation...
✅ Created 20 files in 0.006s (3,083 files/sec)

📋 Test 2: Session listing performance...
✅ Listed 20 files in 0.002s

📁 Test 3: Directory operations...
✅ Listed uploads/ directory (20 files) in 0.002s

📖 Test 4: Batch read operations...
✅ Read 10 files in 0.002s (4,693 reads/sec)

📋 Test 5: Copy operations...
✅ Copied 5 files in 0.003s (1,811 copies/sec)
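
A note on reading these figures: each rate is a count divided by an elapsed time, and the printed times are rounded to three decimals. Dividing by the rounded time therefore does not reproduce the printed rate (20 / 0.006 is about 3,333, not 3,083), which suggests the rates were computed from unrounded timings. The sketch below works backwards from the reported rates; both function names are hypothetical.

```python
def ops_per_second(count: int, elapsed_seconds: float) -> float:
    """Throughput as operations divided by elapsed time."""
    return count / elapsed_seconds


def implied_elapsed(count: int, rate_per_second: float) -> float:
    """Elapsed time implied by a reported count and rate."""
    return count / rate_per_second


# (operation count, reported rate) taken from the test output above
reported = {
    "create": (20, 3083),
    "read": (10, 4693),
    "copy": (5, 1811),
}

for name, (count, rate) in reported.items():
    elapsed = implied_elapsed(count, rate)
    # Rounding the implied time to three decimals should match the printed time
    print(f"{name}: {elapsed:.4f}s implied, {elapsed:.3f}s as displayed")
```

With samples of 5 to 20 operations lasting a few milliseconds, these numbers are best read as an order-of-magnitude indication rather than a steady-state benchmark.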

Development Setup

from chuk_artifacts.config import development_setup

store = development_setup()  # Uses memory providers

Testing Setup

from chuk_artifacts.config import testing_setup

store = testing_setup("./test-artifacts")  # Uses filesystem

🔧 Configuration

Environment Variables

# Storage configuration
ARTIFACT_PROVIDER=s3              # memory, filesystem, s3, ibm_cos, ibm_cos_iam
ARTIFACT_BUCKET=my-artifacts       # Bucket/container name
ARTIFACT_FS_ROOT=./artifacts       # Filesystem root (filesystem provider)
ARTIFACT_SANDBOX_ID=my-app         # Sandbox identifier for multi-tenancy

# Session configuration  
SESSION_PROVIDER=redis             # memory, redis
SESSION_REDIS_URL=redis://localhost:6379/0

# AWS/S3 configuration
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
S3_ENDPOINT_URL=https://custom-s3.com  # Optional: custom S3 endpoint

# IBM COS configuration
IBM_COS_ENDPOINT=https://s3.us-south.cloud-object-storage.appdomain.cloud
IBM_COS_APIKEY=your_api_key        # For IAM auth
IBM_COS_INSTANCE_CRN=crn:v1:...    # For IAM auth
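
As an illustration of how these variables map onto settings, here is a minimal sketch of an environment reader. It is not the library's loader: StoreSettings, load_settings, and missing_s3_variables are hypothetical names, the "memory" defaults are an assumption based on the zero-config behavior described earlier, and the list of variables treated as required for S3 is likewise an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StoreSettings:
    storage_provider: str
    session_provider: str
    bucket: Optional[str]
    sandbox_id: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> StoreSettings:
    """Read the variables listed above, defaulting to memory providers (assumed)."""
    return StoreSettings(
        storage_provider=env.get("ARTIFACT_PROVIDER", "memory"),
        session_provider=env.get("SESSION_PROVIDER", "memory"),
        bucket=env.get("ARTIFACT_BUCKET"),
        sandbox_id=env.get("ARTIFACT_SANDBOX_ID"),
    )


def missing_s3_variables(env: Mapping[str, str] = os.environ) -> list[str]:
    """Preflight check: which S3-related variables are unset or empty (assumed set)."""
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "ARTIFACT_BUCKET")
    return [name for name in required if not env.get(name)]


settings = load_settings({"ARTIFACT_PROVIDER": "s3", "ARTIFACT_BUCKET": "my-artifacts"})
print(settings)
print(missing_s3_variables({"ARTIFACT_BUCKET": "my-artifacts"}))
```

Passing the environment in as a mapping keeps the reader easy to test without mutating os.environ.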

Programmatic Configuration

from chuk_artifacts.config import configure_s3, configure_redis_session

# Configure S3 storage
configure_s3(
    access_key="AKIA...",
    secret_key="...",
    bucket="prod-artifacts",
    region="us-west-2"
)

# Configure Redis sessions
configure_redis_session("redis://prod-redis:6379/1")

# Create store with this configuration
store = ArtifactStore()

🚀 Performance

  • High Throughput: 3,000+ file operations per second in the demo runs shown under Test Results
  • Async/Await: Non-blocking I/O for high concurrency
  • Connection Pooling: Efficient resource usage with aioboto3
  • Metadata Caching: Sub-millisecond lookups with Redis
  • Batch Operations: Reduced overhead for multiple files
  • Grid Architecture: Optimized session-based queries

Performance Benchmarks

✅ File Creation: 3,083 files/sec
✅ File Reading: 4,693 reads/sec
✅ File Copying: 1,811 copies/sec
✅ Session Listing: ~0.002s for 20+ files
✅ Directory Listing: ~0.002s for filtered results

🔒 Security

  • Mandatory Sessions: No anonymous artifacts allowed
  • Session Isolation: Strict boundaries prevent cross-session access
  • No Cross-Session Operations: Copy, move, overwrite blocked across sessions
  • Grid Architecture: Clear audit trail in paths
  • Presigned URLs: Time-limited access without credential sharing
  • Secure Defaults: Conservative TTL and expiration settings
  • Credential Isolation: Environment-based configuration
  • Error Handling: No sensitive data in logs or exceptions

Security Validation

# All these operations are blocked for security
await store.copy_file(user_a_file, target_session_id="user_b")  # ❌ Blocked
await store.move_file(user_a_file, new_session_id="user_b")     # ❌ Blocked

# Security test results:
# ✅ Cross-session copy correctly blocked
# ✅ Cross-session move correctly blocked
# ✅ Cross-session overwrite correctly blocked
# 🛡️ ALL SECURITY TESTS PASSED!

📝 API Reference

Core Methods

store(data, *, mime, summary, meta=None, filename=None, session_id=None, user_id=None, ttl=900)

Store artifact data with metadata. Session auto-allocated if not provided.

retrieve(artifact_id)

Retrieve artifact data by ID.

metadata(artifact_id)

Get artifact metadata.

exists(artifact_id) / delete(artifact_id)

Check existence or delete artifacts.

Session Operations

create_session(user_id=None, ttl_hours=None)

Create a new session explicitly.

validate_session(session_id) / get_session_info(session_id)

Session validation and information retrieval.

list_by_session(session_id, limit=100)

List all artifacts in a session.

get_directory_contents(session_id, directory_prefix="", limit=100)

Get files in a directory-like structure.
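
"Directory-like" here means filtering by filename prefix; there are no real directories. A minimal sketch of that idea over a list of metadata dictionaries follows. The directory_contents function is hypothetical and only mirrors the parameter names above; the library's actual matching rules (for example, trailing-slash handling) may differ.

```python
def directory_contents(files: list, directory_prefix: str = "", limit: int = 100) -> list:
    """Return entries whose filename starts with the given directory prefix."""
    prefix = directory_prefix
    # Treat "docs" and "docs/" the same, so "docs" does not match "docs_old/..."
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    matches = [entry for entry in files if entry["filename"].startswith(prefix)]
    return matches[:limit]


session_files = [
    {"filename": "docs/private.md"},
    {"filename": "docs/private_backup.md"},
    {"filename": "docs_old/archive.md"},
    {"filename": "images/logo.png"},
]

for entry in directory_contents(session_files, "docs"):
    print(entry["filename"])
```

An empty prefix matches every file in the session, which is why the default behaves like a full listing capped at limit.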

File Operations

write_file(content, *, filename, mime="text/plain", session_id=None, ...)

Write content to new file.

read_file(artifact_id, *, encoding="utf-8", as_text=True)

Read file content directly as text or binary.

copy_file(artifact_id, *, new_filename=None, new_meta=None, summary=None)

Copy file within same session only (cross-session blocked).

move_file(artifact_id, *, new_filename=None, new_meta=None)

Move/rename file within same session only (cross-session blocked).

list_files(session_id, prefix="", limit=100)

List files with optional prefix filtering.

Metadata Operations

update_metadata(artifact_id, *, summary=None, meta=None, merge=True, **kwargs)

Update artifact metadata.

extend_ttl(artifact_id, additional_seconds)

Extend artifact TTL.

Presigned URLs

presign(artifact_id, expires=3600)

Generate presigned URL for download.

presign_short(artifact_id) / presign_medium(artifact_id) / presign_long(artifact_id)

Generate URLs with predefined durations (15min/1hr/24hr).
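
For reference, the three predefined durations expressed in seconds, with a small sketch for choosing the shortest one that covers a required lifetime. PRESIGN_DURATIONS, pick_duration, and expiry_time are hypothetical names used for illustration; only the 15min/1hr/24hr values come from the description above.

```python
from datetime import datetime, timedelta, timezone

# Durations of presign_short / presign_medium / presign_long, in seconds
PRESIGN_DURATIONS = {
    "short": 15 * 60,       # 15 minutes
    "medium": 60 * 60,      # 1 hour
    "long": 24 * 60 * 60,   # 24 hours
}


def pick_duration(needed_seconds: int):
    """Return the shortest predefined duration covering needed_seconds, or None."""
    for kind, seconds in sorted(PRESIGN_DURATIONS.items(), key=lambda item: item[1]):
        if needed_seconds <= seconds:
            return kind
    return None  # longer than 24 hours: pass an explicit expires value instead


def expiry_time(kind: str, now=None) -> datetime:
    """Timestamp at which a URL of the given kind would expire."""
    now = now or datetime.now(timezone.utc)
    return now + timedelta(seconds=PRESIGN_DURATIONS[kind])


print(pick_duration(20 * 60))
print(expiry_time("medium"))
```

Choosing the shortest sufficient lifetime limits how long a leaked URL remains usable.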

presign_upload(session_id=None, filename=None, mime_type="application/octet-stream", expires=3600)

Generate presigned URL for upload.

register_uploaded_artifact(artifact_id, *, mime, summary, ...)

Register metadata for presigned uploads.

Batch Operations

store_batch(items, session_id=None, ttl=900)

Store multiple artifacts efficiently.

Admin Operations

validate_configuration()

Validate storage and session provider connectivity.

get_stats()

Get storage statistics and configuration info.

Grid Operations

get_canonical_prefix(session_id)

Get grid path prefix for session.

generate_artifact_key(session_id, artifact_id)

Generate grid artifact key.

🛠️ Advanced Features

Error Handling

from chuk_artifacts import (
    ArtifactNotFoundError,
    ArtifactExpiredError, 
    ProviderError,
    SessionError,
    ArtifactStoreError  # For session security violations
)

try:
    await store.copy_file(artifact_id, target_session_id="other_session")
except ArtifactNotFoundError:
    print("Artifact not found or expired")
except ProviderError as e:
    print(f"Storage provider error: {e}")
except ArtifactStoreError as e:
    # Handled last: if ArtifactStoreError is a base class of the errors above,
    # catching it first would shadow the more specific handlers
    print(f"Security violation: {e}")

Validation and Monitoring

# Validate configuration
config_status = await store.validate_configuration()
print(f"Storage: {config_status['storage']['status']}")
print(f"Session: {config_status['session']['status']}")

# Get statistics
stats = await store.get_stats()
print(f"Provider: {stats['storage_provider']}")
print(f"Bucket: {stats['bucket']}")
print(f"Sandbox: {stats['sandbox_id']}")

Context Manager Usage

async with ArtifactStore() as store:
    artifact_id = await store.store(
        data=b"Temporary data",
        mime="text/plain",
        summary="Auto-cleanup example"
    )
    # Store automatically closed on exit

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Make your changes
  4. Run tests: uv run examples/mcp_test_demo.py
  5. Test session operations: uv run examples/session_operations_demo.py
  6. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🎯 Roadmap

  • [x] Session-based security with strict isolation
  • [x] Grid architecture with federation-ready paths
  • [x] Modular design with specialized operation modules
  • [x] High-performance operations (3,000+ ops/sec)
  • [x] Directory-like operations with prefix filtering
  • [x] Comprehensive testing with real-world scenarios
  • [ ] Azure Blob Storage provider
  • [ ] Google Cloud Storage provider
  • [ ] Encryption at rest
  • [ ] Artifact versioning
  • [ ] Webhook notifications
  • [ ] Prometheus metrics export
  • [ ] Federation implementation

Made with ❤️ for secure, scalable artifact storage

Production-ready artifact storage with mandatory session security and grid architecture
