Unified Content Protocol - Python SDK for LLM content manipulation

ucp-content

Unified Content Protocol SDK for Python.

Build LLM-powered content manipulation with minimal code.

Installation

pip install ucp-content

Quick Start

import ucp

# 1. Parse markdown into a document
doc = ucp.parse("""
# My Article

This is the introduction.

## Section 1

Some content here.
""")

# 2. Create an ID mapper for token efficiency
mapper = ucp.map_ids(doc)

# 3. Get a compact document description for the LLM
description = mapper.describe(doc)
# Output:
# Document Structure:
#   [2] heading1 - My Article
#     [3] paragraph - This is the introduction.
#     [4] heading2 - Section 1
#       [5] paragraph - Some content here.

# 4. Build a prompt with only the capabilities you need
system_prompt = (ucp.prompt()
    .edit()
    .append()
    .with_short_ids()
    .build())

# 5. After LLM responds, expand short IDs back to full IDs
llm_response = 'EDIT 3 SET text = "Updated intro"'
expanded_ucl = mapper.expand(llm_response)
# Result: 'EDIT blk_000000000003 SET text = "Updated intro"'

API Reference

Document Operations

# Parse markdown
doc = ucp.parse('# Hello\n\nWorld')

# Render back to markdown
md = ucp.render(doc)

# Create empty document
doc = ucp.create()

Prompt Builder

Build prompts with only the capabilities your agent needs:

prompt = (ucp.prompt()
    .edit()           # Enable EDIT command
    .append()         # Enable APPEND command
    .move()           # Enable MOVE command
    .delete()         # Enable DELETE command
    .link()           # Enable LINK/UNLINK commands
    .snapshot()       # Enable SNAPSHOT commands
    .transaction()    # Enable ATOMIC transactions
    .all()            # Or enable every capability at once instead of the calls above
    .with_short_ids() # Use short numeric IDs
    .with_rule('Keep responses concise')
    .build())

ID Mapper

Save tokens by using short numeric IDs:

mapper = ucp.map_ids(doc)

# Shorten IDs in any text
short = mapper.shorten('Block blk_000000000003 has content')
# Result: 'Block 3 has content'

# Expand IDs in UCL commands
expanded = mapper.expand('EDIT 3 SET text = "hello"')
# Result: 'EDIT blk_000000000003 SET text = "hello"'

# Get document description with short IDs
desc = mapper.describe(doc)
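
To make the shorten/expand round trip concrete, here is a minimal, self-contained sketch of how such a mapper could work. It is not the SDK's implementation: the class name ShortIdMapper, the 1-based numbering, and the rule that numbers inside double-quoted strings are left alone are all assumptions made for illustration.

```python
import re


class ShortIdMapper:
    """Hypothetical stand-in for the SDK's mapper (illustration only)."""

    def __init__(self, block_ids):
        # Assumption: short IDs are 1-based and follow the order given.
        self._to_short = {bid: str(i) for i, bid in enumerate(block_ids, start=1)}
        self._to_long = {short: bid for bid, short in self._to_short.items()}

    def shorten(self, text):
        # Replace every known long ID with its short number.
        return re.sub(
            r"blk_[0-9a-f]{12}",
            lambda m: self._to_short.get(m.group(0), m.group(0)),
            text,
        )

    def expand(self, text):
        # Split on double-quoted strings so numbers inside literals stay untouched.
        parts = re.split(r'("(?:[^"\\]|\\.)*")', text)
        for i in range(0, len(parts), 2):  # even indices lie outside quotes
            parts[i] = re.sub(
                r"\b\d+\b",
                lambda m: self._to_long.get(m.group(0), m.group(0)),
                parts[i],
            )
        return "".join(parts)


sketch = ShortIdMapper(["blk_000000000000", "blk_a1c3b8f1d2e4", "blk_5f2e9c0d7b1a"])
print(sketch.shorten("Block blk_5f2e9c0d7b1a has content"))
# Block 3 has content
print(sketch.expand('EDIT 3 SET text = "Chapter 2"'))
# EDIT blk_5f2e9c0d7b1a SET text = "Chapter 2"
```

Skipping quoted strings matters because an LLM response often contains numbers in the new text itself, and those must not be rewritten into block IDs.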

UCL Builder

Build UCL commands programmatically:

commands = (ucp.ucl()
    .edit(3, 'Updated content')
    .append(2, 'New paragraph')
    .delete(5)
    .atomic()  # Wrap in ATOMIC block
    .build())

Token Efficiency

Short IDs cut the cost of each block reference from about 6 tokens to 1:

ID Format   Example            Tokens
Long        blk_000000000003   ~6
Short       3                  1

For a document with 50 blocks referenced 3 times each, that saves about 50 × 3 × 5 = 750 tokens.
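
The estimate can be reproduced with a few lines of arithmetic. The token counts below are the approximate figures from the table above, not measurements from a specific tokenizer, and the function name tokens_saved is only illustrative.

```python
LONG_ID_TOKENS = 6   # approximate cost of a long ID such as blk_000000000003
SHORT_ID_TOKENS = 1  # cost of a short numeric ID


def tokens_saved(blocks, references_per_block):
    """Estimate tokens saved by replacing every long-ID reference with a short one."""
    saved_per_reference = LONG_ID_TOKENS - SHORT_ID_TOKENS
    return blocks * references_per_block * saved_per_reference


print(tokens_saved(50, 3))  # 750
```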

Type Hints

Full type hint support:

from ucp import Document, Block, ContentType, SemanticRole, Capability

Deterministic Block IDs

Block identifiers now follow the canonical blk_XXXXXXXXXXXX pattern (root is always blk_000000000000). IDs are computed as:

  1. Normalize content to NFC.
  2. SHA-256 hash content || semantic_role || namespace.
  3. Take the first 12 hex characters and prefix with blk_.

This matches the reference implementation, so Python documents can round-trip with the Rust and TypeScript SDKs.

from ucp import Block, SemanticRole

heading = Block.new("Overview", role=SemanticRole.HEADING2)
print(heading.id)  # e.g. blk_a1c3b8f1d2e4
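
The three steps listed above can be sketched with the standard library. This is an approximation, not the SDK's code: the text does not say how the three fields are encoded or separated before hashing, so the sketch assumes UTF-8 and plain concatenation, and the function name sketch_block_id is hypothetical. Its output is therefore not guaranteed to match IDs produced by the SDK.

```python
import hashlib
import unicodedata


def sketch_block_id(content, semantic_role="", namespace=""):
    """Derive a blk_-prefixed ID from content, role, and namespace (illustrative)."""
    # Step 1: normalize content to NFC so equivalent Unicode forms hash alike.
    normalized = unicodedata.normalize("NFC", content)
    # Step 2: hash the concatenation.
    # Assumption: UTF-8 encoding and no separator between the fields.
    payload = (normalized + semantic_role + namespace).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    # Step 3: keep the first 12 hex characters and add the prefix.
    return "blk_" + digest[:12]


composed = sketch_block_id("caf\u00e9", "paragraph")      # precomposed e-acute
decomposed = sketch_block_id("cafe\u0301", "paragraph")   # e + combining accent
print(composed == decomposed)  # True
```

The NFC step is what makes the ID stable across platforms that produce different but canonically equivalent byte sequences for the same text.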

Error Handling

The SDK raises descriptive exceptions for invalid operations:

import ucp
from ucp import Document

doc = ucp.create()

# Block not found
try:
    doc.delete_block("blk_nonexistent")
except KeyError as e:
    print(e)  # "Block not found: blk_nonexistent"

# Cannot delete root
try:
    doc.delete_block(doc.root_id)
except ValueError as e:
    print(e)  # "Cannot delete the root block"

# Cannot move into self
try:
    block_id = doc.add_block(doc.root_id, "Test")
    doc.move_block(block_id, block_id)
except ValueError as e:
    print(e)  # "Cannot move a block into itself or its descendants"

Validation

result = doc.validate()

if not result.valid:
    for issue in result.issues:
        print(f"[{issue.severity}] {issue.code}: {issue.message}")
        # [error] E201: Document structure contains a cycle
        # [warning] E203: Block blk_123 is unreachable from root

Error Codes

Code   Severity   Description
E001   Error      Block not found
E201   Error      Cycle detected in document
E203   Warning    Orphaned/unreachable block
E400   Error      Block count limit exceeded
E402   Error      Block size limit exceeded
E403   Error      Nesting depth limit exceeded
E404   Error      Edge count limit exceeded
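
To illustrate what the structural checks behind E201 and E203 involve, here is a small sketch that works on a plain dictionary mapping each block ID to its child IDs. The data model and the function name find_issues are assumptions for illustration; the SDK's validate() may be implemented differently.

```python
def find_issues(children, root_id):
    """Return (code, message) pairs for unreachable blocks and cycles."""
    issues = []

    # E203: walk from the root and record every block that can be reached.
    reachable = set()
    stack = [root_id]
    while stack:
        block_id = stack.pop()
        if block_id in reachable:
            continue
        reachable.add(block_id)
        stack.extend(children.get(block_id, []))
    for block_id in sorted(set(children) - reachable):
        issues.append(("E203", f"Block {block_id} is unreachable from root"))

    # E201: depth-first search; meeting a block that is still being visited
    # means the structure loops back on itself.
    state = {}

    def has_cycle(block_id):
        if state.get(block_id) == "visiting":
            return True
        if state.get(block_id) == "done":
            return False
        state[block_id] = "visiting"
        found = any(has_cycle(child) for child in children.get(block_id, []))
        state[block_id] = "done"
        return found

    if any(has_cycle(block_id) for block_id in list(children)):
        issues.append(("E201", "Document structure contains a cycle"))
    return issues


print(find_issues({"root": ["a"], "a": [], "orphan": []}, "root"))
# [('E203', 'Block orphan is unreachable from root')]
```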

Observability

The SDK includes built-in observability features:

import ucp
from ucp import get_logger, on_event, trace, record_metric

# Logging
logger = get_logger()
logger.info("Starting document processing")

# Event subscription
@on_event("block_added")
def handle_block_added(event):
    print(f"Block {event.block_id} added")

# Tracing
markdown_content = "# Title\n\nBody text."  # sample input
with trace("parse_document") as span:
    doc = ucp.parse(markdown_content)
    span.set_attribute("block_count", len(doc.blocks))

# Metrics
record_metric("documents_parsed", 1)

Snapshots and Transactions

import ucp
from ucp import transaction

doc = ucp.create()

# Create a snapshot
snapshot_mgr = ucp.SnapshotManager()
snapshot_mgr.create("before_changes", doc)

# Use transactions for atomic operations
with transaction(doc) as txn:
    doc.add_block(doc.root_id, "Block 1")
    doc.add_block(doc.root_id, "Block 2")
    # Commits automatically on success
    # Rolls back on exception

# Restore from snapshot
restored_doc = snapshot_mgr.restore("before_changes")
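
The commit-or-roll-back behaviour described in the comments above can be sketched with a context manager that copies the state on entry and restores it if the body raises. This is a generic pattern shown on a plain dictionary, not the SDK's transaction implementation; sketch_transaction is a hypothetical name.

```python
import copy
from contextlib import contextmanager


@contextmanager
def sketch_transaction(doc):
    """Restore `doc` (a plain dict here) to its prior state if the body raises."""
    backup = copy.deepcopy(doc)
    try:
        yield doc
    except Exception:
        # Roll back in place so existing references to `doc` see the old state.
        doc.clear()
        doc.update(backup)
        raise


doc = {"blocks": ["root"]}

with sketch_transaction(doc):
    doc["blocks"].append("Block 1")  # kept: the body finished normally

try:
    with sketch_transaction(doc):
        doc["blocks"].append("Block 2")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass  # the append of "Block 2" was rolled back

print(doc["blocks"])  # ['root', 'Block 1']
```

A full deep copy is the simplest way to get rollback; for large documents an operation log that can be replayed in reverse would be cheaper.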

Async Support

For async applications:

import asyncio
import ucp

async def process_documents(large_markdown: str):
    # Most operations are synchronous, so run them in a worker thread
    doc = await asyncio.to_thread(ucp.parse, large_markdown)
    return doc
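
The same wrapping pattern extends to several documents at once. The sketch below runs without the SDK installed: slow_parse is a hypothetical stand-in for a blocking call such as ucp.parse, and asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio


def slow_parse(markdown):
    """Hypothetical stand-in for a blocking parser: returns the non-empty lines."""
    return [line for line in markdown.splitlines() if line.strip()]


async def parse_many(sources):
    # Each call runs in a worker thread, so the event loop stays responsive.
    tasks = (asyncio.to_thread(slow_parse, source) for source in sources)
    return await asyncio.gather(*tasks)


results = asyncio.run(parse_many(["# A\n\nOne", "# B\n\nTwo"]))
print(results)  # [['# A', 'One'], ['# B', 'Two']]
```

asyncio.gather returns results in the order the inputs were given, regardless of which thread finishes first.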

Conformance

This SDK implements the UCP specification. See docs/conformance/README.md for the full specification and test vectors.

Run the Python conformance suite with:

PYTHONPATH=src python3 -m pytest tests/conformance/test_conformance.py

All 26 reference tests pass against the current SDK.

UCL Execution Summary

execute_ucl now returns an ExecutionSummary that exposes the aggregated status and affected blocks:

import ucp

doc = ucp.create()
summary = ucp.execute_ucl(doc, 'APPEND blk_000000000000 text :: "Hello"')

if summary.success:
    print("Blocks touched:", summary.affected_blocks)

Individual ExecutionResult objects remain available via summary.results.
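
One way to picture the relationship between the summary and its individual results is the sketch below. Only the attribute names success, affected_blocks, and results come from the text above; the class names SketchResult and SketchSummary, the per-result fields, and the aggregation rules (all results must succeed, affected blocks are merged without duplicates) are assumptions, not the SDK's definitions.

```python
from dataclasses import dataclass, field


@dataclass
class SketchResult:
    """Hypothetical per-command result."""
    success: bool
    affected_blocks: list = field(default_factory=list)


@dataclass
class SketchSummary:
    """Hypothetical aggregate over per-command results."""
    results: list

    @property
    def success(self):
        # Assumption: the run succeeds only if every command succeeded.
        return all(result.success for result in self.results)

    @property
    def affected_blocks(self):
        # Merge block IDs in first-seen order, dropping duplicates.
        merged = []
        for result in self.results:
            for block_id in result.affected_blocks:
                if block_id not in merged:
                    merged.append(block_id)
        return merged


summary = SketchSummary([
    SketchResult(True, ["blk_000000000000", "blk_a1c3b8f1d2e4"]),
    SketchResult(True, ["blk_a1c3b8f1d2e4"]),
])
print(summary.success)          # True
print(summary.affected_blocks)  # ['blk_000000000000', 'blk_a1c3b8f1d2e4']
```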

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run conformance tests
pytest tests/conformance/

License

MIT

Download files

Source Distribution

ucp_content-0.1.3.tar.gz (36.7 kB)

Uploaded Source

Built Distribution

ucp_content-0.1.3-py3-none-any.whl (36.7 kB)

Uploaded Python 3

File details

Details for the file ucp_content-0.1.3.tar.gz.

File metadata

  • Download URL: ucp_content-0.1.3.tar.gz
  • Upload date:
  • Size: 36.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ucp_content-0.1.3.tar.gz
Algorithm Hash digest
SHA256 1428db5619f3fe9b8fb65fac977a1f73c8535cb76e0a3a94d6e8a553a67616a9
MD5 838c8d1674e61f205aa32a7610ba11c5
BLAKE2b-256 93b12b8614344121dc81b29e430d9b36ab42e46d0a7c7fffbee96b9732ccd001

File details

Details for the file ucp_content-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: ucp_content-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 36.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ucp_content-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 3d228309ff320a67036681c9950f8cd25d428d1e4a569f9ad1f73603e8738017
MD5 8ade9d5907213904bfced3ee4dbbc6d6
BLAKE2b-256 d0875941f6c51f2d08fa5b4fa5626bb360b45f7bb9ae63cfbd27222b3cea7408
