slim-protocol

Python SDK for fetching web content in SLIM format - optimized for AI consumption with ~90% token reduction.

Features

  • One-line usage - slim = fetch_slim(url)
  • Sync + Async - Both sync and async APIs
  • Full type hints - Complete type annotations
  • Pydantic models - Validated response types
  • AI integrations - LangChain and LlamaIndex support
  • Python 3.9+ - Wide compatibility

Installation

pip install slim-protocol

# With LangChain integration (quotes keep shells such as zsh from expanding the brackets)
pip install "slim-protocol[langchain]"

# With LlamaIndex integration
pip install "slim-protocol[llamaindex]"

# All integrations
pip install "slim-protocol[all]"

Quick Start

from slim_protocol import fetch_slim

slim = fetch_slim("https://example.com")

# Access structured content
print(slim.payload.l1.title)      # Page title
print(slim.payload.l1.type)       # Content type (article, video, etc.)
print(slim.payload.l5.key_points) # Key points extracted

# Check compression metrics
print(slim.meta.tokens_estimate)     # Estimated tokens
print(slim.meta.compression_ratio)   # Compression achieved

Async Usage

import asyncio

from slim_protocol import async_fetch_slim

async def main():
    # await is only valid inside a coroutine
    slim = await async_fetch_slim("https://example.com")
    print(slim.payload.l1.title)

# Parallel fetching
async def fetch_many(urls):
    tasks = [async_fetch_slim(url) for url in urls]
    return await asyncio.gather(*tasks)

asyncio.run(main())
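
With a long URL list, an unbounded `asyncio.gather` may start every request at once. The following is a minimal sketch of capping concurrency with a semaphore. `fetch_many_bounded` and its `fetch` parameter are hypothetical names introduced here (pass `async_fetch_slim` as `fetch`), and the default limit of 5 is an arbitrary assumption, not a documented recommendation.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List


async def fetch_many_bounded(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[Any]],
    limit: int = 5,
) -> List[Any]:
    """Fetch all URLs with at most `limit` requests in flight.

    `fetch` is any coroutine function taking a URL, e.g. async_fetch_slim.
    Failures are returned in place as exception objects instead of raised,
    so one bad URL does not discard the other results.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url: str) -> Any:
        # Wait for a free slot before starting the request.
        async with semaphore:
            return await fetch(url)

    return await asyncio.gather(
        *(guarded(url) for url in urls), return_exceptions=True
    )
```

Results come back in the same order as the input URLs, so callers can zip them with the list and check each entry with `isinstance(result, Exception)`.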

API

fetch_slim(url, **options)

Fetch web content in SLIM format (sync).

slim = fetch_slim(
    "https://example.com",
    proxy_url="https://my-proxy.com",  # Override proxy URL
    timeout=60,                         # Timeout in seconds (default: 30)
    include_images=True,                # Include image metadata (default: True)
    include_videos=True,                # Include video metadata (default: True)
)

async_fetch_slim(url, **options)

Fetch web content in SLIM format (async).

# Call from inside an async function
slim = await async_fetch_slim("https://example.com", timeout=60)

configure(**options)

Configure the SDK globally.

from slim_protocol import configure

configure(
    proxy_url="https://my-proxy.com",
    timeout=60,
    debug=True,
)

is_valid_slim_url(url)

Check if a URL is valid for fetching.

from slim_protocol import fetch_slim, is_valid_slim_url

user_input = "https://example.com"  # e.g. a URL supplied by the user

if is_valid_slim_url(user_input):
    slim = fetch_slim(user_input)
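
The rules that `is_valid_slim_url` applies are not spelled out here. As a rough illustration of the kind of check involved, the sketch below uses only the standard library to accept `http`/`https` URLs with a non-empty host. `looks_like_http_url` is a hypothetical helper, not the SDK's own logic, and the SDK may accept or reject different inputs.

```python
from urllib.parse import urlsplit


def looks_like_http_url(candidate: str) -> bool:
    """Rough pre-check: http(s) scheme and a non-empty host.

    Hypothetical helper; not the library's own validation logic.
    """
    try:
        parts = urlsplit(candidate.strip())
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. an unclosed IPv6 bracket.
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)
```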

SLIM Pyramid Levels

The response contains hierarchical content levels:

Level  Name          Contains
L1     Identity      Title, type, author, description
L3     Structure     Headings, sections, navigation
L5     Insights      Key points, topics, entities
L7     Full Content  Complete text content

# L1: Always present - basic identification
slim.payload.l1.title
slim.payload.l1.type
slim.payload.l1.author

# L3: Document structure
slim.payload.l3.sections
slim.payload.l3.structure

# L5: Extracted insights
slim.payload.l5.key_points
slim.payload.l5.topics
slim.payload.l5.summary

# L7: Full content
slim.payload.l7.full_content
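
Since only L1 is described as always present, code that wants the richest available text may need a fallback order. The sketch below tries L7, then L5, then L1. It assumes an absent level or field appears as `None` or a missing attribute, which is not confirmed by the description above; `best_available_text` is a hypothetical helper, and the `SimpleNamespace` object only stands in for a real payload.

```python
from types import SimpleNamespace
from typing import Any, Optional


def best_available_text(payload: Any) -> Optional[str]:
    """Return the richest text found, trying L7, then L5, then L1.

    Assumes an absent level or field shows up as None or a missing attribute.
    """
    candidates = (("l7", "full_content"), ("l5", "summary"), ("l1", "title"))
    for level_name, field in candidates:
        level = getattr(payload, level_name, None)
        text = getattr(level, field, None)
        if text:
            return text
    return None


# Stand-in object for demonstration; a real payload comes from fetch_slim().
demo = SimpleNamespace(l1=SimpleNamespace(title="Example"), l5=None, l7=None)
print(best_available_text(demo))
```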

Error Handling

from slim_protocol import fetch_slim
from slim_protocol.exceptions import (
    SlimError,
    SlimInvalidUrlError,
    SlimProxyError,
    SlimTimeoutError,
    SlimNetworkError,
)

url = "https://example.com"

try:
    slim = fetch_slim(url)
except SlimInvalidUrlError as e:
    print(f"Invalid URL: {e}")
except SlimTimeoutError as e:
    print(f"Timeout: {e}")
except SlimProxyError as e:
    print(f"Proxy error ({e.status_code}): {e}")
except SlimNetworkError as e:
    print(f"Network error: {e}")
except SlimError as e:
    print(f"Generic error: {e}")
    if e.suggestion:
        print(f"Suggestion: {e.suggestion}")
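
Timeouts and network errors are often transient, so retrying them can be worthwhile, while an invalid URL will fail the same way every time. The sketch below shows one possible retry wrapper with exponential backoff. `call_with_retry` and all of its parameters are hypothetical, and the SDK may already retry internally; treat this as an illustration, not a recommended policy.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exception types with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the exceptions imported above, a call could look like `call_with_retry(lambda: fetch_slim(url), (SlimTimeoutError, SlimNetworkError))`. Exceptions outside `retry_on`, such as `SlimInvalidUrlError`, propagate immediately.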

LangChain Integration

from slim_protocol.integrations.langchain import SlimLoader

# Load documents from URLs
loader = SlimLoader(urls=["https://example.com/article"])
documents = loader.load()

# Use in a chain
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Create a vector store from SLIM documents
# ... your vector store setup ...

qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    retriever=vectorstore.as_retriever(),
)

LlamaIndex Integration

from slim_protocol.integrations.llamaindex import SlimReader
from llama_index.core import VectorStoreIndex

# Load documents
reader = SlimReader()
documents = reader.load_data(urls=["https://example.com/article"])

# Create index
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

response = query_engine.query("What is this article about?")

Environment Variables

Configure the SDK via environment variables:

export SLIM_PROXY_URL="https://my-proxy.com"
export SLIM_TIMEOUT="60"
export SLIM_DEBUG="true"
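
The SDK presumably reads these variables itself. To show how they relate to the `configure()` options, for example when logging or validating settings at startup, here is a sketch that maps them to keyword arguments. `read_slim_env` is a hypothetical helper, and the truthy spellings it accepts for `SLIM_DEBUG` are an assumption; the SDK's own parsing may differ.

```python
import os
from typing import Any, Dict, Mapping, Optional


def read_slim_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect SLIM_* variables into keyword arguments for configure().

    Only variables that are set and non-empty are returned. The accepted
    truthy spellings for SLIM_DEBUG are an assumption of this sketch.
    """
    env = os.environ if env is None else env
    options: Dict[str, Any] = {}
    if env.get("SLIM_PROXY_URL"):
        options["proxy_url"] = env["SLIM_PROXY_URL"]
    if env.get("SLIM_TIMEOUT"):
        # float() raises ValueError if the value is not numeric.
        options["timeout"] = float(env["SLIM_TIMEOUT"])
    if env.get("SLIM_DEBUG"):
        options["debug"] = env["SLIM_DEBUG"].strip().lower() in ("1", "true", "yes", "on")
    return options
```

The result can be passed on as `configure(**read_slim_env())`.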

Type Hints

All types are exported for use in your code:

from slim_protocol import (
    SlimResponse,
    SlimPayload,
    SlimL1, SlimL3, SlimL5, SlimL7,
    SlimSource,
    SlimMeta,
    SlimConfig,
)

def process_slim(slim: SlimResponse) -> str:
    return slim.payload.l1.title

Requirements

  • Python 3.9+
  • httpx
  • pydantic

License

MIT
