slim-protocol
Python SDK for fetching web content in SLIM format - optimized for AI consumption with ~90% token reduction.
Features
- One-line usage - slim = fetch_slim(url)
- Sync + Async - Both sync and async APIs
- Full type hints - Complete type annotations
- Pydantic models - Validated response types
- AI integrations - LangChain and LlamaIndex support
- Python 3.9+ - Wide compatibility
Installation
pip install slim-protocol
# With LangChain integration (quotes keep shells such as zsh from expanding the brackets)
pip install "slim-protocol[langchain]"
# With LlamaIndex integration
pip install "slim-protocol[llamaindex]"
# All integrations
pip install "slim-protocol[all]"
Quick Start
from slim_protocol import fetch_slim
slim = fetch_slim("https://example.com")
# Access structured content
print(slim.payload.l1.title) # Page title
print(slim.payload.l1.type) # Content type (article, video, etc.)
print(slim.payload.l5.key_points) # Key points extracted
# Check compression metrics
print(slim.meta.tokens_estimate) # Estimated tokens
print(slim.meta.compression_ratio) # Compression achieved
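The fields above can be folded into a compact context string for a prompt. The following is a minimal sketch, not part of the SDK: `build_context` is a hypothetical helper that relies only on the attribute names used in the Quick Start, and it assumes `key_points` is a list of strings, which this README does not state. A stand-in object replaces a real response so the sketch runs offline.

```python
from types import SimpleNamespace


def build_context(slim, max_points=5):
    """Format a SLIM-shaped response as a short text block for an LLM prompt.

    Hypothetical helper; assumes key_points is a list of strings.
    """
    l1 = slim.payload.l1
    l5 = slim.payload.l5
    lines = [f"Title: {l1.title}", f"Type: {l1.type}"]
    # Keep only the first few key points to bound the prompt size.
    for point in list(l5.key_points or [])[:max_points]:
        lines.append(f"- {point}")
    return "\n".join(lines)


# Stand-in with the same attribute layout as the response shown above.
fake = SimpleNamespace(
    payload=SimpleNamespace(
        l1=SimpleNamespace(title="Example Domain", type="article"),
        l5=SimpleNamespace(key_points=["First point", "Second point"]),
    )
)
context = build_context(fake)
print(context)
```

With the real SDK, the result of `fetch_slim(url)` would be passed in place of `fake`, provided the response exposes the same attributes.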
Async Usage
import asyncio
from slim_protocol import async_fetch_slim

# await is only valid inside a coroutine, so wrap the call and run it
async def main():
    slim = await async_fetch_slim("https://example.com")
    print(slim.payload.l1.title)

asyncio.run(main())

# Parallel fetching
async def fetch_many(urls):
    tasks = [async_fetch_slim(url) for url in urls]
    return await asyncio.gather(*tasks)
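When the URL list is long, launching every request at once may overload the proxy. One possible approach, sketched here with the standard library only, is to cap concurrency with `asyncio.Semaphore` and to collect failures instead of aborting the whole batch. `fetch_many_bounded` and `fake_fetch` are hypothetical names; the fetch function is passed in as a parameter so the example runs without network access.

```python
import asyncio


async def fetch_many_bounded(urls, fetch, limit=5):
    """Fetch URLs concurrently, at most `limit` at a time.

    `fetch` is any coroutine function that takes a URL.
    Failures are returned in place as exception objects instead of raised.
    """
    semaphore = asyncio.Semaphore(limit)

    async def fetch_one(url):
        async with semaphore:
            return await fetch(url)

    return await asyncio.gather(
        *(fetch_one(url) for url in urls), return_exceptions=True
    )


# Offline stand-in for the real async fetch function.
async def fake_fetch(url):
    await asyncio.sleep(0)
    if "bad" in url:
        raise ValueError(url)
    return f"fetched {url}"


results = asyncio.run(fetch_many_bounded(["a", "bad", "c"], fake_fetch, limit=2))
print(results)
```

In real use, `async_fetch_slim` would be passed as `fetch`. Results keep the order of the input URLs, so each exception can be matched back to the URL that caused it.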
API
fetch_slim(url, **options)
Fetch web content in SLIM format (sync).
slim = fetch_slim(
    "https://example.com",
    proxy_url="https://my-proxy.com",  # Override proxy URL
    timeout=60,                        # Timeout in seconds (default: 30)
    include_images=True,               # Include image metadata (default: True)
    include_videos=True,               # Include video metadata (default: True)
)
async_fetch_slim(url, **options)
Fetch web content in SLIM format (async).
slim = await async_fetch_slim("https://example.com", timeout=60)
configure(**options)
Configure the SDK globally.
from slim_protocol import configure
configure(
    proxy_url="https://my-proxy.com",
    timeout=60,
    debug=True,
)
is_valid_slim_url(url)
Check if a URL is valid for fetching.
from slim_protocol import fetch_slim, is_valid_slim_url
if is_valid_slim_url(user_input):
    slim = fetch_slim(user_input)
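This README does not describe which rules `is_valid_slim_url` applies. For code paths where the SDK is not imported (form validation, for example), a rough stand-in can be written with `urllib.parse`. The function below, `looks_like_http_url`, is a hypothetical helper and makes no claim to match the SDK's own checks; it only accepts absolute `http` or `https` URLs that include a host.

```python
from urllib.parse import urlsplit


def looks_like_http_url(value):
    """Rough pre-check: absolute http(s) URL with a host. Not the SDK's logic."""
    if not isinstance(value, str):
        return False
    try:
        parts = urlsplit(value.strip())
    except ValueError:
        # urlsplit raises on malformed input such as an unclosed IPv6 bracket.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


candidates = ["https://example.com/article", "example.com", "ftp://example.com", None]
accepted = [c for c in candidates if looks_like_http_url(c)]
print(accepted)
```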
SLIM Pyramid Levels
The response contains hierarchical content levels:
| Level | Name | Contains |
|---|---|---|
| L1 | Identity | Title, type, author, description |
| L3 | Structure | Headings, sections, navigation |
| L5 | Insights | Key points, topics, entities |
| L7 | Full Content | Complete text content |
# L1: Always present - basic identification
slim.payload.l1.title
slim.payload.l1.type
slim.payload.l1.author
# L3: Document structure
slim.payload.l3.sections
slim.payload.l3.structure
# L5: Extracted insights
slim.payload.l5.key_points
slim.payload.l5.topics
slim.payload.l5.summary
# L7: Full content
slim.payload.l7.full_content
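Only L1 is described as always present, so code that reads deeper levels may need a fallback. The sketch below assumes that a missing level appears as `None` on the payload, which this README does not confirm. `available_levels` and `best_text` are hypothetical helpers that use only the attribute names listed above, and a stand-in payload keeps the example runnable offline.

```python
from types import SimpleNamespace

LEVELS = ("l1", "l3", "l5", "l7")


def available_levels(payload):
    """Return the names of the pyramid levels that are present (not None)."""
    return [name for name in LEVELS if getattr(payload, name, None) is not None]


def best_text(payload):
    """Return the most detailed text available, falling back level by level."""
    l7 = getattr(payload, "l7", None)
    if l7 is not None and getattr(l7, "full_content", None):
        return l7.full_content
    l5 = getattr(payload, "l5", None)
    if l5 is not None and getattr(l5, "summary", None):
        return l5.summary
    # L1 is documented as always present.
    return payload.l1.title


# Stand-in payload with L1 and L5 only.
partial = SimpleNamespace(
    l1=SimpleNamespace(title="Example Domain"),
    l3=None,
    l5=SimpleNamespace(summary="A short summary."),
    l7=None,
)
print(available_levels(partial))
print(best_text(partial))
```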
Error Handling
from slim_protocol import fetch_slim
from slim_protocol.exceptions import (
    SlimError,
    SlimInvalidUrlError,
    SlimProxyError,
    SlimTimeoutError,
    SlimNetworkError,
)
try:
    slim = fetch_slim(url)
except SlimInvalidUrlError as e:
    print(f"Invalid URL: {e}")
except SlimTimeoutError as e:
    print(f"Timeout: {e}")
except SlimProxyError as e:
    print(f"Proxy error ({e.status_code}): {e}")
except SlimNetworkError as e:
    print(f"Network error: {e}")
except SlimError as e:
    print(f"Generic error: {e}")
    if e.suggestion:
        print(f"Suggestion: {e.suggestion}")
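Timeouts and network errors are often transient, so a caller may want to retry them while letting other errors propagate. The following is a minimal sketch of retry with exponential backoff. `fetch_with_retry` is a hypothetical helper, not an SDK function; the fetch callable and the exception types are passed in, and the demo uses a fake exception class so it runs without the SDK.

```python
import time


def fetch_with_retry(fetch, url, retry_on, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url), retrying on `retry_on` exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fetch(url)
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * 2 ** attempt)


# Offline demo: a fetch function that fails twice, then succeeds.
class FakeTimeout(Exception):
    pass


calls = []
delays = []


def flaky_fetch(url):
    calls.append(url)
    if len(calls) < 3:
        raise FakeTimeout(url)
    return "ok"


result = fetch_with_retry(
    flaky_fetch, "https://example.com", retry_on=(FakeTimeout,), sleep=delays.append
)
print(result, delays)
```

With the SDK, the call might look like `fetch_with_retry(fetch_slim, url, retry_on=(SlimTimeoutError, SlimNetworkError))`. `SlimInvalidUrlError` is left out of `retry_on` on purpose, since retrying an invalid URL would not help.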
LangChain Integration
from slim_protocol.integrations.langchain import SlimLoader
# Load documents from URLs
loader = SlimLoader(urls=["https://example.com/article"])
documents = loader.load()
# Use in a chain
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI
# Create a vector store from SLIM documents
# ... your vector store setup ...
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    retriever=vectorstore.as_retriever(),
)
LlamaIndex Integration
from slim_protocol.integrations.llamaindex import SlimReader
from llama_index.core import VectorStoreIndex
# Load documents
reader = SlimReader()
documents = reader.load_data(urls=["https://example.com/article"])
# Create index
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What is this article about?")
Environment Variables
Configure the SDK via environment variables:
export SLIM_PROXY_URL="https://my-proxy.com"
export SLIM_TIMEOUT="60"
export SLIM_DEBUG="true"
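How the SDK parses these variables is not documented here. If an application needs to read them itself, for example to log the effective settings, a sketch along the following lines could work. `options_from_env` is a hypothetical helper, and the numeric and boolean conversions are assumptions. The returned keys mirror the keyword arguments shown for `configure()`.

```python
import os


def options_from_env(environ=None):
    """Collect SLIM_* variables into keyword arguments shaped like configure()'s.

    The float and boolean conversions are assumptions, not documented SDK behavior.
    """
    env = os.environ if environ is None else environ
    options = {}
    if env.get("SLIM_PROXY_URL"):
        options["proxy_url"] = env["SLIM_PROXY_URL"]
    if env.get("SLIM_TIMEOUT"):
        options["timeout"] = float(env["SLIM_TIMEOUT"])
    if env.get("SLIM_DEBUG"):
        options["debug"] = env["SLIM_DEBUG"].strip().lower() in ("1", "true", "yes")
    return options


# A plain dict stands in for os.environ so the demo is deterministic.
opts = options_from_env(
    {
        "SLIM_PROXY_URL": "https://my-proxy.com",
        "SLIM_TIMEOUT": "60",
        "SLIM_DEBUG": "true",
    }
)
print(opts)
```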
Type Hints
All types are exported for use in your code:
from slim_protocol import (
    SlimResponse,
    SlimPayload,
    SlimL1, SlimL3, SlimL5, SlimL7,
    SlimSource,
    SlimMeta,
    SlimConfig,
)
def process_slim(slim: SlimResponse) -> str:
    return slim.payload.l1.title
Requirements
- Python 3.9+
- httpx
- pydantic
License
MIT