Make your AI agents leaner, faster, and cheaper — smart context management and token compression

agentslim 🪶

Make your AI agents leaner, faster, and cheaper.

agentslim is a Python toolkit with no required dependencies (tiktoken is an optional extra) that reduces token consumption in LLM-powered agents without sacrificing reasoning quality.

Why?

Every token counts — literally. When building agents you routinely waste tokens on:

| Problem | Typical waste |
|---|---|
| Verbose JSON tool schemas | 200–800 tokens per request |
| Raw HTML web scrapes fed to the LLM | 60–80% noise |
| Naively truncated chat history | Lost context, broken reasoning |
| Sending entire source files to coding agents | 10× more than needed |

agentslim addresses all four in a single package, one module per problem: ToolMinifier, Compressor, AgentMemory, and CodeContext.

Install

pip install agentslim

For accurate token counting (uses tiktoken under the hood):

pip install "agentslim[tiktoken]"

Quick Start

from agentslim import Compressor, AgentMemory, ToolMinifier, CodeContext

# 1 ── Compress any content before sending to LLM
c = Compressor()
slim = c.compress(raw_html_or_json_or_text)   # auto-detects format

# 2 ── Smart context window with auto-summarization
mem = AgentMemory(max_tokens=6000)
mem.add("user", "Build me a FastAPI app")
mem.add("assistant", "Sure! Here's the plan...")
messages = mem.get_messages()   # ready for openai.chat.completions.create()

# 3 ── Minify tool schemas
slim_tools = ToolMinifier.minify(my_tools)          # shorter descriptions
hint_str   = ToolMinifier.to_compact_str(my_tools)  # one-liner per tool

# 4 ── Send only the relevant code chunk, not the whole file
snippet = CodeContext.extract_function("app.py", "handle_request")
outline = CodeContext.outline("app.py")   # class/function map

Modules

🗜️ `Compressor` — Text / HTML / JSON compressor

Strips noise from content before it hits your LLM.

from agentslim import Compressor
from agentslim.compressor import CompressorConfig

# Defaults — safe for most use cases
c = Compressor()

# Fine-grained control
c = Compressor(config=CompressorConfig(
    strip_html=True,
    remove_decorative_html=True,   # drops <script>, <style>, <nav>, etc.
    collapse_whitespace=True,
    remove_filler_phrases=True,    # "Certainly! As an AI language model..."
    compact_json=True,
    remove_python_comments=False,  # keep comments by default
))

clean = c.compress(raw_content)          # auto-detects JSON / HTML / text
clean = c.compress_html(html_string)
clean = c.compress_json(json_string)
clean = c.compress_text(plain_text)
clean = c.compress_code(source, language="python")  # or "js" / "ts"
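
To make the HTML path concrete, here is a minimal standard-library sketch of the kind of cleanup described above: dropping `<script>`, `<style>`, and `<nav>` subtrees, stripping the remaining tags, and collapsing whitespace. It is an illustration only, not agentslim's implementation; `strip_html_noise`, `_TextExtractor`, and the tag sets are hypothetical names chosen for this example.

```python
import re
from html.parser import HTMLParser

# Tags whose entire subtree is treated as noise (an assumption for this sketch).
NOISE_TAGS = {"script", "style", "nav"}
# Inline tags that should not introduce a word break when removed.
INLINE_TAGS = {"a", "b", "i", "em", "strong", "span", "code"}


class _TextExtractor(HTMLParser):
    """Collects visible text while skipping anything inside NOISE_TAGS."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside a noise subtree

    def _boundary(self, tag):
        # Block-level tag boundaries become a space so words do not fuse.
        if self._skip_depth == 0 and tag not in INLINE_TAGS:
            self.parts.append(" ")

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self._skip_depth += 1
        else:
            self._boundary(tag)

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        else:
            self._boundary(tag)

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def strip_html_noise(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


page = (
    "<html><nav>Home | About</nav><script>var x = 1;</script>"
    "<p>Hello,   <b>world</b>!</p></html>"
)
print(strip_html_noise(page))  # Hello, world!
```

A real compressor would likely also handle entities, attributes worth keeping (such as link targets), and malformed markup; the sketch only shows why stripping structure before tokenization removes so much noise.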

Savings report:

from agentslim.utils import tokens_saved_report

report = tokens_saved_report(original, compressed, model="gpt-4o")
# {
#   'original_tokens': 1842,
#   'compressed_tokens': 612,
#   'tokens_saved': 1230,
#   'percent_saved': 66.8,
#   'cost_saved_usd': 0.003075
# }
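
The figures in the sample report follow from simple arithmetic, sketched below. The price of $2.50 per million input tokens is an assumption inferred from the sample output (1230 tokens saved at that rate gives 0.003075 USD); it is not a documented agentslim constant, and `savings_report` is a hypothetical name.

```python
def savings_report(original_tokens: int, compressed_tokens: int,
                   usd_per_million_input: float = 2.50) -> dict:
    """Recompute the report fields from two token counts and an assumed price."""
    saved = original_tokens - compressed_tokens
    percent = round(100 * saved / original_tokens, 1) if original_tokens else 0.0
    return {
        "original_tokens": original_tokens,
        "compressed_tokens": compressed_tokens,
        "tokens_saved": saved,
        "percent_saved": percent,
        "cost_saved_usd": round(saved * usd_per_million_input / 1_000_000, 6),
    }


report = savings_report(1842, 612)
print(report)
```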

🧠 `AgentMemory` — Smart sliding-window context manager

Instead of naively cutting old messages (which breaks reasoning), AgentMemory auto-summarizes the oldest messages into a compact system note.

from agentslim import AgentMemory

mem = AgentMemory(
    max_tokens=6000,       # soft limit on the active window
    archive_ratio=0.4,     # archive the oldest 40% when limit is hit
    summarize_fn=None,     # optional: plug in your LLM for better summaries
)

mem.add("system", "You are a helpful assistant.")
mem.add("user", "Hello!")
mem.add("assistant", "Hi! How can I help?")

messages = mem.get_messages()   # list[dict] — pass directly to any OpenAI-compatible API
print(mem.stats())
# MemoryStats(active_messages=3, archived=0, active_tokens=24, ...)
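
The sliding-window idea can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not agentslim's code: `SlidingMemory` and `rough_tokens` are hypothetical names, token counts use a crude 4-characters-per-token heuristic, and "summarizing" is just truncation of each archived message.

```python
from dataclasses import dataclass, field


def rough_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); an assumption for this sketch.
    return max(1, len(text) // 4)


@dataclass
class SlidingMemory:
    max_tokens: int = 6000
    archive_ratio: float = 0.4
    active: list = field(default_factory=list)
    summary_parts: list = field(default_factory=list)

    def _active_tokens(self) -> int:
        return sum(rough_tokens(m["content"]) for m in self.active)

    def add(self, role: str, content: str) -> None:
        self.active.append({"role": role, "content": content})
        # Archive the oldest share until the window fits (always keep one message).
        while self._active_tokens() > self.max_tokens and len(self.active) > 1:
            n = max(1, int(len(self.active) * self.archive_ratio))
            old, self.active = self.active[:n], self.active[n:]
            self.summary_parts.extend(
                f"{m['role']}: {m['content'][:40]}" for m in old
            )

    def get_messages(self) -> list:
        if not self.summary_parts:
            return list(self.active)
        note = {
            "role": "system",
            "content": "Earlier conversation: " + " | ".join(self.summary_parts),
        }
        return [note] + self.active


mem = SlidingMemory(max_tokens=20)
for i in range(5):
    mem.add("user", f"message number {i} with some padding text")
messages = mem.get_messages()
print(len(messages), messages[0]["role"])
```

Older turns are folded into a single system note instead of being dropped, so the model still sees that they happened.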

With a real LLM summarizer:

import openai

def gpt_summarize(messages):
    history = "\n".join(f"{m.role}: {m.content}" for m in messages)
    resp = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Summarize in 3 sentences."},
            {"role": "user",   "content": history},
        ],
    )
    return resp.choices[0].message.content

mem = AgentMemory(max_tokens=8000, summarize_fn=gpt_summarize)
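
For tests or offline runs, a deterministic stand-in for the LLM summarizer can be handy. The sketch below assumes only what the example above relies on: that `summarize_fn` receives objects with `.role` and `.content` attributes and returns a string. `Message` and `heuristic_summarize` are hypothetical names for illustration.

```python
from collections import namedtuple

# Stand-in for the message objects passed to summarize_fn (an assumption).
Message = namedtuple("Message", ["role", "content"])


def heuristic_summarize(messages, max_chars_per_message=60):
    """Deterministic summarizer: one truncated line per message."""
    lines = []
    for m in messages:
        text = " ".join(m.content.split())  # normalize whitespace
        if len(text) > max_chars_per_message:
            text = text[: max_chars_per_message - 1].rstrip() + "…"
        lines.append(f"{m.role}: {text}")
    return "\n".join(lines)


history = [
    Message("user", "Build me a FastAPI app"),
    Message("assistant", "Sure! Here's the plan..."),
]
print(heuristic_summarize(history))
```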

🛠️ `ToolMinifier` — Tool schema minifier

OpenAI function schemas are JSON-heavy. ToolMinifier cuts them down.

from agentslim import ToolMinifier

# Option A: minify but keep JSON format (for the API)
slim_tools = ToolMinifier.minify(tools, max_desc=80)

# Option B: ultra-compact one-liner hint for system prompts
print(ToolMinifier.to_compact_str(tools))
# get_weather(location:string, unit:string?) -> Any  # Get current weather…
# send_email(to:string, subject:string, body:string) -> Any

# Option C: auto-generate schemas from Python functions
def search(query: str, max_results: int) -> str:
    """Search the web for real-time info."""
    ...

tools = ToolMinifier.from_python_functions(search)
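
Schema generation of this kind can be approximated with `inspect` and `typing.get_type_hints`. The sketch below is one plausible approach and not necessarily how `from_python_functions` works; `schema_from_function` and the type mapping are hypothetical, and the default value on `max_results` is added only to show how optional parameters fall out of `required`.

```python
import inspect
from typing import get_type_hints

# Partial mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_function(fn) -> dict:
    """Build an OpenAI-style tool schema from a function signature."""
    hints = get_type_hints(fn)
    hints.pop("return", None)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string".
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def search(query: str, max_results: int = 5) -> str:
    """Search the web for real-time info."""
    ...


schema = schema_from_function(search)
print(schema["function"]["parameters"])
```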

| Format | Tokens (example) |
|---|---|
| Full verbose JSON | ~520 |
| `minify()` | ~310 |
| `to_compact_str()` | ~40 |
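
Most of the saving in the compact form comes from dropping JSON punctuation and nesting. A rough sketch of that transformation, mirroring the one-liner format shown earlier, might look like this. `compact_line` and `truncate` are hypothetical helpers, and the `-> Any` suffix simply imitates the sample output, since a JSON schema carries no return type.

```python
def truncate(text: str, limit: int) -> str:
    """Cut text to at most `limit` characters, marking the cut with an ellipsis."""
    return text if len(text) <= limit else text[: limit - 1].rstrip() + "…"


def compact_line(tool: dict, max_desc: int = 40) -> str:
    """Render an OpenAI-style tool schema as a single signature-like line."""
    fn = tool["function"]
    params = fn.get("parameters", {})
    required = set(params.get("required", []))
    args = ", ".join(
        f"{name}:{spec.get('type', 'any')}{'' if name in required else '?'}"
        for name, spec in params.get("properties", {}).items()
    )
    line = f"{fn['name']}({args}) -> Any"
    desc = fn.get("description", "")
    return f"{line}  # {truncate(desc, max_desc)}" if desc else line


weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    },
}
print(compact_line(weather_tool))
```

Note that the compact form drops details such as `enum` values, so it suits system-prompt hints better than the API's `tools` parameter.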

📄 `CodeContext` — Code-aware chunk extractor

Don't send 500-line files to your coding agent — send only what it needs.

from agentslim import CodeContext

# Extract a single function (+ N lines of context)
snippet = CodeContext.extract_function("app.py", "process_payment", context_lines=3)

# Extract a class skeleton (signatures only)
skeleton = CodeContext.extract_class("service.py", "PaymentService", methods_only=True)

# Outline: class/function map of the whole file
outline = CodeContext.outline("app.py")
# ['class PaymentService (L12)', 'def charge (L28)', 'def refund (L45)']

# Folded view: function bodies replaced with '...'
folded = CodeContext.folded("large_module.py")

# Extract specific line range
chunk = CodeContext.extract_lines("app.py", start_line=120, end_line=145, context_lines=5)

| View | Tokens saved |
|---|---|
| Full source | 0% |
| Folded | ~55% |
| Outline only | ~85% |
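
For Python sources, the outline and single-function views can be approximated with the standard `ast` module. The following is a sketch of the idea that operates on a source string rather than a file path; it is not agentslim's implementation, and both function names are hypothetical.

```python
import ast


def outline(source: str) -> list:
    """List classes and functions with their starting line numbers."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            entries.append((node.lineno, f"class {node.name} (L{node.lineno})"))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entries.append((node.lineno, f"def {node.name} (L{node.lineno})"))
    return [label for _, label in sorted(entries)]


def extract_function(source: str, name: str) -> str:
    """Return the source text of the first function called `name`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            return ast.get_source_segment(source, node)
    raise KeyError(name)


sample = '''class PaymentService:
    def charge(self, amount):
        return amount * 100

    def refund(self, amount):
        return -amount
'''

print(outline(sample))
print(extract_function(sample, "refund"))
```

Working from the syntax tree rather than regular expressions keeps nested definitions and multi-line signatures intact.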

📊 `utils` — Token counting & cost estimation

from agentslim.utils import count_tokens, estimate_cost

tokens = count_tokens("Hello, world!")  # uses tiktoken if available

cost = estimate_cost(input_tokens=1000, output_tokens=200, model="gpt-4o")
# {'input_usd': 0.0025, 'output_usd': 0.002, 'total_usd': 0.0045}

Supported models: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, claude-3-5-sonnet, claude-3-haiku, gemini-1.5-pro, gemini-1.5-flash.
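
The "uses tiktoken if available" behaviour and the cost arithmetic can be sketched as follows. The price table is an assumption chosen to reproduce the sample output above (USD per 1M tokens) and will drift from real pricing; the function bodies are illustrative, not agentslim's source.

```python
# Assumed prices in USD per 1M tokens; illustrative only.
PRICES = {"gpt-4o": {"input": 2.50, "output": 10.00}}


def heuristic_tokens(text: str) -> int:
    # Fallback estimate: roughly 4 characters per token for English text.
    return max(1, round(len(text) / 4)) if text else 0


def count_tokens(text: str, model: str = "gpt-4o") -> int:
    try:
        import tiktoken  # optional dependency

        return len(tiktoken.encoding_for_model(model).encode(text))
    except Exception:
        # tiktoken missing, model unknown, or encoding data unavailable.
        return heuristic_tokens(text)


def estimate_cost(input_tokens: int, output_tokens: int, model: str = "gpt-4o") -> dict:
    price = PRICES[model]
    input_usd = input_tokens * price["input"] / 1_000_000
    output_usd = output_tokens * price["output"] / 1_000_000
    return {
        "input_usd": round(input_usd, 6),
        "output_usd": round(output_usd, 6),
        "total_usd": round(input_usd + output_usd, 6),
    }


cost = estimate_cost(input_tokens=1000, output_tokens=200)
print(cost)
```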

Compatibility

agentslim is framework-agnostic. It works with anything that accepts a list of {"role": ..., "content": ...} dicts:

✅ OpenAI Python SDK
✅ LangChain / LangGraph
✅ LlamaIndex
✅ Anthropic SDK
✅ Google Generative AI SDK
✅ Any custom agent framework
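
Some SDKs need the message list reshaped slightly before use. For example, Anthropic's Messages API takes the system prompt as a separate top-level `system` parameter rather than as a `{"role": "system"}` entry. A small adapter along these lines would bridge the gap; `split_system` is a hypothetical helper, not part of agentslim.

```python
def split_system(messages):
    """Separate system messages from the user/assistant turns."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    return "\n\n".join(system_parts), turns


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help?"},
]
system_prompt, turns = split_system(messages)
# With the Anthropic SDK this would be passed roughly as:
#   client.messages.create(model=..., max_tokens=..., system=system_prompt, messages=turns)
print(system_prompt)
print(turns)
```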

Running tests

pip install -e ".[dev]"
pytest

Contributing

PRs and issues welcome! See CONTRIBUTING.md.

License

MIT © agentslim contributors

This version

0.1.0

Jun 6, 2026

Download files

Source Distribution

agentslim-0.1.0.tar.gz (23.2 kB)

Uploaded Jun 6, 2026 Source

Built Distribution

agentslim-0.1.0-py3-none-any.whl (18.3 kB)

Uploaded Jun 6, 2026 Python 3

File details

Details for the file agentslim-0.1.0.tar.gz.

File metadata

Download URL: agentslim-0.1.0.tar.gz
Upload date: Jun 6, 2026
Size: 23.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.5

File hashes

Hashes for agentslim-0.1.0.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | `8390ad7dc1c6659c601a3bc6f09efe805c6c4432a87eee9894158218c339d59c` |
| MD5 | `c13afed7053c2d09c141765392dad3e2` |
| BLAKE2b-256 | `6cb2a14ff51283f8c23ef8b1719df09f5fd4a56896e77cbc5475e86e3f88401c` |

File details

Details for the file agentslim-0.1.0-py3-none-any.whl.

File metadata

Download URL: agentslim-0.1.0-py3-none-any.whl
Upload date: Jun 6, 2026
Size: 18.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.5

File hashes

Hashes for agentslim-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | `6c58c1f014955ffe00f8a1bc9c645bf3522da2da509a311b7875ef680eea6657` |
| MD5 | `99c363b0ea5a699d87e11b8900277e93` |
| BLAKE2b-256 | `5918c30afbfbd00fa13a0fdd611f6c14ce46ace449f9da80c202b41c18f4eb35` |
