SIR — Single-Input-Reasoning
One LLM call. Full action graph. Evolutionary memory.
Website | Installation | Quick Start | GitHub
SIR is a Python SDK that delegates complex multi-step tasks to an LLM with a single inference call. The LLM produces an entire Directed Acyclic Graph (DAG) of actions in one shot. SIR then executes it locally with parallelism, fan-out, retry, conditional branching, speculative execution, and DAG branching.
Table of Contents
- What Makes SIR Different
- Installation
- Quick Start
- How It Works
- Benchmarks
- Architecture
- Tool Modes
- Evolutionary Memory
- Advanced Features
- Providers
- Configuration
- CLI
- DAG Visualization
- License
What Makes SIR Different
| Feature | ReAct | Plan & Execute | Chain-of-Tools | SIR |
|---|---|---|---|---|
| LLM calls per task | N (one per step) | 1 + N | 1 | 1 |
| Parallel execution | No | No | No | Full DAG |
| Adaptive tool selection | Yes (slow) | Yes (slow) | No, hardcoded | Yes (1 call) |
| Conditional branching | Via LLM re-call | Via LLM re-call | No | Local eval |
| Fan-out (map-reduce) | Manual | Manual | No | Built-in |
| Speculative execution | No | No | No | Yes |
| DAG branching (multi-path) | No | No | No | Yes |
| Post-LLM graph optimization | No | No | No | Yes |
| Evolutionary memory | No | No | No | dags.bin |
| Token efficiency | Low | Low | Medium | Compressed |
| Cost | High (N calls) | High (1+N) | Medium (1) | Minimal (1) |
Installation
```bash
pip install sir-sdk              # core only
pip install sir-sdk[ollama]      # + Ollama support
pip install sir-sdk[openai]      # + OpenAI support
pip install sir-sdk[claude]      # + Anthropic Claude support
pip install sir-sdk[gemini]      # + Google Gemini support
pip install sir-sdk[bedrock]     # + AWS Bedrock support
pip install sir-sdk[openrouter]  # + OpenRouter support
pip install sir-sdk[perplexity]  # + Perplexity support
pip install sir-sdk[mistral]     # + Mistral support
pip install sir-sdk[all]         # everything
```
Requires Python 3.10+.
Quick Start
```python
import requests

from sir import SIR, tool

@tool
def search_web(query: str) -> str:
    """Search the web."""
    return requests.get(f"https://api.search.com?q={query}").text

@tool
def summarize(text: str) -> str:
    """Summarize text."""
    return text[:200] + "..."

@tool
def translate(text: str, lang: str) -> str:
    """Translate text."""
    return my_translate(text, lang)  # plug in your own translation backend

sir = SIR(model="qwen2.5:14b")
result = sir.run(
    "Search latest AI news, summarize, and translate to Italian",
    tools=[search_web, summarize, translate],
)
print(result.final_result)
```
One LLM call. Full DAG. Parallel execution. Result.
How It Works
```
sir.run(prompt, tools)
         |
         v
+-----------------------------+
| 1. Memory Lookup            |  Semantic vector search in dags.bin
| 2. Prompt Compilation       |  Compressed tool schemas + memory
| 3. Single LLM Call          |  One inference -> full action graph
| 4. Graph Optimization       |  Dead-step elimination, dedup, dep relaxation
| 5. Parallel Graph Execution |  Topological sort -> async + speculative
| 6. Evolutionary Scoring     |  Score steps, deprecate bad ones
| 7. Memory Persistence       |  Save to dags.bin
+-----------------------------+
```
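Stage 5 above can be sketched as a layered topological schedule: group steps into dependency layers, then run each layer's independent steps concurrently. This is a minimal illustration, not SIR's actual internals; the function names are made up for the sketch:

```python
import asyncio

def layers(steps):
    """Group step ids into layers; each layer depends only on earlier layers."""
    remaining = dict(steps)  # id -> set of dependency ids
    done, out = set(), []
    while remaining:
        ready = [s for s, deps in remaining.items() if deps <= done]
        if not ready:
            raise ValueError("cycle detected in DAG")
        out.append(ready)
        done.update(ready)
        for s in ready:
            del remaining[s]
    return out

async def run_dag(steps, run_step):
    """Execute layer by layer; steps within a layer run concurrently."""
    for layer in layers(steps):
        await asyncio.gather(*(run_step(s) for s in layer))
```

Speculation (see below) relaxes the strict layer boundary by launching next-layer steps whose inputs are already available.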
Benchmarks
Overview
SIR vs Chain-of-Tools
SIR adaptively selects only the tools needed. Chain-of-Tools uses a hardcoded pipeline with unnecessary steps.
Benchmarked across 5 complexity levels (L1: 2 tools, L5: 11 parallel steps) using the same LLM:
| Metric | SIR | Chain-of-Tools |
|---|---|---|
| Avg Tool Efficiency | 100% | 71% |
| Avg Step Efficiency | 94% | 64% |
| Total Wasted Tools | 0 | 4 |
| Total Wasted Steps | 0 | 13 |
| Total Tokens | 5,693 (16% fewer) | 6,769 |
| Total Wall Time | 17s (40% faster) | 28s |
Architecture
Graph Optimization (post-LLM)
After the LLM generates the DAG, SIR runs three compiler passes before execution:
- Dead-step elimination -- removes steps whose output is never referenced
- Duplicate merge -- merges steps calling the same tool with identical args
- Dependency relaxation -- removes unnecessary dependencies to unlock parallelism
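A minimal sketch of the first pass, dead-step elimination, assuming full (unaliased) keys and `$<id>.result` placeholders inside step args; the helper name is illustrative:

```python
import json

def eliminate_dead_steps(steps, final_id):
    """Keep only steps reachable (via refs or deps) from the final step."""
    by_id = {s["id"]: s for s in steps}
    keep, stack = set(), [final_id]
    while stack:
        sid = stack.pop()
        if sid in keep:
            continue
        keep.add(sid)
        # scan this step's args for $<id>.result references to other steps
        blob = json.dumps(by_id[sid].get("args", {}))
        for other in by_id:
            if f"${other}.result" in blob and other not in keep:
                stack.append(other)
        stack.extend(d for d in by_id[sid].get("depends_on", []) if d not in keep)
    return [s for s in steps if s["id"] in keep]
```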
Speculative Execution
While the current layer executes, SIR speculatively launches steps from the next layer if their dependencies are already available. This reduces total wall time on deep DAGs.
DAG Branching (Multi-Path)
Steps can define alternatives -- multiple tool strategies that race in parallel:
```json
{
  "id": "s1",
  "tool": "search",
  "args": {"query": "AI news"},
  "alternatives": [{"tool": "fetch_details", "args": {"entity": "AI"}}],
  "select": "fastest"
}
```
Strategies: `fastest` (first to succeed wins), `shortest`, `longest`.
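The `fastest` strategy can be sketched with `asyncio` tasks racing to completion; a failed alternative drops out of the race rather than ending it. This is illustrative only:

```python
import asyncio

async def race_fastest(coro_factories):
    """Run alternatives concurrently; return the first successful result."""
    tasks = [asyncio.ensure_future(f()) for f in coro_factories]
    try:
        for done in asyncio.as_completed(tasks):
            try:
                return await done  # first alternative to succeed wins
            except Exception:
                continue  # a failed alternative does not end the race
        raise RuntimeError("all alternatives failed")
    finally:
        for t in tasks:
            t.cancel()  # stop the losers
```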
Token Compression
SIR uses compressed JSON aliases to minimize token usage:
| Full key | Alias | Savings |
|---|---|---|
| `tool` | `t` | 3 tokens |
| `args` | `a` | 3 tokens |
| `depends_on` | `d` | 9 tokens |
| `condition` | `c` | 8 tokens |
| `foreach` | `f` | 6 tokens |
| `final_step` | `fs` | 9 tokens |
The parser auto-expands aliases and is fully backward-compatible with full key names.
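The auto-expansion can be sketched as a one-pass key rewrite over each step, using the alias table above; unknown (already-full) keys pass through untouched, which is what makes full key names backward-compatible. The function name is illustrative:

```python
ALIASES = {"t": "tool", "a": "args", "d": "depends_on",
           "c": "condition", "f": "foreach", "fs": "final_step"}

def expand_aliases(step: dict) -> dict:
    """Expand compressed keys; full key names pass through unchanged."""
    return {ALIASES.get(k, k): v for k, v in step.items()}
```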
Tool Modes
SIR gives developers control over how much autonomy the LLM has in selecting tools:
| Mode | Behavior | Use case |
|---|---|---|
| `adaptive` (default) | LLM picks the minimum tools needed | Generic prompts, many tools available |
| `strict` | ALL tools passed must be used; LLM decides order and parallelism only | Predictable pipelines |
| `required` | Tools marked `required=True` are mandatory, others optional | Mix of fixed and flexible |
```python
# Adaptive -- LLM chooses
sir = SIR(tool_mode="adaptive")

# Strict -- all tools must be used
sir = SIR(tool_mode="strict")

# Required -- mark optional tools
@tool(required=False)
def cache(key: str, value: str) -> str: ...

sir = SIR(tool_mode="required")
```
Evolutionary Memory
SIR persists every executed action graph in a binary file (dags.bin) using msgpack with vector embeddings for semantic retrieval.
```
Run 1: LLM generates plan -> execute -> score -> store in dags.bin
Run 2: Load prior plan -> LLM sees scores/notes -> improves plan -> update
Run 3: Step X scored 2.1 -> DEPRECATED -> LLM replaces with better alternative
Run N: Converges to optimal action graph for this task
```
Each step stores:
- `score` (0-10) -- exponential moving average
- `notes` -- LLM annotations from previous runs
- `executions` -- how many times it ran
- `deprecated` -- true if score falls below threshold after 3 or more runs
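The scoring update can be sketched as follows. The smoothing factor `alpha` is an assumption for illustration (SIR's actual constant is not documented here); the 3.0 threshold and 3-run minimum come from the defaults above:

```python
def update_step(step: dict, new_score: float,
                alpha: float = 0.5, threshold: float = 3.0) -> dict:
    """Blend the new score into the EMA and deprecate persistently bad steps."""
    prev = step.get("score")
    step["score"] = new_score if prev is None else alpha * new_score + (1 - alpha) * prev
    step["executions"] = step.get("executions", 0) + 1
    # only deprecate after enough evidence: 3+ runs with a score below threshold
    step["deprecated"] = step["executions"] >= 3 and step["score"] < threshold
    return step
```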
Advanced Features
Conditional Branching
```json
{"id": "s3", "t": "notify", "a": {"msg": "$s2.result"}, "d": ["s2"],
 "c": {"ref": "$s2.result", "op": "contains", "val": "error"}}
```
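The condition is evaluated locally, with no extra LLM call. A minimal evaluator might look like the sketch below; only `contains` appears in the example above, so the other operators are assumptions:

```python
def eval_condition(cond: dict, resolved_ref) -> bool:
    """Decide whether a conditional step runs, given the resolved $ref value."""
    op, val = cond["op"], cond.get("val")
    if op == "contains":
        return val in resolved_ref
    if op == "eq":
        return resolved_ref == val
    if op == "gt":
        return resolved_ref > val
    raise ValueError(f"unknown op: {op}")
```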
Fan-out (Map-Reduce)
```json
{"id": "s2", "t": "process", "a": {"item": "$item"}, "d": ["s1"], "f": "$s1.result"}
```
Supports both `$sN.result` references and inline arrays:
```json
{"id": "s1", "t": "search", "a": {"query": "$item"}, "f": ["topic A", "topic B"]}
```
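Conceptually, fan-out clones the templated step once per item, substituting `$item` into its args. A sketch with illustrative names (using full, unaliased keys):

```python
import copy

def expand_foreach(step: dict, items: list) -> list:
    """Clone the step once per item, substituting $item in its args."""
    out = []
    for i, item in enumerate(items):
        clone = copy.deepcopy(step)
        clone["id"] = f'{step["id"]}_{i}'
        clone["args"] = {k: (item if v == "$item" else v)
                         for k, v in step.get("args", {}).items()}
        out.append(clone)
    return out
```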
Retry Policy
```json
{"id": "s1", "t": "unreliable_api", "a": {"url": "..."}, "r": 3}
```
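The `r` field bounds local retries. A sketch of the retry loop; the exponential backoff between attempts is an assumption for illustration, not documented behavior:

```python
import time

def run_with_retries(fn, retries: int, base_delay: float = 0.1):
    """Call fn up to retries+1 times, backing off exponentially between attempts."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of attempts, surface the error
            time.sleep(base_delay * (2 ** attempt))
```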
Providers
All providers read API keys from environment variables by default. You can also pass them explicitly.
Ollama (default)
```python
from sir.providers import OllamaProvider
sir = SIR(provider=OllamaProvider(model="qwen2.5:14b"))
```
OpenAI
```python
from sir.providers import OpenAIProvider
sir = SIR(provider=OpenAIProvider(model="gpt-4o"))  # reads OPENAI_API_KEY
```
Claude (Anthropic)
```python
from sir.providers import ClaudeProvider
sir = SIR(provider=ClaudeProvider(model="claude-sonnet-4-20250514"))  # reads ANTHROPIC_API_KEY
```
Gemini (Google)
```python
from sir.providers import GeminiProvider
sir = SIR(provider=GeminiProvider(model="gemini-2.5-flash"))  # reads GEMINI_API_KEY
```
AWS Bedrock
```python
from sir.providers import BedrockProvider
sir = SIR(provider=BedrockProvider(model="anthropic.claude-sonnet-4-20250514-v1:0"))  # reads AWS_REGION + AWS_BEARER_TOKEN_BEDROCK
```
OpenRouter
```python
from sir.providers import OpenRouterProvider
sir = SIR(provider=OpenRouterProvider(model="openai/gpt-4o"))  # reads OPENROUTER_API_KEY
```
Perplexity
```python
from sir.providers import PerplexityProvider
sir = SIR(provider=PerplexityProvider(model="sonar-pro"))  # reads PERPLEXITY_API_KEY
```
Mistral
```python
from sir.providers import MistralProvider
sir = SIR(provider=MistralProvider(model="mistral-large-latest"))  # reads MISTRAL_API_KEY
```
Custom Provider
```python
from sir.providers.llm import LLMProvider

class MyProvider(LLMProvider):
    async def generate(self, messages, **kwargs) -> str:
        return await my_custom_llm(messages)
```
Configuration
```python
sir = SIR(
    provider=OllamaProvider(model="qwen2.5:14b"),
    memory_path="dags.bin",      # binary memory file
    enable_memory=True,          # toggle memory system
    enable_optimizer=True,       # toggle graph compression
    enable_speculation=True,     # toggle speculative execution
    tool_mode="adaptive",        # "adaptive" | "strict" | "required"
    deprecation_threshold=3.0,   # score below this -> deprecated
    similarity_threshold=0.78,   # semantic memory match threshold
    max_tokens=4096,             # LLM output limit
    llm_retries=2,               # retry on LLM/parse failure
)
```
CLI
```bash
sir run "Search AI news and summarize" -t tools.py
sir run "..." -t tools.py --stream   # live streaming
sir inspect                          # view evolutionary memory
sir clear                            # clear memory
```
DAG Visualization
The following diagram shows an example of a DAG generated by SIR from a single LLM call. Each node represents a tool invocation, and edges represent data dependencies between steps.
For more details on architecture, benchmarks, and interactive examples, visit the SIR website.
License
AGPL-3.0 -- See LICENSE for details.