SIR — Single-Input-Reasoning

One LLM call. Full action graph. Evolutionary memory.



SIR is a Python SDK that lets developers delegate complex multi-step tasks to an LLM with a single inference call. Instead of the traditional ReAct loop (think → act → observe → repeat), SIR asks the LLM to produce an entire Directed Acyclic Graph (DAG) of actions in one shot, then executes it locally with parallelism, fan-out, retry, conditional branching, speculative execution, and DAG branching.

What makes SIR different

| Feature | ReAct | Plan & Execute | Chain-of-Tools | SIR |
|---|---|---|---|---|
| LLM calls per task | N (one per step) | 1 + N | 1 | 1 |
| Parallel execution | ❌ | ❌ | ❌ | ✅ Full DAG |
| Adaptive tool selection | ✅ (slow) | ✅ (slow) | ❌ hardcoded | ✅ (1 call) |
| Conditional branching | Via LLM re-call | Via LLM re-call | ❌ | ✅ Local eval |
| Fan-out (map-reduce) | Manual | Manual | ❌ | ✅ Built-in |
| Speculative execution | ❌ | ❌ | ❌ | ✅ |
| DAG branching (multi-path) | ❌ | ❌ | ❌ | ✅ |
| Post-LLM graph optimization | ❌ | ❌ | ❌ | ✅ |
| Evolutionary memory | ❌ | ❌ | ❌ | ✅ dags.bin |
| Token efficiency | Low | Low | Medium | Compressed |
| Cost | High (N calls) | High (1+N) | Medium (1) | Minimal (1) |

Installation

pip install sir-sdk              # core only
pip install sir-sdk[ollama]      # + Ollama support
pip install sir-sdk[openai]      # + OpenAI support
pip install sir-sdk[claude]      # + Anthropic Claude support
pip install sir-sdk[gemini]      # + Google Gemini support
pip install sir-sdk[bedrock]     # + AWS Bedrock support
pip install sir-sdk[openrouter]  # + OpenRouter support
pip install sir-sdk[perplexity]  # + Perplexity support
pip install sir-sdk[mistral]     # + Mistral support
pip install sir-sdk[all]         # everything

Quick Start

import requests

from sir import SIR, tool

@tool
def search_web(query: str) -> str:
    """Search the web."""
    return requests.get(f"https://api.search.com?q={query}").text

@tool
def summarize(text: str) -> str:
    """Summarize text."""
    return text[:200] + "..."

@tool
def translate(text: str, lang: str) -> str:
    """Translate text."""
    ...  # call your translation backend here

sir = SIR(model="qwen2.5:14b")
result = sir.run(
    "Search latest AI news, summarize, and translate to Italian",
    tools=[search_web, summarize, translate],
)
print(result.final_result)

That's it. One LLM call → full DAG → parallel execution → result.

How It Works


sir.run(prompt, tools)
        │
        ▼
┌──────────────────────────────────┐
│ 1. Memory Lookup                 │ ← Semantic vector search in dags.bin
│ 2. Prompt Compilation            │ ← Compressed tool schemas + memory
│ 3. Single LLM Call               │ ← One inference → full action graph
│ 4. Graph Optimization            │ ← Dead-step elimination, dedup, dep relaxation
│ 5. Parallel Graph Execution      │ ← Topological sort → async + speculative
│ 6. Evolutionary Scoring          │ ← Score steps, deprecate bad ones
│ 7. Memory Persistence            │ ← Save to dags.bin
└──────────────────────────────────┘
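Step 5's scheduling can be sketched as a topological layering pass: group steps so that each layer depends only on earlier layers and can run fully in parallel. This is an illustrative Kahn-style sketch, not SIR's internal code:

```python
def layer_dag(steps):
    """Group steps into parallel layers: every step in a layer depends
    only on steps in earlier layers (Kahn-style topological sort)."""
    deps = {s["id"]: set(s.get("depends_on", [])) for s in steps}
    layers, done = [], set()
    while deps:
        ready = [sid for sid, d in deps.items() if d <= done]
        if not ready:
            raise ValueError("cycle detected in action graph")
        layers.append(sorted(ready))
        done.update(ready)
        for sid in ready:
            del deps[sid]
    return layers

steps = [
    {"id": "s1"},
    {"id": "s2", "depends_on": ["s1"]},
    {"id": "s3", "depends_on": ["s1"]},
    {"id": "s4", "depends_on": ["s2", "s3"]},
]
print(layer_dag(steps))  # [['s1'], ['s2', 's3'], ['s4']]
```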

Benchmarks

SIR vs Chain-of-Tools (Effectiveness)

SIR adaptively selects only the tools needed. Chain-of-Tools uses a hardcoded pipeline with unnecessary steps.


Benchmarked across 5 complexity levels (L1: 2 tools → L5: 11 parallel steps) using the same LLM:

| Metric | SIR | Chain-of-Tools |
|---|---|---|
| Avg Tool Efficiency | 100% | 71% |
| Avg Step Efficiency | 94% | 64% |
| Total Wasted Tools | 0 | 4 |
| Total Wasted Steps | 0 | 13 |
| Total Tokens | 5,693 (−16%) | 6,769 |
| Total Wall Time | 17s (−40%) | 28s |

Architecture

Graph Optimization (post-LLM)

After the LLM generates the DAG, SIR runs three compiler passes before execution:

  • Dead-step elimination — removes steps whose output is never referenced
  • Duplicate merge — merges steps calling the same tool with identical args
  • Dependency relaxation — removes unnecessary dependencies to unlock parallelism
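The dead-step pass can be illustrated as a reachability walk from the final step: anything not reachable through the dependency chain is dropped. A simplified sketch (a real pass would also track `$sN.result` references inside args):

```python
def eliminate_dead_steps(steps, final_step):
    """Keep only steps reachable via depends_on from the final step;
    steps whose output is never referenced are removed."""
    by_id = {s["id"]: s for s in steps}
    keep, stack = set(), [final_step]
    while stack:
        sid = stack.pop()
        if sid in keep:
            continue
        keep.add(sid)
        stack.extend(by_id[sid].get("depends_on", []))
    return [s for s in steps if s["id"] in keep]

steps = [
    {"id": "s1"},
    {"id": "s2", "depends_on": ["s1"]},
    {"id": "s3"},  # output never referenced -> dead
]
print([s["id"] for s in eliminate_dead_steps(steps, "s2")])  # ['s1', 's2']
```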

Speculative Execution

While the current layer executes, SIR speculatively launches steps from the next layer if their dependencies are already available. This reduces total wall time on deep DAGs.
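One way to get the same effect is dependency-driven scheduling: every step launches the moment its own dependencies finish, rather than waiting for the whole current layer. A minimal asyncio sketch (illustrative, not SIR's implementation):

```python
import asyncio

async def run_dag(steps, run_step):
    """Launch each step as soon as its dependencies complete, so steps
    from a 'later layer' can start while the current layer still runs."""
    done_events = {s["id"]: asyncio.Event() for s in steps}
    results = {}

    async def runner(step):
        for dep in step.get("depends_on", []):
            await done_events[dep].wait()
        results[step["id"]] = await run_step(step)
        done_events[step["id"]].set()

    await asyncio.gather(*(runner(s) for s in steps))
    return results

async def demo_step(step):
    await asyncio.sleep(0.01)  # stand-in for a real tool call
    return step["id"].upper()

steps = [{"id": "a"}, {"id": "b", "depends_on": ["a"]}, {"id": "c"}]
print(asyncio.run(run_dag(steps, demo_step)))
```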

DAG Branching (Multi-Path)

Steps can define alternatives — multiple tool strategies that race in parallel:

{
  "id": "s1",
  "tool": "search",
  "args": {"query": "AI news"},
  "alternatives": [{"tool": "fetch_details", "args": {"entity": "AI"}}],
  "select": "fastest"
}

Strategies: fastest (first to succeed wins), shortest, longest.
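The `fastest` strategy can be sketched with asyncio's first-completed wait: run the primary tool and its alternatives concurrently, take the first result, cancel the rest. An illustrative sketch, not SIR's internal code:

```python
import asyncio

async def race_fastest(coros):
    """Run alternatives concurrently; the first to finish wins,
    the remaining tasks are cancelled."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for t in pending:
        t.cancel()
    await asyncio.gather(*pending, return_exceptions=True)  # reap cancellations
    return done.pop().result()

async def slow():
    await asyncio.sleep(0.2)
    return "slow"

async def fast():
    await asyncio.sleep(0.01)
    return "fast"

print(asyncio.run(race_fastest([slow(), fast()])))  # fast
```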

Token Compression

SIR uses compressed JSON aliases to minimize token usage:

| Full key | Alias | Savings |
|---|---|---|
| tool | t | 3 tokens |
| args | a | 3 tokens |
| depends_on | d | 9 tokens |
| condition | c | 8 tokens |
| foreach | f | 6 tokens |
| final_step | fs | 9 tokens |

The parser auto-expands aliases and is fully backward-compatible with full key names.
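Expansion amounts to a recursive key rewrite over the parsed JSON. A sketch using the alias map from the table above (the recursion details are an assumption, not SIR's parser):

```python
ALIASES = {"t": "tool", "a": "args", "d": "depends_on",
           "c": "condition", "f": "foreach", "fs": "final_step"}

def expand_aliases(obj):
    """Recursively rewrite compressed keys to their full names;
    full key names pass through untouched (backward-compatible)."""
    if isinstance(obj, dict):
        return {ALIASES.get(k, k): expand_aliases(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [expand_aliases(v) for v in obj]
    return obj

compact = {"id": "s1", "t": "search", "a": {"query": "AI news"}, "d": []}
print(expand_aliases(compact))
```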

Tool Modes

SIR gives developers control over how much autonomy the LLM has in selecting tools:

| Mode | Behavior | Use case |
|---|---|---|
| adaptive (default) | LLM picks the minimum tools needed | Generic prompts, many tools available |
| strict | ALL tools passed must be used; LLM decides order/parallelism only | Predictable pipelines, no surprises |
| required | Tools marked required=True are mandatory, others optional | Mix of fixed + flexible |

# Adaptive — LLM chooses
sir = SIR(tool_mode="adaptive")

# Strict — all tools must be used
sir = SIR(tool_mode="strict")

# Required — mark optional tools
@tool(required=False)
def cache(key: str, value: str) -> str: ...

sir = SIR(tool_mode="required")

Evolutionary Memory (dags.bin)

SIR persists every executed action graph in a binary file using msgpack with vector embeddings for semantic retrieval.

Run 1: LLM generates plan → execute → score → store in dags.bin
Run 2: Load prior plan → LLM sees scores/notes → improves plan → update
Run 3: Step X scored 2.1 → DEPRECATED → LLM replaces with better alternative
Run N: Converges to optimal action graph for this task

Each step stores:

  • score (0-10) — exponential moving average
  • notes — LLM annotations from previous runs
  • executions — how many times it ran
  • deprecated — true if score < threshold after ≥3 runs
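The scoring update can be sketched as an exponential moving average with a deprecation check. The smoothing factor below is illustrative; the 3.0 threshold and ≥3-run minimum come from the defaults above:

```python
def update_score(old_score, new_score, executions,
                 alpha=0.3, threshold=3.0, min_runs=3):
    """EMA score update plus deprecation check (alpha is illustrative)."""
    executions += 1
    if executions == 1:
        score = new_score  # first run: take the raw score
    else:
        score = alpha * new_score + (1 - alpha) * old_score
    deprecated = executions >= min_runs and score < threshold
    return score, executions, deprecated

# A step that starts at 4.0 and keeps scoring 1.0 drifts below the
# threshold and gets deprecated on its third run.
score, runs, dep = 4.0, 1, False
for s in [1.0, 1.0]:
    score, runs, dep = update_score(score, s, runs)
print(round(score, 2), runs, dep)  # 2.47 3 True
```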

Advanced Features

Conditional Branching

{"id":"s3","t":"notify","a":{"msg":"$s2.result"},"d":["s2"],
 "c":{"ref":"$s2.result","op":"contains","val":"error"}}

Fan-out (Map-Reduce)

{"id":"s2","t":"process","a":{"item":"$item"},"d":["s1"],"f":"$s1.result"}

Supports both $sN.result references and inline arrays:

{"id":"s1","t":"search","a":{"query":"$item"},"f":["topic A","topic B"]}

Retry Policy

{"id":"s1","t":"unreliable_api","a":{"url":"..."},"r":3}

Providers

All providers read API keys from environment variables by default. You can also pass them explicitly.

Ollama (default)

from sir.providers import OllamaProvider
sir = SIR(provider=OllamaProvider(model="qwen2.5:14b"))

OpenAI

from sir.providers import OpenAIProvider
sir = SIR(provider=OpenAIProvider(model="gpt-4o"))  # reads OPENAI_API_KEY

Claude (Anthropic)

from sir.providers import ClaudeProvider
sir = SIR(provider=ClaudeProvider(model="claude-sonnet-4-20250514"))  # reads ANTHROPIC_API_KEY

Gemini (Google)

from sir.providers import GeminiProvider
sir = SIR(provider=GeminiProvider(model="gemini-2.5-flash"))  # reads GEMINI_API_KEY

AWS Bedrock

from sir.providers import BedrockProvider
sir = SIR(provider=BedrockProvider(model="anthropic.claude-sonnet-4-20250514-v1:0"))  # reads AWS_REGION + AWS_BEARER_TOKEN_BEDROCK

OpenRouter

from sir.providers import OpenRouterProvider
sir = SIR(provider=OpenRouterProvider(model="openai/gpt-4o"))  # reads OPENROUTER_API_KEY

Perplexity

from sir.providers import PerplexityProvider
sir = SIR(provider=PerplexityProvider(model="sonar-pro"))  # reads PERPLEXITY_API_KEY

Mistral

from sir.providers import MistralProvider
sir = SIR(provider=MistralProvider(model="mistral-large-latest"))  # reads MISTRAL_API_KEY

Custom Provider

from sir.providers.llm import LLMProvider

class MyProvider(LLMProvider):
    async def generate(self, messages, **kwargs) -> str:
        return await my_custom_llm(messages)

Configuration

sir = SIR(
    provider=OllamaProvider(model="qwen2.5:14b"),
    memory_path="dags.bin",           # binary memory file
    enable_memory=True,               # toggle memory system
    enable_optimizer=True,            # toggle graph compression
    enable_speculation=True,          # toggle speculative execution
    tool_mode="adaptive",             # "adaptive" | "strict" | "required"
    deprecation_threshold=3.0,        # score below this → deprecated
    similarity_threshold=0.78,        # semantic memory match threshold
    max_tokens=4096,                  # LLM output limit
    llm_retries=2,                    # retry on LLM/parse failure
)

CLI

sir run "Search AI news and summarize" -t tools.py
sir run "..." -t tools.py --stream     # live streaming
sir inspect                             # view evolutionary memory
sir clear                               # clear memory

License

AGPL-3.0 — See LICENSE for details.
