Single-Input-Reasoning: one LLM call, full action graph execution with evolutionary memory

SIR — Single-Input-Reasoning

One LLM call. Full action graph. Evolutionary memory.

Website | Installation | Quick Start | GitHub


SIR is a Python SDK that delegates complex multi-step tasks to an LLM with a single inference call. The LLM produces an entire Directed Acyclic Graph (DAG) of actions in one shot. SIR then executes it locally with parallelism, fan-out, retry, conditional branching, speculative execution, and DAG branching.

What Makes SIR Different

Feature                     | ReAct            | Plan & Execute   | Chain-of-Tools | SIR
----------------------------|------------------|------------------|----------------|-------------
LLM calls per task          | N (one per step) | 1 + N            | 1              | 1
Parallel execution          | No               | No               | No             | Full DAG
Adaptive tool selection     | Yes (slow)       | Yes (slow)       | No (hardcoded) | Yes (1 call)
Conditional branching       | Via LLM re-call  | Via LLM re-call  | No             | Local eval
Fan-out (map-reduce)        | Manual           | Manual           | No             | Built-in
Speculative execution       | No               | No               | No             | Yes
DAG branching (multi-path)  | No               | No               | No             | Yes
Post-LLM graph optimization | No               | No               | No             | Yes
Evolutionary memory         | No               | No               | No             | dags.bin
Token efficiency            | Low              | Low              | Medium         | Compressed
Cost                        | High (N calls)   | High (1 + N)     | Medium (1)     | Minimal (1)

Installation

pip install sir-sdk              # core only
pip install sir-sdk[ollama]      # + Ollama support
pip install sir-sdk[openai]      # + OpenAI support
pip install sir-sdk[claude]      # + Anthropic Claude support
pip install sir-sdk[gemini]      # + Google Gemini support
pip install sir-sdk[bedrock]     # + AWS Bedrock support
pip install sir-sdk[openrouter]  # + OpenRouter support
pip install sir-sdk[perplexity]  # + Perplexity support
pip install sir-sdk[mistral]     # + Mistral support
pip install sir-sdk[all]         # everything

Requires Python 3.10+.

Quick Start

import requests

from sir import SIR, tool

@tool
def search_web(query: str) -> str:
    """Search the web."""
    return requests.get(f"https://api.search.com?q={query}").text

@tool
def summarize(text: str) -> str:
    """Summarize text."""
    return text[:200] + "..."

@tool
def translate(text: str, lang: str) -> str:
    """Translate text."""
    return f"[{lang}] {text}"  # placeholder -- call a real translation API here

sir = SIR(model="qwen2.5:14b")
result = sir.run(
    "Search latest AI news, summarize, and translate to Italian",
    tools=[search_web, summarize, translate],
)
print(result.final_result)

One LLM call. Full DAG. Parallel execution. Result.

How It Works

sir.run(prompt, tools)
        |
        v
+----------------------------------+
| 1. Memory Lookup                 |  Semantic vector search in dags.bin
| 2. Prompt Compilation            |  Compressed tool schemas + memory
| 3. Single LLM Call               |  One inference -> full action graph
| 4. Graph Optimization            |  Dead-step elimination, dedup, dep relaxation
| 5. Parallel Graph Execution      |  Topological sort -> async + speculative
| 6. Evolutionary Scoring          |  Score steps, deprecate bad ones
| 7. Memory Persistence            |  Save to dags.bin
+----------------------------------+
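As an illustration of step 5, layered topological execution can be sketched as below. This is a minimal sketch, not SIR's implementation: it assumes each step is a dict with an `id` and an optional `depends_on` list, and that `execute` is a caller-supplied async callable.

```python
import asyncio

def layers(steps):
    """Group steps into dependency layers via a Kahn-style topological sort."""
    remaining = {s["id"]: set(s.get("depends_on", [])) for s in steps}
    done, result = set(), []
    while remaining:
        # A layer is every step whose dependencies are all satisfied.
        ready = [sid for sid, deps in remaining.items() if deps <= done]
        if not ready:
            raise ValueError("cycle detected: not a DAG")
        result.append(ready)
        done.update(ready)
        for sid in ready:
            del remaining[sid]
    return result

async def run_graph(steps, execute):
    by_id = {s["id"]: s for s in steps}
    results = {}
    for layer in layers(steps):
        # All steps in a layer are independent, so they run concurrently.
        outs = await asyncio.gather(*(execute(by_id[sid], results) for sid in layer))
        results.update(dict(zip(layer, outs)))
    return results
```

Each layer runs as one `asyncio.gather` batch; speculation (described under Architecture) relaxes this strict layer barrier.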

Benchmarks

Overview

SIR vs Chain-of-Tools

SIR adaptively selects only the tools needed. Chain-of-Tools uses a hardcoded pipeline with unnecessary steps.

Benchmarked across 5 complexity levels (L1: 2 tools, L5: 11 parallel steps) using the same LLM:

Metric              | SIR   | Chain-of-Tools
--------------------|-------|------------------
Avg Tool Efficiency | 100%  | 71%
Avg Step Efficiency | 94%   | 64%
Total Wasted Tools  | 0     | 4
Total Wasted Steps  | 0     | 13
Total Tokens        | 5,693 | 6,769 (SIR: -16%)
Total Wall Time     | 17s   | 28s (SIR: -40%)

Architecture

Graph Optimization (post-LLM)

After the LLM generates the DAG, SIR runs three compiler passes before execution:

  • Dead-step elimination -- removes steps whose output is never referenced
  • Duplicate merge -- merges steps calling the same tool with identical args
  • Dependency relaxation -- removes unnecessary dependencies to unlock parallelism
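For illustration, the first pass can be pictured as a reverse reachability walk from the final step. This is a hypothetical sketch; the `eliminate_dead_steps` name and step shape are assumptions, not SIR's API.

```python
def eliminate_dead_steps(steps, final_step):
    """Keep only steps reachable (via depends_on) from the final step."""
    by_id = {s["id"]: s for s in steps}
    keep, stack = set(), [final_step]
    while stack:
        sid = stack.pop()
        if sid in keep:
            continue
        keep.add(sid)
        stack.extend(by_id[sid].get("depends_on", []))
    # Preserve the original ordering among the surviving steps.
    return [s for s in steps if s["id"] in keep]
```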

Speculative Execution

While the current layer executes, SIR speculatively launches steps from the next layer if their dependencies are already available. This reduces total wall time on deep DAGs.
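In sketch form, speculation means peeking at the next layer before the current one finishes and starting any step whose dependencies are already complete. A simplified illustration (not SIR's implementation; `execute` is a caller-supplied async callable, and `deps` maps step ids to dependency lists):

```python
import asyncio

async def run_speculative(layers, deps, execute):
    """Run layers in order, speculatively launching next-layer steps early."""
    done_results, running = {}, {}
    for k, layer in enumerate(layers):
        for sid in layer:
            if sid not in running:
                running[sid] = asyncio.ensure_future(execute(sid, done_results))
        # Speculate: start next-layer steps whose deps are already done.
        for sid in (layers[k + 1] if k + 1 < len(layers) else []):
            if all(d in done_results for d in deps.get(sid, [])):
                running[sid] = asyncio.ensure_future(execute(sid, done_results))
        for sid in layer:
            done_results[sid] = await running.pop(sid)
    # Collect any stragglers that were launched speculatively.
    for sid, task in running.items():
        done_results[sid] = await task
    return done_results
```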

DAG Branching (Multi-Path)

Steps can define alternatives -- multiple tool strategies that race in parallel:

{
  "id": "s1",
  "tool": "search",
  "args": {"query": "AI news"},
  "alternatives": [{"tool": "fetch_details", "args": {"entity": "AI"}}],
  "select": "fastest"
}

Strategies: fastest (first to succeed wins), shortest, longest.
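The `fastest` strategy amounts to racing the primary tool against its alternatives and taking the first success. A minimal asyncio sketch (the `run_fastest` helper is illustrative, not part of SIR):

```python
import asyncio

async def run_fastest(coros):
    """Race alternative strategies; the first to finish successfully wins."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    try:
        while tasks:
            done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
            for t in done:
                if t.exception() is None:
                    return t.result()
            tasks = list(pending)
        raise RuntimeError("all alternatives failed")
    finally:
        # Cancel whatever is still in flight once a winner is picked.
        for t in tasks:
            t.cancel()
```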

Token Compression

SIR uses compressed JSON aliases to minimize token usage:

Full key   | Alias | Savings
-----------|-------|---------
tool       | t     | 3 tokens
args       | a     | 3 tokens
depends_on | d     | 9 tokens
condition  | c     | 8 tokens
foreach    | f     | 6 tokens
final_step | fs    | 9 tokens

The parser auto-expands aliases and is fully backward-compatible with full key names.
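The alias table maps directly to a dictionary; an expansion pass like the following sketch (the function name is an assumption) handles both compressed and full key names:

```python
ALIASES = {"t": "tool", "a": "args", "d": "depends_on",
           "c": "condition", "f": "foreach", "fs": "final_step"}

def expand_aliases(step):
    """Expand compressed keys; full key names pass through unchanged."""
    return {ALIASES.get(k, k): v for k, v in step.items()}
```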

Tool Modes

SIR gives developers control over how much autonomy the LLM has in selecting tools:

Mode               | Behavior                                                           | Use case
-------------------|--------------------------------------------------------------------|---------------------------------------
adaptive (default) | LLM picks the minimum set of tools needed                          | Generic prompts, many tools available
strict             | ALL passed tools must be used; LLM decides only order/parallelism  | Predictable pipelines
required           | Tools marked required=True are mandatory; others are optional      | Mix of fixed and flexible

# Adaptive -- LLM chooses
sir = SIR(tool_mode="adaptive")

# Strict -- all tools must be used
sir = SIR(tool_mode="strict")

# Required -- mark optional tools
@tool(required=False)
def cache(key: str, value: str) -> str: ...

sir = SIR(tool_mode="required")

Evolutionary Memory

SIR persists every executed action graph in a binary file (dags.bin) using msgpack with vector embeddings for semantic retrieval.

Run 1: LLM generates plan -> execute -> score -> store in dags.bin
Run 2: Load prior plan -> LLM sees scores/notes -> improves plan -> update
Run 3: Step X scored 2.1 -> DEPRECATED -> LLM replaces with better alternative
Run N: Converges to optimal action graph for this task

Each step stores:

  • score (0-10) -- exponential moving average
  • notes -- LLM annotations from previous runs
  • executions -- how many times it ran
  • deprecated -- true if score falls below threshold after 3 or more runs
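The scoring rule can be sketched as a standard exponential moving average plus a deprecation check. The `alpha` smoothing factor is an assumption; the 3.0 threshold and 3-run minimum follow the documented defaults:

```python
def update_score(step, new_score, alpha=0.3, threshold=3.0, min_runs=3):
    """Update a step's EMA score and deprecate persistently bad steps."""
    prev = step.get("score")
    step["score"] = new_score if prev is None else alpha * new_score + (1 - alpha) * prev
    step["executions"] = step.get("executions", 0) + 1
    if step["executions"] >= min_runs and step["score"] < threshold:
        step["deprecated"] = True
    return step
```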

Advanced Features

Conditional Branching

{"id":"s3","t":"notify","a":{"msg":"$s2.result"},"d":["s2"],
 "c":{"ref":"$s2.result","op":"contains","val":"error"}}
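A condition object like the one above can be evaluated locally, with no extra LLM call. A sketch assuming a small operator table (only `contains` appears in the example; the other operators here are illustrative):

```python
import operator

OPS = {
    "contains": lambda ref, val: val in ref,
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
}

def resolve_ref(ref, results):
    """Turn a reference like "$s2.result" into the stored output of step s2."""
    step_id = ref.lstrip("$").split(".")[0]
    return results[step_id]

def should_run(condition, results):
    """Evaluate a step's condition against prior step outputs."""
    if condition is None:
        return True
    ref = resolve_ref(condition["ref"], results)
    return OPS[condition["op"]](ref, condition["val"])
```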

Fan-out (Map-Reduce)

{"id":"s2","t":"process","a":{"item":"$item"},"d":["s1"],"f":"$s1.result"}

Supports both $sN.result references and inline arrays:

{"id":"s1","t":"search","a":{"query":"$item"},"f":["topic A","topic B"]}
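Fan-out expansion can be pictured as cloning the step once per item, substituting `$item` into the args. A hypothetical sketch (names assumed; keys are shown already expanded to their full forms):

```python
def expand_foreach(step, results):
    """Expand a foreach step into one clone per item.

    "foreach" may be an inline list or a "$sN.result" reference to a
    prior step whose output is a list.
    """
    items = step["foreach"]
    if isinstance(items, str) and items.startswith("$"):
        items = results[items.lstrip("$").split(".")[0]]
    clones = []
    for i, item in enumerate(items):
        args = {k: (item if v == "$item" else v) for k, v in step["args"].items()}
        clones.append({**step, "id": f'{step["id"]}.{i}', "args": args})
    return clones
```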

Retry Policy

{"id":"s1","t":"unreliable_api","a":{"url":"..."},"r":3}
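The `r` field maps naturally to a bounded retry loop. An illustrative sketch with exponential backoff (the backoff policy is an assumption; SIR's actual retry behavior may differ):

```python
import asyncio

async def run_with_retry(fn, retries, backoff=0.5):
    """Call an async tool, retrying up to `retries` extra times on failure."""
    for attempt in range(retries + 1):
        try:
            return await fn()
        except Exception:
            if attempt == retries:
                raise  # out of attempts -- surface the last error
            await asyncio.sleep(backoff * 2 ** attempt)
```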

Providers

All providers read API keys from environment variables by default. You can also pass them explicitly.

Ollama (default)

from sir.providers import OllamaProvider
sir = SIR(provider=OllamaProvider(model="qwen2.5:14b"))

OpenAI

from sir.providers import OpenAIProvider
sir = SIR(provider=OpenAIProvider(model="gpt-4o"))  # reads OPENAI_API_KEY

Claude (Anthropic)

from sir.providers import ClaudeProvider
sir = SIR(provider=ClaudeProvider(model="claude-sonnet-4-20250514"))  # reads ANTHROPIC_API_KEY

Gemini (Google)

from sir.providers import GeminiProvider
sir = SIR(provider=GeminiProvider(model="gemini-2.5-flash"))  # reads GEMINI_API_KEY

AWS Bedrock

from sir.providers import BedrockProvider
sir = SIR(provider=BedrockProvider(model="anthropic.claude-sonnet-4-20250514-v1:0"))  # reads AWS_REGION + AWS_BEARER_TOKEN_BEDROCK

OpenRouter

from sir.providers import OpenRouterProvider
sir = SIR(provider=OpenRouterProvider(model="openai/gpt-4o"))  # reads OPENROUTER_API_KEY

Perplexity

from sir.providers import PerplexityProvider
sir = SIR(provider=PerplexityProvider(model="sonar-pro"))  # reads PERPLEXITY_API_KEY

Mistral

from sir.providers import MistralProvider
sir = SIR(provider=MistralProvider(model="mistral-large-latest"))  # reads MISTRAL_API_KEY

Custom Provider

from sir.providers.llm import LLMProvider

class MyProvider(LLMProvider):
    async def generate(self, messages, **kwargs) -> str:
        return await my_custom_llm(messages)

Configuration

sir = SIR(
    provider=OllamaProvider(model="qwen2.5:14b"),
    memory_path="dags.bin",           # binary memory file
    enable_memory=True,               # toggle memory system
    enable_optimizer=True,            # toggle graph compression
    enable_speculation=True,          # toggle speculative execution
    tool_mode="adaptive",             # "adaptive" | "strict" | "required"
    deprecation_threshold=3.0,        # score below this -> deprecated
    similarity_threshold=0.78,        # semantic memory match threshold
    max_tokens=4096,                  # LLM output limit
    llm_retries=2,                    # retry on LLM/parse failure
)

CLI

sir run "Search AI news and summarize" -t tools.py
sir run "..." -t tools.py --stream     # live streaming
sir inspect                             # view evolutionary memory
sir clear                               # clear memory

DAG Visualization

The following diagram shows an example of a DAG generated by SIR from a single LLM call. Each node represents a tool invocation, and edges represent data dependencies between steps.

[Diagram: example action DAG generated by SIR]

For more details on architecture, benchmarks, and interactive examples, visit the SIR website.

License

AGPL-3.0 -- See LICENSE for details.
