A Tiny VM for LLM-Powered Programs
Structured Skills (ss)
Structured Skills is a minimal stack-based virtual machine for orchestrating LLM-powered programs. It gives LLMs the equivalent of structured programming (loops, conditionals, variables, function calls) while keeping the model strictly out of control flow decisions.
The Core Insight
LLMs fail at long multi-step tasks not because they lack intelligence, but because they hold too much state and make too many implicit decisions. Structured Skills solves this by making the VM own all control flow. The LLM does only two things:
- Decode one line of "vibe" syntax into a machine opcode (via LLM or regex fallback).
- Execute bounded inference when explicitly asked via the infer keyword.
The result is a system where complex research, analysis, and automation tasks can be expressed in 10-20 lines of near-English code and executed reliably on small local models.
Architecture
The execution pipeline has three stages:
.ss file ──► Decoder ──► Opcodes ──► VM
 (vibe)    (regex+LLM)     (IR)  (executor)
1. Decoder (src/ss/decoder.py)
Reads a .ss file line by line and translates each line into one or more Opcode objects.
- Structural keywords (def, if, for, return, import, end, else:) are parsed via regex, which is deterministic and fast.
- "Vibe" lines (assignments, calls, inference) are sent to an LLM with the DECODER_PROMPT, which returns JSON opcodes. If the LLM is unavailable, the decoder falls back to regex.
The decoder collects import statements upfront to provide the LLM with context about available MCP tools.
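The two decoding paths described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/ss/decoder.py: the regex patterns, the dict-shaped opcodes, the decode_line name, and the llm callable are all assumptions made for the example.

```python
import json
import re

# Hypothetical patterns for structural keywords; the real decoder may differ.
STRUCTURAL = [
    (re.compile(r"^def ([\w-]+)((?:\s+\$\w+)*):$"),
     lambda m: {"op": "DEF", "name": m.group(1), "params": m.group(2).split()}),
    (re.compile(r"^for each (\$\w+) in (\$\w+):$"),
     lambda m: {"op": "LOOP", "item": m.group(1), "register": m.group(2)}),
    (re.compile(r"^if (.+):$"), lambda m: {"op": "IF", "condition": m.group(1)}),
    (re.compile(r"^else:$"), lambda m: {"op": "ELSE"}),
    (re.compile(r"^end$"), lambda m: {"op": "END"}),
    (re.compile(r"^return (.+)$"), lambda m: {"op": "RETURN", "value": m.group(1)}),
]

# Assumed regex fallback for the simplest vibe line, a plain assignment.
ASSIGN_FALLBACK = re.compile(r"^(\$\w+)\s*=\s*(.+)$")


def decode_line(line, llm=None):
    """Return a list of opcode dicts for one source line."""
    text = line.strip()
    # 1. Structural keywords: deterministic regex, no model involved.
    for pattern, build in STRUCTURAL:
        match = pattern.match(text)
        if match:
            return [build(match)]
    # 2. Vibe lines: ask the LLM for JSON opcodes when one is available.
    if llm is not None:
        try:
            return json.loads(llm(text))
        except Exception:
            pass  # treat any failure as "LLM unavailable"
    # 3. Regex fallback.
    match = ASSIGN_FALLBACK.match(text)
    if match:
        return [{"op": "ASSIGN", "register": match.group(1), "value": match.group(2)}]
    raise ValueError(f"cannot decode line: {text!r}")
```

Calling decode_line("end") never touches the model, while a line such as $x = 5 reaches the LLM only if one is passed in.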
2. Opcodes (src/ss/opcodes.py)
The instruction set is a compact IR of 13 opcodes:
| Opcode | Params | Purpose |
|---|---|---|
| ASSIGN | register, value | Store a value in a register |
| CALL | name, args, register? | Call a skill, built-in tool, or MCP tool |
| INFER | prompt, register? | Send a prompt to the LLM, store result |
| LOOP | item, register | Iterate over a list |
| END | none | End of a block (if/loop/def) |
| IF | condition | Conditional branch |
| ELSE | none | Else branch |
| JUMP | none | Unconditional jump (internal) |
| JUMPIF | none | Conditional jump (internal) |
| DEF | name, params | Define a skill (function) |
| RETURN | value | Return from a skill |
| IMPORT | name, source | Register an MCP server |
| HALT | none | Stop execution |
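One way to model the opcodes listed in the table is an enum plus a small record type. The OpcodeType name comes from the project layout; the Opcode dataclass and its params field are assumptions for this sketch, and the real definitions in src/ss/opcodes.py may look different.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class OpcodeType(Enum):
    ASSIGN = "ASSIGN"
    CALL = "CALL"
    INFER = "INFER"
    LOOP = "LOOP"
    END = "END"
    IF = "IF"
    ELSE = "ELSE"
    JUMP = "JUMP"
    JUMPIF = "JUMPIF"
    DEF = "DEF"
    RETURN = "RETURN"
    IMPORT = "IMPORT"
    HALT = "HALT"


@dataclass
class Opcode:
    """Hypothetical opcode record: a type plus its named parameters."""
    type: OpcodeType
    params: dict[str, Any] = field(default_factory=dict)


# A three-instruction program built by hand.
program = [
    Opcode(OpcodeType.ASSIGN, {"register": "$notes", "value": "[]"}),
    Opcode(OpcodeType.INFER, {"prompt": "summarize $notes", "register": "$summary"}),
    Opcode(OpcodeType.HALT),
]
```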
3. VM (src/ss/vm.py)
A register-based VM with four runtime structures:
- Registers ($var): Named blobs holding strings, numbers, lists, or JSON. All data lives here.
- Data stack: Used internally for expression evaluation.
- Call stack: Saves return IP and register snapshots when calling skills (def blocks), enabling nested calls and recursion.
- Loop stack: Tracks iteration state (current index, item list, item variable) for "for each" loops.
Program Loading (load_program)
Before execution, the VM makes a single pass to:
- Record skill definitions, mapping skill names to their (params, start_ip).
- Build jump targets, pairing IF/LOOP/ELSE with their matching END, and DEF with its END.
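The single loading pass can be sketched with a stack of open blocks. This is an illustration under assumptions: opcodes are plain dicts, the function returns (skills, targets) rather than mutating a VM, and an END also records a link back to its opener so a loop can cycle.

```python
def load_program(ops):
    """Return (skills, targets) for a list of opcode dicts."""
    skills = {}      # skill name -> (params, start_ip)
    targets = {}     # instruction index -> index it may jump to
    open_blocks = [] # indices of IF / LOOP / DEF / ELSE still waiting for END
    for ip, op in enumerate(ops):
        kind = op["op"]
        if kind == "DEF":
            skills[op["name"]] = (op["params"], ip + 1)  # body starts after DEF
            open_blocks.append(ip)
        elif kind in ("IF", "LOOP"):
            open_blocks.append(ip)
        elif kind == "ELSE":
            if_ip = open_blocks.pop()
            if ops[if_ip]["op"] != "IF":
                raise ValueError(f"ELSE at {ip} without matching IF")
            targets[if_ip] = ip      # a false IF jumps to its ELSE
            open_blocks.append(ip)   # the ELSE now waits for END
        elif kind == "END":
            if not open_blocks:
                raise ValueError(f"END at {ip} without an open block")
            start = open_blocks.pop()
            targets[start] = ip      # opener -> END
            targets[ip] = start      # END -> opener (used by loops)
    if open_blocks:
        raise ValueError(f"unclosed block opened at {open_blocks[-1]}")
    return skills, targets
```

Because the pairing is computed once up front, branching at run time is a dictionary lookup rather than a scan for the matching END.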
Execution Loop (run)
Walks self.ip through the opcode list, executing each opcode in sequence. The VM owns all control flow: the LLM never decides whether to branch or loop.
Key Instruction Behaviors
ASSIGN: Resolves the value through evaluate() โ dereferences $register references, parses JSON lists, strips quotes โ and stores it in the target register.
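A rough sketch of the three resolution steps named above. The function signature and the convention that register keys include the leading $ are assumptions; the real evaluate() in src/ss/vm.py may handle more cases.

```python
import json


def evaluate(value, registers):
    """Resolve a raw opcode value into a Python object."""
    if not isinstance(value, str):
        return value  # already a number, list, etc.
    text = value.strip()
    # 1. Dereference a $register reference.
    if text.startswith("$"):
        return registers[text]
    # 2. Parse JSON lists such as [] or ["a", "b"].
    if text.startswith("["):
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            return text  # not valid JSON, keep the raw string
    # 3. Strip one pair of matching surrounding quotes.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        return text[1:-1]
    return text
```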
CALL: Checks if the name matches a defined skill. If so, it pushes a frame onto the call stack (saving registers and return IP), maps arguments to skill parameters, and jumps to the skill body. Otherwise, it resolves the call as an MCP tool (if imported) or a built-in tool (read, write, append_to_file, append, add, sum, list_files).
INFER: Replaces $register references in the prompt with their current values, then calls the LLM (or a deterministic mock for testing). Stores the response in the target register.
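The substitution step for INFER might look like the sketch below. The names render_prompt, execute_infer, and mock_llm are hypothetical, as is the choice to serialize non-string values as JSON and to leave unknown references untouched.

```python
import json
import re

REGISTER_REF = re.compile(r"\$\w+")


def render_prompt(prompt, registers):
    """Replace each known $register reference with its current value."""
    def substitute(match):
        name = match.group(0)
        if name not in registers:
            return name  # unknown reference: leave it as written
        value = registers[name]
        return value if isinstance(value, str) else json.dumps(value)
    return REGISTER_REF.sub(substitute, prompt)


def execute_infer(prompt, register, registers, llm):
    """Run one bounded inference and store the response."""
    registers[register] = llm(render_prompt(prompt, registers))


def mock_llm(prompt):
    """Deterministic stand-in for a real model, useful in tests."""
    return f"[mock] {prompt}"
```

The model only ever sees the rendered prompt; it has no access to the registers or the instruction pointer.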
IF: Evaluates the condition. If falsy, jumps to the matching ELSE or END via the precomputed jump target.
LOOP: On first visit, evaluates the list and initializes a loop state on the loop stack. On subsequent visits, increments the index. When the list is exhausted, pops the loop state and jumps to the matching END.
END: If the matching block start is a LOOP, jumps back to it (creating the cycle). Otherwise, falls through.
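To make the LOOP/END cycle concrete, here is a small self-contained interpreter that handles only LOOP, END, and a CALL to append. It is a sketch under assumptions: opcodes are dicts, and an exhausted LOOP resumes at the instruction after its END so that the END cannot send control back into the finished loop; the real VM may resolve that differently.

```python
def run_loops(ops, registers):
    """Execute LOOP / END / CALL append opcodes against a register dict."""
    # Pair each LOOP with its END in one pass.
    targets, open_loops = {}, []
    for ip, op in enumerate(ops):
        if op["op"] == "LOOP":
            open_loops.append(ip)
        elif op["op"] == "END":
            start = open_loops.pop()
            targets[start], targets[ip] = ip, start

    loop_stack = []  # one entry per active loop: ip, items, index
    ip = 0
    while ip < len(ops):
        op = ops[ip]
        if op["op"] == "LOOP":
            if loop_stack and loop_stack[-1]["ip"] == ip:
                loop_stack[-1]["index"] += 1  # subsequent visit
            else:
                items = list(registers[op["register"]])  # first visit
                loop_stack.append({"ip": ip, "items": items, "index": 0})
            state = loop_stack[-1]
            if state["index"] >= len(state["items"]):
                loop_stack.pop()          # list exhausted
                ip = targets[ip] + 1      # resume after the matching END
                continue
            registers[op["item"]] = state["items"][state["index"]]
        elif op["op"] == "END":
            ip = targets[ip]              # jump back to the LOOP
            continue
        elif op["op"] == "CALL" and op["name"] == "append":
            target, source = op["args"]
            registers[target].append(registers[source])
        ip += 1
    return registers
```

Nested loops work because the inner loop pushes its own state on top of the outer one and pops it before control returns to the outer END.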
DEF: Skips the skill body by jumping to the matching END.
RETURN: Pops the call stack, restores the caller's registers, stores the return value in the target register, and jumps back to the caller.
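The CALL and RETURN behavior for skills can be sketched as follows. Everything here is illustrative: the dict-shaped opcodes, the frame layout, and the decision to let a skill body see the caller's registers (with parameters bound on top) are assumptions, and a skill that reaches its END without a RETURN is not handled.

```python
def run(ops, skills, targets, registers):
    """Minimal interpreter for ASSIGN, DEF, CALL (skills only), RETURN, HALT."""
    def resolve(value):
        if isinstance(value, str) and value.startswith("$"):
            return registers[value]
        return value

    call_stack = []
    ip = 0
    while ip < len(ops):
        op = ops[ip]
        kind = op["op"]
        if kind == "HALT":
            break
        if kind == "DEF":
            ip = targets[ip]  # skip the body; the END then falls through
            continue
        if kind == "ASSIGN":
            registers[op["register"]] = resolve(op["value"])
        elif kind == "CALL":
            params, start_ip = skills[op["name"]]
            args = [resolve(a) for a in op["args"]]
            call_stack.append({
                "return_ip": ip + 1,
                "registers": dict(registers),  # shallow snapshot of the caller
                "target": op.get("register"),
            })
            registers.update(zip(params, args))  # bind arguments to parameters
            ip = start_ip
            continue
        elif kind == "RETURN":
            value = resolve(op["value"])
            frame = call_stack.pop()
            registers.clear()
            registers.update(frame["registers"])  # restore the caller's view
            if frame["target"]:
                registers[frame["target"]] = value
            ip = frame["return_ip"]
            continue
        ip += 1
    return registers
```

Restoring the snapshot on RETURN means registers assigned inside the skill do not leak into the caller; only the return value crosses the boundary.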
Quick Start: Bootstrap an Agent
Two CLI tools turn natural language into reusable ss agents:
# Step 1: Bootstrap an agent from a one-shot prompt
agent-create "make a deep research engine"
# → Generates deep-research.ss with a reusable def research $prompt: skill
# Step 2: Run the agent on a real problem
run-agent deep-research.ss "I need to go from NYC to Chicago, Denver, and Miami in June - find the cheapest flights and create an itinerary"
# → ss VM executes the multi-step pipeline and returns the result
How it works
agent-create sends your prompt to an LLM along with a system prompt that teaches it the ss syntax and conventions. The LLM generates a complete .ss script with def skills, tool calls, loops, and inference. The script reads from $prompt and writes its output back to $prompt.
agent-create "make a research agent that searches the web, fetches pages, extracts insights, and writes a report"
# → research-agent.ss
run-agent prepends $prompt = "<your input>" to the script and runs it through the standard ss pipeline:
run-agent research-agent.ss "Post-quantum cryptography standards 2026"
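The prepend step that run-agent performs could be as small as the sketch below. The function name is hypothetical, and using json.dumps to quote and escape the input is an assumption; the actual agent_runner.py may quote the input differently.

```python
import json


def prepend_prompt(script: str, user_input: str) -> str:
    """Return the script with a $prompt assignment placed on the first line."""
    # json.dumps yields a double-quoted literal with quotes and newlines escaped.
    literal = json.dumps(user_input, ensure_ascii=False)
    return f"$prompt = {literal}\n{script}"


script = "$result = %research $prompt\n$prompt = $result\n"
print(prepend_prompt(script, "Post-quantum cryptography standards 2026"))
```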
Example: Deep Research Engine
What agent-create "make a deep research engine" generates:
import brave-search from mcp_servers.json
def research $topic:
    $queries = infer "Break '$topic' into 4 search queries. Return as JSON list."
    $all_insights = []
    for each $query in $queries:
        $urls = %brave-search.search $query
        for each $url in $urls:
            $page = %brave-search.fetch $url
            $insight = infer "Extract key insight from: $page"
            %append $all_insights $insight
        end
    end
    $report = infer "Synthesize into a report: $all_insights"
    return $report
end
$result = %research $prompt
$prompt = $result
Then use it:
run-agent deep-research.ss "I need to go from NYC to Chicago, Denver, and Miami in June - find the cheapest flights and create an itinerary"
Run scripts directly
You can also run .ss scripts directly:
python3 -m ss.cli myscript.ss
./ss myscript.ss
Vibe Syntax
Scripts use $registers for data, %prefix for tool/skill calls, and infer for LLM inference.
$notes = []
for each $url in $urls:
    $page = %brave-search.fetch $url
    %append $notes $page
end
$summary = infer "summarize $notes in one paragraph"
%write output.md $summary
Because the decoder uses an LLM for vibe lines, the syntax is flexible. All of these produce the same CALL opcode:
- %websearch "query" -> $result
- $result = %websearch for "query"
- do %websearch for "query" and save to $result
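To illustrate what "the same CALL opcode" means, the sketch below normalizes the three spellings above with plain regular expressions. In ss this job is normally done by the LLM decoder; these patterns and the decode_call name are assumptions made for the example and cover only these three forms.

```python
import re

CALL_FORMS = [
    # %tool "arg" -> $reg
    re.compile(r'^%(?P<name>[\w.-]+)\s+"(?P<arg>[^"]*)"\s*->\s*(?P<reg>\$\w+)$'),
    # $reg = %tool for "arg"
    re.compile(r'^(?P<reg>\$\w+)\s*=\s*%(?P<name>[\w.-]+)\s+(?:for\s+)?"(?P<arg>[^"]*)"$'),
    # do %tool for "arg" and save to $reg
    re.compile(r'^do\s+%(?P<name>[\w.-]+)\s+for\s+"(?P<arg>[^"]*)"'
               r'\s+and\s+save\s+to\s+(?P<reg>\$\w+)$'),
]


def decode_call(line):
    """Return a CALL opcode dict for a recognized form, else None."""
    text = line.strip()
    for pattern in CALL_FORMS:
        match = pattern.match(text)
        if match:
            return {
                "op": "CALL",
                "name": match["name"],
                "args": [match["arg"]],
                "register": match["reg"],
            }
    return None
```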
Built-in Tools
| Tool | Args | Description |
|---|---|---|
| read | $path | Read file contents |
| write | $path $data | Overwrite a file |
| append_to_file | $path $data | Append to a file |
| list_files | $dir | List files in a directory |
| add | $a $b | Add two numbers |
| sum | $list | Sum a list of numbers |
| append | $list $item | Append to an in-memory list |
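A dispatch table is one plausible way to back the tools in the table. The sketch below is an assumption about shape only: the helper names, the float coercion in add and sum, and the sorted output of list_files are illustrative choices, not the behavior of the real VM.

```python
from pathlib import Path


def _read(path):
    return Path(path).read_text()


def _write(path, data):
    Path(path).write_text(str(data))


def _append_to_file(path, data):
    with open(path, "a") as handle:
        handle.write(str(data))


def _list_files(directory):
    return sorted(p.name for p in Path(directory).iterdir() if p.is_file())


def _add(a, b):
    return float(a) + float(b)


def _sum(items):
    return sum(float(x) for x in items)


def _append(items, item):
    items.append(item)  # mutates the list held in the register
    return items


BUILTINS = {
    "read": _read,
    "write": _write,
    "append_to_file": _append_to_file,
    "list_files": _list_files,
    "add": _add,
    "sum": _sum,
    "append": _append,
}


def call_builtin(name, *args):
    """Look up a built-in tool by name and call it with resolved arguments."""
    if name not in BUILTINS:
        raise KeyError(f"unknown built-in tool: {name}")
    return BUILTINS[name](*args)
```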
MCP Integration
Tools from external MCP servers (declared in mcp_servers.json) can be imported and called:
import brave-search from mcp_servers.json
$result = %brave-search.search "quantum computing"
Setup
Prerequisites
- Python 3.11+
- direnv (recommended)
Installation
pip install -e .
cp config.toml.example config.toml
# Edit config.toml with your LLM provider
Commands
ss <file.ss> # Run a script directly
ss create <prompt> # Generate an agent script
ss run <file.ss> <prompt> # Run an agent with input
agent-create <prompt> # Generate an agent script
run-agent <file.ss> <prompt> # Run an agent with input
Testing
./ss tests/test_extraction.ss # Extracts locations from text files
./ss tests/test_math.ss # Arithmetic on numbers read from files
Project Structure
ss                          Root entry point (shell wrapper)
src/ss/
├── agent_create.py         LLM-prompted script generator (agent-create)
├── agent_runner.py         Prepend $prompt and run via VM (run-agent)
├── cli.py                  CLI: reads file, feeds lines to Decoder, loads Program into VM
├── decoder.py              Regex + LLM decoder: vibe lines → Opcodes
├── vm.py                   Register-based VM with call/loop stacks and jump targets
├── opcodes.py              OpcodeType enum and Program model
├── prompts.py              DECODER_PROMPT template for the LLM decoder
├── config.py               TOML config loader with section merging
└── mcp.py                  MCP server manager (stdin/stdout-based tool calls)
ss-agent-skill/
└── SKILL.md                Anthropic Agent Skill for using ss via agent-create/run-agent
examples/
└── deep_research.ss        Example output of agent-create "make a deep research engine"
tests/
├── test_extraction.ss
├── test_math.ss
├── data/                   5 text files with location data
└── data_math/              3 text files with numbers
config.toml.example         LLM configuration template
tutorial.md                 Extended tutorial and best practices
License
MIT, April 2026