Runtime semiformal system: @semiformal and semi() for semantically underspecified logic
Project description
semipy
Write Python logic in natural language. semipy generates, validates, and caches real Python functions from your specs — no boilerplate, no hardcoded rules.
from semipy import semiformal, semi, configure
configure(openai_api_key="sk-...") # or set OPENAI_API_KEY in environment
@semiformal
def parse_log_line(line: str) -> dict:
#> extract timestamp, level, and message from an Apache log line
result = ...
return result
# First call: agent generates a Python function and caches it.
# All later calls: load the cached implementation instantly.
row = parse_log_line('127.0.0.1 - - [10/Oct/2000:13:55:36] "GET /index.html" 200')
Install
pip install semipy
For Jupyter notebook display:
pip install "semipy[jupyter]"
For PDF input materialization:
pip install "semipy[pdf]"
API key
semipy uses the OpenAI API for code generation. Set your key in one of three ways:
Environment variable (recommended for scripts):
export OPENAI_API_KEY=sk-...
.env file at the project root (loaded automatically):
OPENAI_API_KEY=sk-...
In code (overrides env):
from semipy import configure
configure(openai_api_key="sk-...")
Configuration
from semipy import configure
configure(
openai_api_key="sk-...", # defaults to OPENAI_API_KEY env var
openai_model="gpt-4o", # generation model (default: gpt-5.4)
verbose=True, # rich terminal output during generation (default: True)
cache_dir=".semiformal", # where portal JSON and dispatch modules are stored
max_retries=3, # agent retry limit on validation failure
session_source=None, # pin portal identity (useful for Jupyter; see below)
)
All fields are optional — call configure() with only what you want to override.
Terminal output
When verbose=True (the default), semipy prints a live Rich panel showing the agent's reasoning, tool calls, and generated code as they stream in. This works in both terminal and Jupyter.
Implementing code...
─────────────────────────────────────────────────
Reasoning The function needs to parse a standard Apache
Combined Log Format line...
─────────────────────────────────────────────────
Tool build_and_run_gist passed
─────────────────────────────────────────────────
Reusing cached implementation; runtime verify passed.
parse_log_line GENERATE a1b2c3d4 examples/logs.py:12
To silence all output: configure(verbose=False).
To see full prompt, decision, and tool-call dumps, set the environment variable:
export SEMIPY_PIPELINE_TRACE=1
Usage patterns
@semiformal with #> spec blocks
from semipy import semiformal
@semiformal
def extract_fields(record: str) -> dict:
#> extract date, sender, and subject from an email header
result = ...
return result
The #> block is the spec. The result = ... is the slot anchor — the agent fills it in.
Inline semi() for expressions
from semipy import semiformal, semi
@semiformal
def classify_rows(rows):
return [r for r in rows if semi(f"is {repr(r['status'])} a client error?")]
Standalone semi() in any function
from semipy import semi
def process(text):
label = semi(f"classify '{text}' as positive, negative, or neutral")
return label
Multiple slots in one function
from semipy import semiformal
@semiformal
def analyze(entry: str) -> dict:
#> extract the IP address from the log entry
ip = ...
#> determine if the HTTP status code in the entry indicates an error
is_error = ...
return {"ip": ip, "error": is_error}
Reasoning surface (#< lines)
After each generation, semipy writes a small set of #< comment lines around each slot anchor. These are not part of your spec — they are system-managed traces that describe what the generated implementation decided and why, so you can read and steer it without digging into the cache.
@semiformal
def infer_datetime_formatter(date_str: str) -> str:
#< intent: infer strptime pattern from observed date text
#< by: probing a regex-gated candidate table; because the table covers all observed formats
#< unless: empty or unmatched input raises ValueError
#> verified: 'Mar 2025' -> {'input_pattern':'%b %Y'}, return error if no match
input_pattern = ... #> infer the input date regex/strptime pattern from the observed string format in this session.
output_pattern = "%b %Y"
return datetime.strptime(str(date_str), input_pattern).strftime(output_pattern)
Placement
#< lines appear in two zones around the slot anchor:
- Zone P (provenance, above the anchor):
intent,given,by,unless— why this implementation exists, what it assumes, and how it handles failure. - Zone E (effect, below the anchor):
yields,verified— what the generated code produces and what was observed at runtime.
Keywords
| Key | Zone | Meaning |
|---|---|---|
intent |
above | One-phrase task summary (emitted only when the spec is long or ambiguous). |
given |
above | Input-shape assumptions beyond the signature (multi-param slots only). |
by |
above | Strategy/mechanism this implementation uses. Embed the reason inline when the choice is non-obvious: <strategy>; because <reason>. Always present. |
unless |
above | Fallback or exceptional path (emitted only when the generated code has a raise/except). May repeat for distinct failure modes. |
yields |
below | Output shape beyond the return annotation (skipped for simple builtins like str, int). |
verified |
below | Sample input → observed output, derived from the validation run (never LLM-generated). |
Steering
To change what the next generation produces, edit the #< by: line (strategy) or the #< unless: line (exceptional path). On the next run where the implementation needs to change, semipy reads the override and adapts accordingly.
Promoting a constraint
To lock an inference note into the contract permanently, flip #< to #> on the same line. This extends the spec text, causing a new ADAPT on the next run, and suppresses the duplicate #< line from being re-emitted.
# Before: system-managed trace
#< by: probing a regex-gated candidate table; because the table covers all observed formats
# After: promoted to user contract — fixes the strategy for future runs
#> by: regex-gated candidate table
#< lines are stable across runs: semipy only rewrites them when the generated implementation changes. A no-op re-run produces a byte-identical file.
Caching and reuse
Generated functions are stored in .semiformal/ relative to your working directory:
.semiformal/<session>.portal.json— versioned DAG of all commits, branches, and decisions..semiformal/runtime/<module>.semi.py— compiled Python implementations for import.
On subsequent runs, semipy loads the cached implementation without calling the LLM. It re-verifies when it detects new input shapes, and runs ADAPT (re-generation from the prior commit) when verification fails.
To force regeneration, delete .semiformal/ or the relevant portal file.
Jupyter
In Jupyter notebooks, semipy detects the ipykernel environment automatically. The portal is keyed to os.getcwd() so one portal persists across kernel restarts.
Install the optional display extras for inline Rich output:
pip install "semipy[jupyter]"
If multiple notebooks share a working directory and need separate caches:
configure(session_source="/path/to/my_notebook.ipynb")
VS Code extension
The Semipy VS Code extension adds:
- Syntax highlighting for
#>spec lines (teal) and#<reasoning lines (green) - Slot history tree in the Explorer panel
- Split-view to inspect generated
.semi.pyalongside your source - Inlay hints and CodeLens showing commit id and decision
Examples
See the examples/ directory:
examples/apache_log_semiformal_stages.py— staged walkthrough from simple extraction to INSTANTIATE and promotion workflowexamples/datetime_test.py— datetime format inference with observed samples and steered#<surfaceexamples/fasta_header_metadata.py— FASTA header parsing with multiple slots, nested anchors, and standalonesemi()callsexamples/use_contract_intelligence.py— contract field extraction from PDFexamples/use_sponsorship_canonicalizer.py— entity canonicalization pipeline
License
MIT
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file semiformal_py-0.2.0.tar.gz.
File metadata
- Download URL: semiformal_py-0.2.0.tar.gz
- Upload date:
- Size: 156.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
829486c394771c6961385890e7eda102d118a2fbb552eacb739c3e17b361987f
|
|
| MD5 |
b555e4207e6affb414b94e5022824990
|
|
| BLAKE2b-256 |
907b8bdad4407ca273993ee2d1c48e6a85a6c283b93b527eaba1f5473f85058b
|
File details
Details for the file semiformal_py-0.2.0-py3-none-any.whl.
File metadata
- Download URL: semiformal_py-0.2.0-py3-none-any.whl
- Upload date:
- Size: 186.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
fca199890db0c24e8600250b6140d22ce7877f434cad57c01cb41794a632d269
|
|
| MD5 |
9862ce835730ea2766f9d90d8fdef123
|
|
| BLAKE2b-256 |
1628b794c5078031c815c47f28a6a5ef4273c852e3c6fc73aa2098868631574e
|