A library for spec-based, validated R execution from Python via a bridge (rpy2), with optional LLM planning and CLI commands.
pyllmr
Define and validate specs in Python for calling R functions through a bridge (rpy2): search and install packages, inspect docs and functions, and run reproducible calls, with an optional LLM planner and a CLI.
What it does
- Search R packages on CRAN by keyword(s)
- List, find, and show docs or exported functions for installed R packages
- Install / update R packages from a chosen CRAN repo
- Inspect installed Python packages (local environment metadata only)
- Plan a reproducible execution spec from natural language (LLM)
- Execute a spec deterministically (no LLM required)
- Emit a Python runner script to replay a call/spec
The library pyllmr may be used as a structured call/execution layer for automated LLM-driven reporting (e.g., “plan, review, execute” pipelines), but this version does not guarantee the correctness of model outputs. It can also help Python users learn about R functions and test structured runs from Python code or the CLI. It was iteratively developed using modern tooling for coding, docstrings, and debugging. It requires a working R installation (available on PATH / R_HOME).
This project is provided "as is", without warranty of any kind. You are responsible for reviewing any generated spec/code and for running it in a safe environment. pyllmr provides a deterministic local runtime to validate and execute R call specs. LLM-based planning is provided for convenience; review the plan before execution (use “plan, review, execute”), or write the spec by hand if required.
Install
From PyPI
pip install pyllmr
From source
pip install -e .
Compatibility for version v0.1.0
- Package currently tested on Windows 11 only.
- Linux and macOS are not supported yet.
Quick sanity check
pyllmr --version
pyllmr --checkenv
CLI overview
pyllmr --help
The CLI is organized as:
- R discovery & maintenance (CRAN + installed R)
- Python discovery (installed env only)
- Repro flows (spec JSON → --exec-spec, optional --emit-py)
- LLM flows (--plan, --call, optional --explain)
- I/O & formatting (--json, --tsv, --out, --out-spec, --file)
- Diagnostics (--checkenv, --verbose)
1) R: search, list, find, docs, functions
1.1 Search CRAN
pyllmr --search-r excel
pyllmr --search-r excel --limit 15
pyllmr --search-r excel --json
pyllmr --search-r excel --tsv
pyllmr --search-r excel --installed
pyllmr --search-r excel --installed-only
pyllmr --search-r clustering --limit 30 --installed
1.2 List installed R packages
pyllmr --list-r
pyllmr --list-r --json
1.3 Find by exact name (basic metadata)
pyllmr --find-r readxl
pyllmr --find-r readxl --json
1.4 Show R package documentation
pyllmr --doc dplyr
1.5 List exported R functions
pyllmr --funs-r dplyr
pyllmr --funs-r stats
2) R: check / install
2.1 Check presence/versions
pyllmr --check dplyr readxl ggplot2
2.2 Install R packages (CRAN)
pyllmr --install-r readxl
pyllmr --install-r dplyr tidyr ggplot2
pyllmr --install-r readxl --repos https://cloud.r-project.org
3) Python: installed-only discovery
--search-py searches installed packages in the current environment using local env metadata (not online PyPI).
3.1 Search installed Python packages by keywords
pyllmr --search-py excel
pyllmr --search-py excel --limit 20
pyllmr --search-py excel --json --limit 20
pyllmr --search-py excel --tsv --limit 20
3.2 List installed Python packages
pyllmr --list-py
3.3 List public functions for a Python module
pyllmr --funs-py pandas
pyllmr --funs-py scipy.stats
pyllmr --funs-py json
4) Reproducible execution (NO LLM): spec JSON → --exec-spec
This is the “manual plan” path:
- you create a spec JSON
- pyllmr --exec-spec executes it locally via R
4.1 Minimal spec: t.test
Create the spec:
python -c "import json;json.dump({'package':'stats','function':'t.test','inputs':{'x':{'value':[1,2,3]},'y':{'value':[2,3,4]}},'kwargs':{}},open('spec_ttest.json','w'))"
Run it:
pyllmr --exec-spec spec_ttest.json
pyllmr --exec-spec spec_ttest.json --json
pyllmr --exec-spec spec_ttest.json --verbose
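The one-liner above can be awkward to quote in some shells. As a minimal sketch, the same spec can be written from a short script; the field names (package, function, inputs, kwargs) are taken from the example above, while the helper name write_spec is hypothetical and not part of pyllmr.

```python
import json
from pathlib import Path

# Same content as the one-liner above.
SPEC = {
    "package": "stats",
    "function": "t.test",
    "inputs": {
        "x": {"value": [1, 2, 3]},
        "y": {"value": [2, 3, 4]},
    },
    "kwargs": {},
}


def write_spec(spec, path):
    """Write the spec as indented JSON and return the parsed file contents."""
    path = Path(path)
    path.write_text(json.dumps(spec, indent=2), encoding="utf-8")
    # Reading it back confirms the file is valid JSON.
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    loaded = write_spec(SPEC, "spec_ttest.json")
    print(loaded["package"], loaded["function"])
```

The indented file is also easier to inspect by eye before passing it to pyllmr --exec-spec.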
4.2 Matrix spec: chisq.test
Create the spec:
python -c "import json;json.dump({'package':'stats','function':'chisq.test','inputs':{'x':{'matrix':{'data':[[10,20,30],[15,25,35]],'byrow':True}}},'kwargs':{}},open('spec_chisq_matrix.json','w'))"
Run it:
pyllmr --exec-spec spec_chisq_matrix.json --json
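A hand-typed nested list is easy to get wrong. The sketch below builds the matrix wrapper shown above and rejects ragged rows before the spec is written. The helper matrix_input is hypothetical (not a pyllmr function), and the rectangularity check is an assumption about what a sensible input looks like, not a documented pyllmr rule.

```python
import json


def matrix_input(rows, byrow=True):
    """Build the {'matrix': {...}} wrapper used in the spec above.

    Raises ValueError for empty or ragged input, which is almost
    certainly a typo in a contingency table.
    """
    if not rows or len(rows[0]) == 0:
        raise ValueError("matrix needs at least one non-empty row")
    if any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("all matrix rows must have the same length")
    return {"matrix": {"data": [list(r) for r in rows], "byrow": byrow}}


spec = {
    "package": "stats",
    "function": "chisq.test",
    "inputs": {"x": matrix_input([[10, 20, 30], [15, 25, 35]])},
    "kwargs": {},
}
print(json.dumps(spec, indent=2))
```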
4.3 Literal expression spec (if supported in your build)
Create the spec:
python -c "import json;json.dump({'package':'base','function':'sqrt','inputs':{'x':{'expr':'1+2*3'}},'kwargs':{}},open('spec_expr.json','w'))"
Run it:
pyllmr --exec-spec spec_expr.json --json
5) Files: whitelist local files with --file
When a spec references file paths (e.g. CSV wrapper), pass the file with --file.
Create the spec:
python -c "import json;json.dump({'package':'utils','function':'head','inputs':{'x':{'csv':{'path':'iris.csv','sep':',','encoding':'utf-8'}}},'kwargs':{'n':6}},open('spec_readcsv.json','w'))"
Execute (with file attached):
pyllmr --file iris.csv --exec-spec spec_readcsv.json --json
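To see which files a spec would need attached, the csv wrappers can be scanned before running it. This is a sketch under assumptions: it only looks at the csv wrapper shown above, it compares paths as exact strings, and both helper names are hypothetical. pyllmr's own whitelist check may normalize paths or handle other wrappers.

```python
def referenced_paths(spec):
    """Collect 'path' values from csv wrappers in spec['inputs']."""
    paths = []
    for wrapper in spec.get("inputs", {}).values():
        csv = wrapper.get("csv") if isinstance(wrapper, dict) else None
        if isinstance(csv, dict) and "path" in csv:
            paths.append(csv["path"])
    return paths


def missing_from_whitelist(spec, attached_files):
    """Return referenced paths that were not attached with --file."""
    allowed = set(attached_files)
    return [p for p in referenced_paths(spec) if p not in allowed]


spec = {
    "package": "utils",
    "function": "head",
    "inputs": {"x": {"csv": {"path": "iris.csv", "sep": ",", "encoding": "utf-8"}}},
    "kwargs": {"n": 6},
}
print(missing_from_whitelist(spec, ["iris.csv"]))  # []
print(missing_from_whitelist(spec, []))            # ['iris.csv']
```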
6) LLM planning/execution: --plan / --call
Requires your model/provider configuration (e.g. OPENAI_API_KEY).
6.1 Plan only (returns a spec)
pyllmr --plan stats "Run a t-test between x=[1,2,3] and y=[2,3,4]"
pyllmr --plan stats "Compute mean of x=[1,2,3,4,5]"
pyllmr --plan stats "Compute sd of x=[1,2,3,4,5]"
pyllmr --plan stats "Compute quantiles of x=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30] with probs=[0.25,0.5,0.75]"
Save the generated spec to a file:
pyllmr --plan stats "Run a t-test between x=[1,2,3] and y=[2,3,4]" --out-spec spec_from_llm.json
pyllmr --exec-spec spec_from_llm.json
6.2 Call (plan + execute)
pyllmr --unsafe --call stats "Run a t-test between x=[1,2,3] and y=[2,3,4]"
pyllmr --unsafe --call stats "Compute mean of x=[1,2,3,4,5]"
pyllmr --unsafe --call stats "Compute sd of x=[1,2,3,4,5]"
pyllmr --unsafe --call stats "Compute Pearson correlation between x=[1,2,3,4,5] and y=[2,1,4,3,5]"
pyllmr --unsafe --call stats "Fit a linear model y=[3.2,3.9,5.1,5.8,7.2] as a function of x=[1,2,3,4,5]"
pyllmr --unsafe --call stats "Run one-way ANOVA for y=[5,6,7,5,6,8,9,7,6] across group=['A','A','A','B','B','B','C','C','C']"
pyllmr --unsafe --call stats "Fit logistic regression for y=[0,0,1,0,1,1,0,1] with predictors x=[1,2,3,4,5,6,7,8]"
6.3 Show generated R code without executing
pyllmr --unsafe --call stats "Run a t-test between x=[1,2,3] and y=[2,3,4]" --show-r
pyllmr --unsafe --call stats "chisq test with matrix x=[[10,20,30],[15,25,35]]" --show-r
6.4 Explain (optional)
pyllmr --unsafe --call stats "Run a t-test between x=[1,2,3] and y=[2,3,4]" --explain
6.5 Override models
pyllmr --plan stats "Compute mean of x=[1,2,3,4,5]" --model gpt-4o-mini
pyllmr --unsafe --call stats "Compute mean of x=[1,2,3,4,5]" --model gpt-4o-mini
pyllmr --unsafe --call stats "Run a t-test between x=[1,2,3] and y=[2,3,4]" --explain --model-explain gpt-4o-mini
7) Emit a reproducible Python runner: --emit-py
Emit a runner script (replays the call/spec):
pyllmr --plan stats "Run a t-test between x=[1,2,3] and y=[2,3,4]" --emit-py run_call_ttest.py
pyllmr --unsafe --call stats "chisq test with matrix x=[[10,20,30],[15,25,35]]" --emit-py run_call_chisq.py
8) Output / formats / I/O flags
8.1 Machine-friendly output
pyllmr --search-r excel --json
pyllmr --search-r excel --tsv
8.2 Write outputs to files
pyllmr --search-r excel --json --out out.json
pyllmr --plan stats "Compute mean of x=[1,2,3]" --out-spec spec.json
8.3 Redaction (best-effort)
pyllmr --redact --plan stats "Run a t-test between x=[1,2,3] and y=[2,3,4]"
8.4 Attach local files as context
pyllmr --file notes.txt --plan stats "Run a t-test between x=[1,2,3] and y=[2,3,4]"
pyllmr --unsafe --file data.csv --file schema.json --call stats "Compute correlation between x and y using data.csv"
9) Diagnostics / debugging
9.1 Environment diagnostics
pyllmr --checkenv
pyllmr --checkenv --verbose
9.2 Verbose logs
pyllmr --verbose --search-r excel
10) LLM provider/model overrides (advanced)
This section shows how to override the LLM model and provider explicitly. Use it when you want fast planning, a different model for explanations, or a local provider (e.g., Ollama).
LLM planning currently uses the OpenAI Python client and also supports OpenAI-compatible endpoints (e.g. local gateways) via environment variables like OPENAI_BASE_URL and OPENAI_API_KEY.
10.1 Override models per command
pyllmr --plan stats "mean of x=[1,2,3]" --model gpt-4o-mini --out-spec plan_mean.json
pyllmr --exec-spec plan_mean.json
pyllmr --unsafe --call stats "mean of x=[1,2,3]" --model gpt-4o-mini --show-r
pyllmr --unsafe --call stats "mean of x=[1,2,3]" --model gpt-4o-mini --explain --model-explain gpt-4o
10.2 Set default models via environment variables (PowerShell)
$env:PYLLMR_MODEL="gpt-4o-mini"
$env:PYLLMR_MODEL_EXPLAIN="gpt-4o"
pyllmr --checkenv
10.3 OpenAI provider (PowerShell)
$env:PYLLMR_PROVIDER="openai"
$env:OPENAI_API_KEY="..."
pyllmr --checkenv
10.4 OpenAI-compatible endpoint (proxy/self-host) (PowerShell)
$env:PYLLMR_PROVIDER="openai"
$env:PYLLMR_BASE_URL="http://localhost:1234/v1"
$env:OPENAI_API_KEY="..."
pyllmr --checkenv
10.5 Ollama (local) example (PowerShell)
Example with a local Ollama server. Adjust the model name to one you have pulled (e.g., llama3.1:8b).
$env:PYLLMR_PROVIDER="ollama"
$env:OLLAMA_HOST="http://localhost:11434"
$env:PYLLMR_MODEL="llama3.1:8b"
$env:PYLLMR_MODEL_EXPLAIN="llama3.1:8b"
pyllmr --checkenv
pyllmr --plan stats "mean of x=[1,2,3]" --out-spec plan_mean_ollama.json
pyllmr --exec-spec plan_mean_ollama.json
pyllmr --unsafe --call stats "mean of x=[1,2,3]" --show-r
Notes:
- --model controls the model used for --plan / --call.
- --model-explain controls the model used for --explain.
- pyllmr --checkenv prints the detected provider/model configuration (it does not change them).
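When pyllmr is driven from a Python script rather than an interactive PowerShell session, the same overrides can be passed through the child process environment. The sketch below only builds the environment mapping; the variable names come from the examples above, and the helper provider_env is hypothetical.

```python
import os


def provider_env(provider, model, model_explain=None, base=None):
    """Return a copy of an environment mapping with pyllmr overrides applied.

    By default the current process environment is copied, so PATH and
    R_HOME remain visible to the child process.
    """
    env = dict(os.environ if base is None else base)
    env["PYLLMR_PROVIDER"] = provider
    env["PYLLMR_MODEL"] = model
    # Fall back to the planning model when no explain model is given.
    env["PYLLMR_MODEL_EXPLAIN"] = model_explain or model
    return env


# Start from an empty mapping here only to keep the printed output short.
print(provider_env("ollama", "llama3.1:8b", base={}))
```

The result can then be handed to subprocess.run(["pyllmr", "--checkenv"], env=provider_env("ollama", "llama3.1:8b")) in a shell where pyllmr is installed.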
11) Security model and recommended workflow (avoid direct execution without file checking)
- --exec-spec runs local execution (R) from a JSON spec. Treat specs as code: only run specs you trust.
- --file is a whitelist: attach only files you want the tool to access.
- --redact is best-effort redaction before sending context to the model. Do not rely on it for strict secrecy.
- LLM features can generate code/specs. Review with --show-r and/or use --out-spec + --exec-spec for a controlled workflow.
Recommended safe workflow:
- --plan ... --out-spec spec.json
- inspect spec (and optionally --show-r)
- run --exec-spec spec.json
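The three steps above can be scripted with a manual confirmation in the middle. This is a sketch, not part of pyllmr: the helper names are hypothetical, and it assumes the file written by --out-spec has the same top-level fields (package, function, inputs) as the hand-written specs in section 4.

```python
import json
import subprocess
from pathlib import Path


def plan_command(package, prompt, spec_path):
    """Command line for the planning step (no execution)."""
    return ["pyllmr", "--plan", package, prompt, "--out-spec", str(spec_path)]


def exec_command(spec_path):
    """Command line for the deterministic execution step."""
    return ["pyllmr", "--exec-spec", str(spec_path), "--json"]


def summarize_spec(spec):
    """One-line summary a reviewer can check before executing."""
    args = ", ".join(sorted(spec.get("inputs", {})))
    return f"{spec['package']}::{spec['function']}({args})"


def main():
    spec_path = Path("spec.json")
    prompt = "Run a t-test between x=[1,2,3] and y=[2,3,4]"
    subprocess.run(plan_command("stats", prompt, spec_path), check=True)
    spec = json.loads(spec_path.read_text(encoding="utf-8"))
    print("Planned call:", summarize_spec(spec))
    # Nothing runs in R unless the reviewer explicitly agrees.
    if input("Execute this spec? [y/N] ").strip().lower() == "y":
        subprocess.run(exec_command(spec_path), check=True)


# Call main() from a shell where pyllmr and R are available.
```

Keeping the confirmation between the two subprocess calls means the LLM output is always written to disk and reviewed before anything is executed.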
12) Troubleshooting
R not found / rpy2 errors
- Make sure R (RStudio) is installed and accessible from the same shell where you run pyllmr.
- If needed, set R_HOME to your R installation directory.
Windows tips
- Prefer running in a clean venv.
- If you see encoding errors, run with --verbose and verify your console/codepage settings.
Python calls overview examples
import json

from pyllmr import RCallSpec, validate_spec, execute_spec


def main() -> None:
    spec = RCallSpec(
        package="stats",
        function="t.test",
        inputs={"x": {"value": [1, 2, 3]}, "y": {"value": [2, 3, 4]}},
        kwargs={},
    )
    allowlist = {"stats": ["t.test"]}
    validate_spec(spec, files=set(), allowlist=allowlist)
    out = execute_spec(spec, file_paths={}, verbose=False)
    print(json.dumps(out, ensure_ascii=False, indent=2))


if __name__ == "__main__":
    main()

import json

from pyllmr import run


def main() -> None:
    allowlist = {"stats": ["cor.test", "cor", "t.test", "lm"]}
    out = run(
        prompt="Compute Pearson correlation between x=[1,2,3,4,5] and y=[2,1,4,3,5].",
        file_paths={},
        allowlist=allowlist,
        model="gpt-4o-mini",
    )
    print(json.dumps(out, ensure_ascii=False, indent=2))


if __name__ == "__main__":
    main()
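Both examples pass an allowlist that maps an R package name to the functions permitted from it. The membership check such a structure implies can be sketched without R or pyllmr installed. This is an illustration only: is_allowed is a hypothetical helper, and pyllmr's validate_spec may apply further checks.

```python
def is_allowed(package, function, allowlist):
    """True if `function` is listed under `package` in the allowlist."""
    return function in allowlist.get(package, [])


allowlist = {"stats": ["cor.test", "cor", "t.test", "lm"]}
print(is_allowed("stats", "t.test", allowlist))  # True
print(is_allowed("stats", "glm", allowlist))     # False
print(is_allowed("base", "system", allowlist))   # False
```

Keeping the allowlist narrow limits which R functions an LLM-generated plan can reach.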
Requirements
- Python 3.10+ (recommended)
- R installed and available on PATH (or configured with R_HOME)
- For LLM features (--plan, --call, --explain): provider env vars (e.g. OPENAI_API_KEY)
This project requires R (see also RStudio). Without R, the tool is mostly useless by design.
Warning (Safety note)
The recommended workflow is plan → review → execute (--plan --out-spec ... then --exec-spec ...).
Direct execution via --call requires --unsafe. Prefer --plan + inspect the file/spec JSON + run --exec-spec.
TO DO NEXT
- Extend file and variable inputs as first-class arguments for R function calls.
- Publish a JSON Schema for RCallSpec and validate every spec against it.
- Add a small catalog of schemas for common R functions (discoverable + retrievable).
- Support YAML specs with lossless conversion to/from JSON for human-friendly reviews.