Hardware LLM capability scanner — know what runs on your machine
Project description
tinillm
A local-first LLM agent and hardware scanner — chat with your hardware, let it write code.
```bash
pipx install tinillm
tinillm
```
What it does
tinillm is an interactive agent that runs on your local Ollama models. Type anything at the prompt and it goes to the agent; type a /command and it runs that command.
- Chat with tools. The agent can read files, edit code, run bash, and search the web — all gated by a three-tier permission model (read-only / workspace-write / danger-full-access).
- Plan mode. `/plan` puts the model in a read-only planning pass. Approve the plan and it executes autonomously — no per-command confirmations.
- Hardware-aware. `/scan` tells you which LLMs fit your machine before you download 30 GB of weights. `/models` browses real Ollama models with fit analysis.
- Linux sandbox. When `bubblewrap` is installed, bash calls run in a namespace jail rooted at your workspace.
- Session persistence. Every conversation is written to a JSONL log you can resume with `/chat --resume`.
Install
```bash
pipx install tinillm   # recommended: isolated per-tool environment
# or
pip install tinillm
```
Requires Python 3.11+ and a running Ollama server for the agent. Hardware scanning works without Ollama.
Optional on Linux:
```bash
apt install bubblewrap   # enables workspace-rooted bash sandbox
```
The unified prompt
Everything happens at one prompt. There's no separate "chat mode."
```text
│ refactor agent_chat.py to extract a ChatSession dataclass
⚙ read_file(path='tinillm/run/agent_chat.py')
⚙ write_file(path='tinillm/run/chat_session.py', …)
⚙ edit_file(path='tinillm/run/agent_chat.py', …)
done — extracted ChatSession with create/handle_user_turn/handle_slash.

│ /scan
LLM Capability Matrix
  ~1B   Perfect   Q8_0     1.9 GB   580 t/s
  ~7B   Perfect   Q6_K     6.2 GB    88 t/s
 ~13B   Perfect   Q5_K_M  10.1 GB    47 t/s
```
Free text → the agent. `/scan`, `/models`, `/doctor` → Click commands. In-chat commands like `/plan`, `/cost`, `/rewind`, `/diff`, `/tasks` operate on the live agent session.
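For orientation, here is a minimal sketch of that routing pattern (all names below are illustrative, not tinillm's actual internals): slash input is dispatched to a command registry, everything else becomes an agent turn.

```python
# Sketch of a unified-prompt router (hypothetical names, not
# tinillm's real code): one loop, two destinations.
COMMANDS = {}  # slash name -> handler, e.g. "scan" -> scan_cmd

def register(name):
    def deco(fn):
        COMMANDS[name] = fn
        return fn
    return deco

@register("scan")
def scan_cmd(args):
    print("LLM Capability Matrix ...")

def handle_line(line, agent):
    line = line.strip()
    if line.startswith("/"):          # /cmd -> command dispatch
        name, *args = line[1:].split()
        handler = COMMANDS.get(name)
        if handler is None:
            print(f"unknown command: /{name}")
        else:
            handler(args)
    elif line:                        # free text -> the agent
        agent.handle_user_turn(line)
```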
Commands
Agent
| Command | What it does |
|---|---|
| (type anything) | Send a turn to the agent |
| `/plan` | Enter plan mode (read-only); produce a numbered plan that auto-executes on approval |
| `/plan-off` | Leave plan mode without executing |
| `/compact` | Summarize older turns to free context |
| `/summary` | Print a summary without touching history |
| `/rewind [N]` | Remove the last N turns |
| `/diff` | Show files changed this session |
| `/cost` | Token + time usage so far |
| `/tasks` | Current agent TODO list |
| `/load <model>` | Swap the live model mid-session |
| `/permissions [mode]` | View or set permission mode |
| `/chat [--resume ID]` | Re-initialise the agent (or resume a session) |
Hardware + models
| Command | What it does |
|---|---|
| `/scan` | Scan hardware and show which LLM sizes fit |
| `/scan --verbose` | Include sizes that don't fit |
| `/scan --json` | Machine-readable output |
| `/models` | Browse real Ollama models with fit analysis |
| `/models --fits-only` | Hide models that don't fit |
| `/models --ollama` | Show which models are locally installed |
| `/run [tag]` | Launch a model directly in Ollama |
| `/suggest --use-case coding` | Personalised recommendation |
System
| Command | What it does |
|---|---|
| `/doctor` | Health check (hardware, Ollama, sandbox) |
| `/memory show` · `/remember` · `/forget` | Persistent per-project memory |
| `/sessions` | List resumable chat sessions |
| `/skills` | List installed agent skills |
| `/init` | Scaffold a TINILLM.md for this repo |
| `/help` | List every command |
| `/clear` | Clear the terminal |
| `/exit` | Quit (Ctrl+D also works) |
Tab-completion works on every slash command, subcommand, and flag.
Plan mode
```text
│ /plan
plan mode ON — read-only. Describe your goal.

│ add a --json flag to the scan command
Plan:
  1. read render/scan.py to find the current output path
  2. add --json flag in commands.py scan_cmd
  3. route to a JSON renderer, gated on the flag
  4. add a test in tests/test_scan_cmd.py
plan auto-accepted — executing in workspace-write mode
⚙ read_file(path='tinillm/render/scan.py')
⚙ edit_file(path='tinillm/commands.py', …)
⚙ write_file(path='tinillm/render/scan_json.py', …)
done.
```
Once the plan is approved, destructive bash and write-tool calls run without confirmation prompts. Autonomy is scoped to the one execution phase — the next user turn gets a fresh confirm-on-destructive policy.
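One way to picture that scoping (a hypothetical structure, not tinillm's code): the confirmation policy is rebuilt on every turn, so plan autonomy cannot leak past the approved execution phase.

```python
# Sketch: per-turn policy reset keeps plan autonomy scoped to
# a single execution phase (illustrative names only).
from dataclasses import dataclass

@dataclass
class TurnPolicy:
    confirm_destructive: bool = True  # default: ask before destructive calls

def run_turn(user_input, agent, plan_approved: bool = False):
    # Autonomy applies only when this specific turn carries an
    # approved plan; otherwise destructive calls require confirmation.
    policy = TurnPolicy(confirm_destructive=not plan_approved)
    agent.execute(user_input, policy)
    # The next call builds a fresh TurnPolicy, so the default
    # confirm-on-destructive behaviour is restored automatically.
```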
Permissions
Three modes, set per-session:
| Mode | Reads | Workspace writes | Shell | Network |
|---|---|---|---|---|
| `read-only` | ✓ | — | — | — |
| `workspace-write` (default) | ✓ | ✓ | confirm on destructive | — |
| `danger-full-access` | ✓ | ✓ | ✓ | ✓ |
Switch mid-session with `/permissions read-only`.
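As a sketch, the table above boils down to a capability lookup like this (illustrative names only; tinillm's actual enforcer API is not shown here):

```python
# Three-tier permission check mirroring the table above
# (hypothetical structure, not tinillm's real enforcer).
from enum import Enum

class Mode(Enum):
    READ_ONLY = "read-only"
    WORKSPACE_WRITE = "workspace-write"
    DANGER_FULL_ACCESS = "danger-full-access"

def allowed(mode: Mode, action: str) -> str:
    """Return 'yes', 'no', or 'confirm' for an action in a mode."""
    table = {
        Mode.READ_ONLY:          {"read": "yes"},
        Mode.WORKSPACE_WRITE:    {"read": "yes", "write": "yes",
                                  "shell": "confirm"},
        Mode.DANGER_FULL_ACCESS: {"read": "yes", "write": "yes",
                                  "shell": "yes", "network": "yes"},
    }
    return table[mode].get(action, "no")

assert allowed(Mode.WORKSPACE_WRITE, "shell") == "confirm"
assert allowed(Mode.READ_ONLY, "write") == "no"
```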
Sandbox
On Linux with `bubblewrap` installed, every bash call is wrapped with `bwrap --unshare-net --bind <workspace> / --ro-bind /usr /usr …` — reads are confined to the workspace, writes can't escape it, and network is off by default. `sandbox_allow_network = true` in `.tinillmrc` opens it.
macOS / Windows: sandbox is a no-op for now; permission enforcer and workspace resolution still apply.
Check status with /doctor or the welcome panel.
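To make the wrapping concrete, here is a sketch of building such a bwrap invocation from Python. The flags mirror the example above; the exact mount list tinillm uses is an assumption.

```python
# Sketch of a bubblewrap-wrapped bash call (illustrative; only the
# flags shown in the README are confirmed, the rest is assumed).
import subprocess

def sandboxed_bash(cmd: str, workspace: str, allow_network: bool = False):
    argv = ["bwrap",
            "--bind", workspace, "/",      # workspace becomes the root
            "--ro-bind", "/usr", "/usr",   # system dirs stay read-only
            "--proc", "/proc",
            "--dev", "/dev"]
    if not allow_network:
        argv += ["--unshare-net"]          # network off by default
    argv += ["bash", "-c", cmd]
    return subprocess.run(argv, capture_output=True, text=True)
```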
Configuration
Per-project: `.tinillmrc` (TOML) in the repo root:

```toml
model = "qwen2.5-coder:14b"
permission_mode = "workspace-write"
auto_accept_plan = true
sandbox_enabled = true
sandbox_allow_network = false
allowed_tools = ["read_file", "write_file", "edit_file", "bash"]
denied_tools = []
```

Per-user: `~/.tinillm/settings.json` (written by `/load`).
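Since tinillm requires Python 3.11+, a `.tinillmrc` like the one above can be read with the stdlib `tomllib`. A sketch (the defaults here are taken from the example, not from tinillm's source):

```python
# Sketch: load .tinillmrc over a set of defaults (illustrative keys
# copied from the example config above).
import tomllib
from pathlib import Path

DEFAULTS = {
    "model": "qwen2.5-coder:14b",
    "permission_mode": "workspace-write",
    "sandbox_enabled": True,
    "sandbox_allow_network": False,
}

def load_config(repo_root: str = ".") -> dict:
    cfg = dict(DEFAULTS)
    rc = Path(repo_root) / ".tinillmrc"
    if rc.exists():
        with rc.open("rb") as f:   # tomllib requires binary mode
            cfg.update(tomllib.load(f))
    return cfg
```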
GPU support
| Vendor | Detection method |
|---|---|
| NVIDIA | `nvidia-smi` → sysfs fallback |
| AMD | `rocm-smi` → sysfs fallback |
| Apple Silicon | `system_profiler` (unified memory) |
| Intel Arc | sysfs + `lspci` |
| Windows (all) | PowerShell WMI |
| Any | `vulkaninfo` last-resort fallback |
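The "primary tool → sysfs fallback" pattern in the table looks roughly like this (a simplified sketch, not tinillm's actual probe code):

```python
# Sketch of a vendor-tool-first, sysfs-fallback VRAM probe
# (simplified; tinillm's real detection covers more vendors).
import shutil, subprocess
from pathlib import Path

def vram_bytes() -> int | None:
    # 1. Vendor tool first: nvidia-smi reports memory.total in MiB.
    if shutil.which("nvidia-smi"):
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True)
        if out.returncode == 0 and out.stdout.strip():
            return int(out.stdout.splitlines()[0]) * 1024 * 1024
    # 2. sysfs fallback (exposed by the amdgpu driver).
    for node in Path("/sys/class/drm").glob("card*/device/mem_info_vram_total"):
        return int(node.read_text())
    return None  # nothing found; caller tries the next probe
```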
Fit levels
| Level | Meaning |
|---|---|
| Perfect | Fits comfortably at Q4_K_M or better with ≥20% headroom |
| Good | Fits but tightly |
| Marginal | Only at heavy compression / reduced context, or CPU-only |
| TooTight | Won't fit under any quantisation |
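A back-of-the-envelope version of this classification, using the stated Q4_K_M-with-≥20%-headroom rule (the other thresholds and the bits-per-weight figures are assumptions, not tinillm's constants):

```python
# Sketch of fit classification: model bytes ≈ params × bits/weight ÷ 8.
def model_bytes(params_b: float, bits_per_weight: float) -> float:
    return params_b * 1e9 * bits_per_weight / 8

def fit_level(params_b: float, mem_bytes: float) -> str:
    q4 = model_bytes(params_b, 4.5)   # Q4_K_M ≈ 4.5 bits/weight
    q2 = model_bytes(params_b, 2.6)   # heavy compression, e.g. Q2_K
    if mem_bytes >= q4 * 1.2:         # Q4_K_M or better with ≥20% headroom
        return "Perfect"
    if mem_bytes >= q4:               # fits, but tightly
        return "Good"
    if mem_bytes >= q2:               # only under heavy compression
        return "Marginal"
    return "TooTight"

# A 13B model on a 24 GB GPU: 13e9 × 4.5 / 8 ≈ 7.3 GB, ample headroom.
assert fit_level(13, 24e9) == "Perfect"
```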
Versioning
| Version | Feature |
|---|---|
| 2.4 | Unified REPL · plan autonomy · sandbox surfacing ← current |
| 2.3 | Plan mode · auto-accept · session JSONL |
| 2.2 | Agent tool calling · three-tier permissions |
| 2.1 | Linux bubblewrap sandbox |
| 1.9 | Hardware scanning + model runner |
| 1.8 | Interactive REPL, slash commands |
| 1.1 | First release — hardware scanner |
Part of the tini* family
| Tool | What it does |
|---|---|
| tiniRAG | Privacy-first RAG CLI |
| tinillm | Local LLM agent + hardware scanner |
Project details
Download files
Source distribution: `tinillm-2.4.0.tar.gz` (127.0 kB)
Built distribution: `tinillm-2.4.0-py3-none-any.whl` (115.4 kB)
File details
Details for the file `tinillm-2.4.0.tar.gz`.
File metadata
- Download URL: tinillm-2.4.0.tar.gz
- Size: 127.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `5eda83299fa68e271c497987190963b72512a487b976f3c4477136ad53ea0b51` |
| MD5 | `103e3db109dfe210ef460dc6b5c0fed4` |
| BLAKE2b-256 | `b69701bdb118d168e787140502f66006c41e13064d4d764d67ac53d1a14c407c` |
File details
Details for the file `tinillm-2.4.0-py3-none-any.whl`.
File metadata
- Download URL: tinillm-2.4.0-py3-none-any.whl
- Size: 115.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `f871272805eef5dd3119ff9ee9d5fd2cfff8329e9aa669ecef94260f2e96f347` |
| MD5 | `6fc17840e597457c43f9ecf79d7cda45` |
| BLAKE2b-256 | `6392a4d237d9b41813542adc3a46962ee178f31f38a428e946dfa4eb73491320` |