
Hardware LLM capability scanner — know what runs on your machine


tinillm

A local-first LLM agent and hardware scanner — chat with your hardware, let it write code.

pipx install tinillm
tinillm

What it does

tinillm is an interactive agent that runs on your local Ollama models. Type anything at the prompt and it goes to the agent; type a /command and it runs that command.

  • Chat with tools. The agent can read files, edit code, run bash, and search the web — all gated by a three-tier permission model (read-only / workspace-write / danger-full-access).
  • Plan mode. /plan puts the model in a read-only planning pass. Approve the plan and it executes autonomously — no per-command confirmations.
  • Hardware-aware. /scan tells you which LLMs fit your machine before you download 30 GB of weights. /models browses real Ollama models with fit analysis.
  • Linux sandbox. When bubblewrap is installed, bash calls run in a namespace jail rooted at your workspace.
  • Session persistence. Every conversation is written to a JSONL log you can /chat --resume.
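
The session log can be sketched as follows — a minimal illustration, assuming one JSON object per turn with role and content fields (the real field names may differ):

```python
import json
from pathlib import Path

def append_turn(log_path: Path, role: str, content: str) -> None:
    # One JSON object per line; append-only, so a crash never corrupts earlier turns.
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"role": role, "content": content}) + "\n")

def load_session(log_path: Path) -> list[dict]:
    # Resuming a session just replays every line in order.
    return [json.loads(line)
            for line in log_path.read_text(encoding="utf-8").splitlines()
            if line.strip()]
```

The append-only format is what makes resuming cheap: no rewrite on every turn, and a partial last line is the worst a crash can leave behind.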

Install

pipx install tinillm     # recommended: isolated per-tool environment
# or
pip install tinillm

Requires Python 3.11+ and a running Ollama server for the agent. Hardware scanning works without Ollama.

Optional on Linux:

apt install bubblewrap   # enables workspace-rooted bash sandbox

The unified prompt

Everything happens at one prompt. There's no separate "chat mode."

│ refactor agent_chat.py to extract a ChatSession dataclass
⚙  read_file(path='tinillm/run/agent_chat.py')
⚙  write_file(path='tinillm/run/chat_session.py', …)
⚙  edit_file(path='tinillm/run/agent_chat.py', …)
done — extracted ChatSession with create/handle_user_turn/handle_slash.

│ /scan
  LLM Capability Matrix
  ~1B    Perfect   Q8_0     1.9 GB   580 t/s
  ~7B    Perfect   Q6_K     6.2 GB    88 t/s
  ~13B   Perfect   Q5_K_M  10.1 GB    47 t/s

Free text → the agent. /scan, /models, /doctor → Click commands. In-chat commands like /plan, /cost, /rewind, /diff, /tasks operate on the live agent session.
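
The single-prompt routing described above might look like this — a sketch with hypothetical names (dispatch, the commands dict), not tinillm's actual internals:

```python
def dispatch(line: str, agent, commands: dict) -> str:
    # Slash-prefixed input looks up a command handler;
    # anything else is sent to the agent as a chat turn.
    line = line.strip()
    if line.startswith("/"):
        name, *args = line[1:].split()
        handler = commands.get(name)
        if handler is None:
            return f"unknown command: /{name}"
        return handler(args)
    return agent(line)
```

One prompt, one branch: the user never has to switch modes to move between chatting and running commands.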


Commands

Agent

Command What it does
(type anything) Send a turn to the agent
/plan Enter plan mode (read-only); produce a numbered plan that auto-executes on approval
/plan-off Leave plan mode without executing
/compact Summarize older turns to free context
/summary Print a summary without touching history
/rewind [N] Remove the last N turns
/diff Show files changed this session
/cost Token + time usage so far
/tasks Current agent TODO list
/load <model> Swap the live model mid-session
/permissions [mode] View or set permission mode
/chat [--resume ID] Re-initialise the agent (or resume a session)

Hardware + models

Command What it does
/scan Scan hardware and show which LLM sizes fit
/scan --verbose Include sizes that don't fit
/scan --json Machine-readable output
/models Browse real Ollama models with fit analysis
/models --fits-only Hide models that don't fit
/models --ollama Show which models are locally installed
/run [tag] Launch a model directly in Ollama
/suggest --use-case coding Personalised recommendation

System

Command What it does
/doctor Health check (hardware, Ollama, sandbox)
/memory show · /remember · /forget Persistent per-project memory
/sessions List resumable chat sessions
/skills List installed agent skills
/init Scaffold a TINILLM.md for this repo
/help List every command
/clear Clear the terminal
/exit Quit (Ctrl+D also works)

Tab-completion works on every slash command, subcommand, and flag.


Plan mode

│ /plan
plan mode ON — read-only. Describe your goal.

│ add a --json flag to the scan command
Plan:
1. read render/scan.py to find the current output path
2. add --json flag in commands.py scan_cmd
3. route to a JSON renderer, gated on the flag
4. add a test in tests/test_scan_cmd.py

plan auto-accepted — executing in workspace-write mode
⚙  read_file(path='tinillm/render/scan.py')
⚙  edit_file(path='tinillm/commands.py', …)
⚙  write_file(path='tinillm/render/scan_json.py', …)
done.

Once the plan is approved, destructive bash and write-tool calls run without confirmation prompts. Autonomy is scoped to the one execution phase — the next user turn gets a fresh confirm-on-destructive policy.
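
One way to scope that autonomy — a hypothetical sketch, not the actual implementation — is a per-turn flag that plan approval sets and the end of the execution phase clears:

```python
class TurnPolicy:
    """Per-turn autonomy: plan approval grants exactly one unattended phase."""

    def __init__(self) -> None:
        self.auto_approve = False

    def approve_plan(self) -> None:
        # Destructive calls skip confirmation for this execution phase...
        self.auto_approve = True

    def end_turn(self) -> None:
        # ...and the next user turn starts back at confirm-on-destructive.
        self.auto_approve = False
```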


Permissions

Three modes, set per-session:

Mode                       Reads   Workspace writes         Shell   Network
read-only                  yes     no                       no      no
workspace-write (default)  yes     confirm on destructive   yes     no
danger-full-access         yes     yes                      yes     yes

Switch mid-session with /permissions read-only.


Sandbox

On Linux with bubblewrap installed, every bash call is wrapped with bwrap --unshare-net --bind <workspace> / --ro-bind /usr /usr … — reads are confined to the workspace, writes can't escape it, and the network is off by default. Set sandbox_allow_network = true in .tinillmrc to re-enable network access.
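
In spirit, the wrapping is an argv builder like the one below — an illustrative sketch only (a real invocation needs additional binds such as /bin and /lib symlinks, and the exact flags are taken from the description above, not from the source):

```python
def bwrap_argv(cmd: str, workspace: str, allow_network: bool = False) -> list[str]:
    # /usr read-only, the workspace writable, and the network
    # namespace unshared unless explicitly allowed.
    argv = ["bwrap", "--ro-bind", "/usr", "/usr",
            "--bind", workspace, workspace, "--chdir", workspace]
    if not allow_network:
        argv.append("--unshare-net")
    return argv + ["bash", "-c", cmd]
```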

macOS / Windows: the sandbox is a no-op for now; the permission enforcer and workspace resolution still apply.

Check status with /doctor or the welcome panel.


Configuration

Per-project: .tinillmrc (TOML) in the repo root:

model = "qwen2.5-coder:14b"
permission_mode = "workspace-write"
auto_accept_plan = true
sandbox_enabled = true
sandbox_allow_network = false
allowed_tools = ["read_file", "write_file", "edit_file", "bash"]
denied_tools = []

Per-user: ~/.tinillm/settings.json (written by /load).


GPU support

Vendor Detection method
NVIDIA nvidia-smi → sysfs fallback
AMD rocm-smi → sysfs fallback
Apple Silicon system_profiler (unified memory)
Intel Arc sysfs + lspci
Windows (all) PowerShell WMI
Any vulkaninfo last-resort fallback

Fit levels

Level Meaning
Perfect Fits comfortably at Q4_K_M or better with ≥20% headroom
Good Fits but tightly
Marginal Only at heavy compression / reduced context, or CPU-only
TooTight Won't fit under any quantisation
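
The ≥20% headroom rule suggests a classification like the following — a toy sketch in which the lower-tier thresholds are made up for illustration; the real scanner also weighs quantisation choices and context-window memory:

```python
def fit_level(model_gb: float, vram_gb: float) -> str:
    if model_gb <= vram_gb * 0.8:    # at least 20% headroom
        return "Perfect"
    if model_gb <= vram_gb:          # fits, but tightly
        return "Good"
    if model_gb <= vram_gb * 1.5:    # heavy compression / CPU-offload territory
        return "Marginal"
    return "TooTight"
```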

Versioning

Version Feature
2.4 Unified REPL · plan autonomy · sandbox surfacing
2.3 Plan mode · auto-accept · session JSONL
2.2 Agent tool calling · three-tier permissions
2.1 Linux bubblewrap sandbox
1.9 Hardware scanning + model runner
1.8 Interactive REPL, slash commands
1.1 First release — hardware scanner

Part of the tini* family

Tool What it does
tiniRAG Privacy-first RAG CLI
tinillm Local LLM agent + hardware scanner
