
Hardware LLM capability scanner — know what runs on your machine

Project description

tinillm

A local-first LLM agent and hardware scanner — chat with your hardware, let it write code.

pipx install tinillm
tinillm

What it does

tinillm is an interactive agent that runs on your local Ollama models. Type anything at the prompt and it goes to the agent; type a /command and it runs that command.

  • Chat with tools. The agent can read files, edit code, run bash, and search the web — all gated by a three-tier permission model (read-only / workspace-write / danger-full-access).
  • Plan mode. /plan puts the model in a read-only planning pass. Approve the plan and it executes autonomously — no per-command confirmations.
  • Hardware-aware. /scan tells you which LLMs fit your machine before you download 30 GB of weights. /models browses real Ollama models with fit analysis.
  • Linux sandbox. When bubblewrap is installed, bash calls run in a namespace jail rooted at your workspace.
  • Session persistence. Every conversation is written to a JSONL log you can /chat --resume.
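
Because session logs are plain JSONL, they are easy to inspect outside the tool. A minimal sketch of reading one back, assuming one JSON object per line with "role"/"content" keys (the real tinillm schema may carry more fields):

```python
import json

def load_turns(jsonl_text):
    """Parse a JSONL session log into a list of turn dicts.

    Assumes one JSON object per line with "role"/"content" keys;
    the actual log format may include extra fields.
    """
    turns = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            turns.append(json.loads(line))
    return turns
```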

Install

pipx install tinillm     # recommended: isolated per-tool environment
# or
pip install tinillm

Requires Python 3.11+ and a running Ollama server for the agent. Hardware scanning works without Ollama.

Optional on Linux:

apt install bubblewrap   # enables workspace-rooted bash sandbox

The unified prompt

Everything happens at one prompt. There's no separate "chat mode."

│ refactor agent_chat.py to extract a ChatSession dataclass
⚙  read_file(path='tinillm/run/agent_chat.py')
⚙  write_file(path='tinillm/run/chat_session.py', …)
⚙  edit_file(path='tinillm/run/agent_chat.py', …)
done — extracted ChatSession with create/handle_user_turn/handle_slash.

│ /scan
  LLM Capability Matrix
  ~1B    Perfect   Q8_0     1.9 GB   580 t/s
  ~7B    Perfect   Q6_K     6.2 GB    88 t/s
  ~13B   Perfect   Q5_K_M  10.1 GB    47 t/s

Free text → the agent. /scan, /models, /doctor → Click commands. In-chat commands like /plan, /cost, /rewind, /diff, /tasks operate on the live agent session.
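
That dispatch rule can be pictured as a tiny router. A sketch only — the exact command sets here are assumptions for illustration:

```python
CLICK_COMMANDS = {"scan", "models", "doctor", "run", "suggest"}  # assumed set

def route(line):
    """Classify one prompt line: free text goes to the agent, known
    Click commands to the CLI layer, any other slash command to the
    live session handler."""
    line = line.strip()
    if not line.startswith("/"):
        return ("agent", line)
    name = line[1:].split()[0] if line[1:] else ""
    return ("click" if name in CLICK_COMMANDS else "session", name)
```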


Commands

Agent

Command              What it does
(type anything)      Send a turn to the agent
/plan                Enter plan mode (read-only); produce a numbered plan that auto-executes on approval
/plan-off            Leave plan mode without executing
/compact             Summarize older turns to free context
/summary             Print a summary without touching history
/rewind [N]          Remove the last N turns
/diff                Show files changed this session
/cost                Token + time usage so far
/tasks               Current agent TODO list
/load <model>        Swap the live model mid-session
/permissions [mode]  View or set permission mode
/chat [--resume ID]  Re-initialise the agent (or resume a session)

Hardware + models

Command                     What it does
/scan                       Scan hardware and show which LLM sizes fit
/scan --verbose             Include sizes that don't fit
/scan --json                Machine-readable output
/models                     Browse real Ollama models with fit analysis
/models --fits-only         Hide models that don't fit
/models --ollama            Show which models are locally installed
/run [tag]                  Launch a model directly in Ollama
/suggest --use-case coding  Personalised recommendation

System

Command                             What it does
/doctor                             Health check (hardware, Ollama, sandbox)
/memory show · /remember · /forget  Persistent per-project memory
/sessions                           List resumable chat sessions
/skills                             List installed agent skills
/init                               Scaffold a TINILLM.md for this repo
/help                               List every command
/clear                              Clear the terminal
/exit                               Quit (Ctrl+D also works)

Tab-completion works on every slash command, subcommand, and flag.


Plan mode

│ /plan
plan mode ON — read-only. Describe your goal.

│ add a --json flag to the scan command
Plan:
1. read render/scan.py to find the current output path
2. add --json flag in commands.py scan_cmd
3. route to a JSON renderer, gated on the flag
4. add a test in tests/test_scan_cmd.py

plan auto-accepted — executing in workspace-write mode
⚙  read_file(path='tinillm/render/scan.py')
⚙  edit_file(path='tinillm/commands.py', …)
⚙  write_file(path='tinillm/render/scan_json.py', …)
done.

Once the plan is approved, destructive bash and write-tool calls run without confirmation prompts. Autonomy is scoped to the one execution phase — the next user turn gets a fresh confirm-on-destructive policy.
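
That scoping amounts to a small state machine. A sketch, with phase names assumed:

```python
from enum import Enum, auto

class Phase(Enum):
    NORMAL = auto()      # regular turns: confirm on destructive calls
    PLANNING = auto()    # /plan active: read-only, nothing destructive runs
    EXECUTING = auto()   # approved plan running: no confirmation prompts

def needs_confirm(phase, destructive):
    """Destructive calls skip confirmation only while an approved plan
    is executing; the next user turn resets the phase to NORMAL."""
    return destructive and phase is not Phase.EXECUTING
```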


Permissions

Three modes, set per-session:

Mode                       Reads  Workspace writes  Shell                   Network
read-only                  yes    no                no                      no
workspace-write (default)  yes    yes               confirm on destructive  no
danger-full-access         yes    yes               yes                     yes

Switch mid-session with /permissions read-only.
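
In code form, the three modes reduce to one small decision function. A sketch only — the action names and the exact rule table are assumptions inferred from the mode names:

```python
def check(mode, action):
    """Return "allow", "confirm", or "deny" for an action under a mode.

    Actions (assumed): "read", "write", "shell", "shell-destructive",
    "network".
    """
    if action == "read" or mode == "danger-full-access":
        return "allow"
    if mode == "read-only":
        return "deny"
    # workspace-write: writes and benign shell allowed,
    # destructive shell requires confirmation, network stays off
    if action == "shell-destructive":
        return "confirm"
    return "allow" if action in ("write", "shell") else "deny"
```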


Sandbox

On Linux with bubblewrap installed, every bash call is wrapped with bwrap --unshare-net --bind <workspace> / --ro-bind /usr /usr … — reads are confined to the workspace, writes can't escape it, and network is off by default. sandbox_allow_network=true in .tinillmrc opens it.

macOS / Windows: the sandbox is a no-op for now; the permission enforcer and workspace resolution still apply.

Check status with /doctor or the welcome panel.
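
The wrapper can be pictured as argv assembly around the user's command. A sketch using real bubblewrap flags, though the exact flag set tinillm emits may differ:

```python
def bwrap_argv(workspace, cmd, allow_network=False):
    """Build a bubblewrap command line: workspace bound at /, /usr
    mounted read-only, network namespace unshared unless allowed."""
    argv = ["bwrap"]
    if not allow_network:
        argv.append("--unshare-net")
    argv += ["--bind", workspace, "/", "--ro-bind", "/usr", "/usr"]
    argv += ["bash", "-c", cmd]
    return argv
```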


Configuration

Per-project: .tinillmrc (TOML) in the repo root:

model = "qwen2.5-coder:14b"
permission_mode = "workspace-write"
auto_accept_plan = true
sandbox_enabled = true
sandbox_allow_network = false
allowed_tools = ["read_file", "write_file", "edit_file", "bash"]
denied_tools = []

Per-user: ~/.tinillm/settings.json (written by /load).


GPU support

Vendor         Detection method
NVIDIA         nvidia-smi → sysfs fallback
AMD            rocm-smi → sysfs fallback
Apple Silicon  system_profiler (unified memory)
Intel Arc      sysfs + lspci
Windows (all)  PowerShell WMI
Any            vulkaninfo last-resort fallback
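
Each row is a fallback chain: try the vendor tool first, then generic probes. A sketch of that chain using shutil.which; the `which` parameter is injectable so the selection logic can be tested without any of the tools installed (sysfs reads, which are file reads rather than executables, would slot into the chain separately):

```python
import shutil

def pick_probe(candidates, which=shutil.which):
    """Return the first available probe tool from an ordered
    candidate list, or None if none is installed."""
    for tool in candidates:
        if which(tool):
            return tool
    return None
```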

Fit levels

Level     Meaning
Perfect   Fits comfortably at Q4_K_M or better with ≥20% headroom
Good      Fits, but tightly
Marginal  Only at heavy compression / reduced context, or CPU-only
TooTight  Won't fit under any quantisation
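
The levels reduce to threshold checks against the memory budget. A sketch — only the 20% Perfect headroom comes from the table above; the Marginal compression factor is an assumption:

```python
def fit_level(model_gb, budget_gb, headroom=0.20):
    """Classify how a quantised model fits a memory budget (GB)."""
    if model_gb * (1 + headroom) <= budget_gb:
        return "Perfect"            # fits with >=20% headroom
    if model_gb <= budget_gb:
        return "Good"               # fits, but tightly
    if model_gb * 0.75 <= budget_gb:  # assumed: heavier quant could squeeze in
        return "Marginal"
    return "TooTight"
```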

Versioning

Version  Feature
2.4      Unified REPL · plan autonomy · sandbox surfacing  ← current
2.3      Plan mode · auto-accept · session JSONL
2.2      Agent tool calling · three-tier permissions
2.1      Linux bubblewrap sandbox
1.9      Hardware scanning + model runner
1.8      Interactive REPL, slash commands
1.1      First release — hardware scanner

Part of the tini* family

Tool     What it does
tiniRAG  Privacy-first RAG CLI
tinillm  Local LLM agent + hardware scanner


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tinillm-2.6.0.tar.gz (131.0 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tinillm-2.6.0-py3-none-any.whl (117.8 kB)

Uploaded Python 3

File details

Details for the file tinillm-2.6.0.tar.gz.

File metadata

  • Download URL: tinillm-2.6.0.tar.gz
  • Upload date:
  • Size: 131.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for tinillm-2.6.0.tar.gz
Algorithm Hash digest
SHA256 0dd5e63cef29098939e59e3dff65c589c89d6eb50a010dd2dcff8282e87f44b8
MD5 7ab8542295769cb9d6807e0bfeb957c8
BLAKE2b-256 a59bd4673e4bf38077d36baba7a587c051cbcaec0cc4ea3919fc030068b42bd3

See more details on using hashes here.

File details

Details for the file tinillm-2.6.0-py3-none-any.whl.

File metadata

  • Download URL: tinillm-2.6.0-py3-none-any.whl
  • Upload date:
  • Size: 117.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for tinillm-2.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 bde513006e4072b8cbc18bad828e2d18b0680e6007a1abc081db576bd500e523
MD5 abe60385976780bd1256e1f8cc7db5d6
BLAKE2b-256 cbe35f152d464febc8e1bebeac193624598af8df919ced6690fe4eb82fd76884

See more details on using hashes here.
