
Scan your hardware and find compatible Ollama LLMs


ollama-scout


One command to find the right LLMs for your hardware.

ollama-scout scans your GPU VRAM, CPU, and RAM, then recommends compatible Ollama models grouped by use case. New users get an interactive guided mode; power users get full CLI flags.
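To give a feel for what a GPU VRAM scan involves, here is a minimal sketch that reads NVIDIA GPUs through nvidia-smi. It is not ollama-scout's actual detection code: the function names parse_nvidia_smi and detect_nvidia_gpus are hypothetical, and only the NVIDIA path is covered (no AMD ROCm or Apple Silicon).

```python
import shutil
import subprocess


def parse_nvidia_smi(csv_text: str) -> list[tuple[str, float]]:
    """Parse 'name, memory.total' CSV rows (memory in MiB) into (name, VRAM in GiB)."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so GPU names containing commas stay intact.
        name, _, mem_mib = line.rpartition(",")
        gpus.append((name.strip(), round(float(mem_mib) / 1024, 1)))
    return gpus


def detect_nvidia_gpus() -> list[tuple[str, float]]:
    """Return detected NVIDIA GPUs, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_nvidia_smi(result.stdout) if result.returncode == 0 else []
```

Keeping the parser separate from the subprocess call means it can be exercised on sample text without a GPU present.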


Demo

$ ollama-scout

  ollama-scout v0.2.0  |  LLM Hardware Advisor  (Interactive Mode)

  Welcome! Let's find the best LLMs for your hardware.
  Press Enter to scan your system, or Ctrl+C to exit.

                   System Hardware
  ┌─────────────────┬────────────────────────────────────────┐
  │ OS              │ Linux                                  │
  │ CPU             │ AMD Ryzen 9 5900X                      │
  │ Cores / Threads │ 12 cores / 24 threads                  │
  │ RAM             │ 32.0 GB                                │
  │ GPU             │ NVIDIA RTX 3080 (10.0 GB VRAM)         │
  └─────────────────┴────────────────────────────────────────┘

  What are you mainly using this for?
    1. All categories    2. Coding    3. Reasoning    4. Chat

  Coding Models
  Model             Tag    Quant    Size    Fit        Mode     Status
  deepseek-coder    6.7b   Q4_K_M   3.8GB   Excellent  GPU      Available
  codellama         7b     Q4_K_M   3.8GB   Excellent  GPU      Pulled

  Would you like to compare two models? [y/N]:
  Save results as a Markdown report? [y/N]:

Quick Start

# Recommended
pipx install ollama-scout

# Or with pip
pip install ollama-scout

# Or from source
git clone https://github.com/sandy-sp/ollama-scout.git
cd ollama-scout && pip install -e .

Then run:

ollama-scout          # interactive guided mode — no flags needed

Requires Python 3.10+ and Ollama installed.
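If you want to verify both requirements before installing, a small check along these lines may help. This is an illustrative sketch, not part of ollama-scout; check_prerequisites is a hypothetical helper, and it only assumes the Ollama executable is named ollama and is on your PATH.

```python
import shutil
import sys


def check_prerequisites(min_python: tuple[int, int] = (3, 10)) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_python:
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    if shutil.which("ollama") is None:
        problems.append("ollama executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(f"Missing prerequisite: {problem}")
```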


Features

| Feature | Description |
| --- | --- |
| Interactive guided mode | Step-by-step session with no flags needed (ollama-scout or -i) |
| Hardware detection | NVIDIA, AMD ROCm, Apple Silicon unified memory, multi-GPU |
| Live + offline models | Fetches from Ollama API with 24hr cache; --offline uses built-in list |
| Smart scoring | GPU > Multi-GPU > CPU+GPU offload > CPU-only, with time estimates |
| Use-case grouping | Coding, Reasoning, Chat with per-category tables |
| Real benchmark timing | Measures actual tokens/sec on pulled models via ollama run |
| Model comparison | --compare model1 model2 for side-by-side analysis with verdict |
| Model detail view | --model NAME shows all variants scored against your hardware |
| Markdown export | --export saves a formatted report |
| Persistent config | XDG-compliant paths with --config and --config-set |
| Auto-pull | Pull recommended models interactively or via --pull |
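The benchmark feature reports tokens per second. ollama-scout measures this via ollama run; the sketch below swaps in a different technique, Ollama's local HTTP API, whose non-streaming /api/generate response includes eval_count (generated tokens) and eval_duration (nanoseconds). The function names, default prompt, and default host are assumptions for illustration.

```python
import json
import urllib.request


def tokens_per_second(response: dict) -> float:
    """Compute generation speed from an Ollama /api/generate response."""
    duration_ns = response.get("eval_duration", 0)
    if duration_ns <= 0:
        return 0.0
    # eval_duration is reported in nanoseconds.
    return response.get("eval_count", 0) * 1e9 / duration_ns


def benchmark(model: str, prompt: str = "Say hello.",
              host: str = "http://localhost:11434") -> float:
    """Run one non-streaming generation against a local Ollama server and time it."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as reply:
        return tokens_per_second(json.load(reply))
```

A single short prompt gives only a rough figure; averaging several runs would smooth out model load time and warm-up effects.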

CLI Reference

| Flag | Description | Example |
| --- | --- | --- |
| -i, --interactive | Launch guided mode (default with no args) | ollama-scout -i |
| --use-case | Filter by category | --use-case coding |
| --flat | Flat list instead of grouped tables | --flat |
| --top N | Limit number of results | --top 20 |
| --offline | Use built-in fallback model list | --offline |
| --benchmark | Show inference speed estimates | --benchmark |
| --model NAME | Detail view for a specific model | --model deepseek-coder |
| --compare M1 M2 | Side-by-side model comparison | --compare llama3.2 mistral |
| --export | Auto-export Markdown report | --export |
| --output PATH | Export to a specific file | --output ~/report.md |
| --pull MODEL | Pull a model via ollama | --pull llama3.2:3b |
| --update-models | Force-refresh model list cache | --update-models |
| --config | Show current configuration | --config |
| --config-set K=V | Set a config value | --config-set offline_mode=true |
| --no-pull-prompt | Skip interactive pull prompt | --no-pull-prompt |
| --version | Show version | --version |

Run ollama-scout --help for the full list. See docs/USAGE.md for detailed examples.


How Scoring Works

| Fit | Mode | Meaning |
| --- | --- | --- |
| Excellent | GPU | Model fits fully in single GPU VRAM |
| Excellent | Multi-GPU | Model distributed across multiple GPUs |
| Good | CPU+GPU | Partially offloaded to RAM; usable but slower |
| Possible | CPU | CPU-only inference with time estimate |
| (excluded) | | Model too large for available memory |

Apple Silicon: unified memory is treated as VRAM, so M1/M2/M3/M4 Macs get GPU-tier scoring with a 4GB system reserve.
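As a worked example of the Apple Silicon rule, the sketch below assumes the 4GB reserve is simply subtracted from total unified memory to get the VRAM figure used for scoring. The function name and the clamp at zero are illustrative assumptions.

```python
APPLE_SYSTEM_RESERVE_GB = 4.0  # system reserve stated above


def apple_effective_vram_gb(unified_memory_gb: float,
                            reserve_gb: float = APPLE_SYSTEM_RESERVE_GB) -> float:
    """Treat unified memory as VRAM, minus the system reserve (never below zero)."""
    return max(unified_memory_gb - reserve_gb, 0.0)


# Under this assumption, a 16 GB Mac would be scored as if it had 12 GB of VRAM.
print(apple_effective_vram_gb(16.0))
```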


Roadmap

  • Interactive guided mode
  • Real benchmark timing (ollama run)
  • Model comparison mode (--compare)
  • Multi-GPU support
  • XDG-compliant config paths
  • Model list caching (24hr TTL)
  • pip installable on PyPI
  • Live streaming benchmarks with progress bar
  • Model search and filter by keyword
  • Web UI version
  • Config profiles (work / gaming / minimal)

Contributing

Contributions welcome! See CONTRIBUTING.md for setup instructions and guidelines.

License

MIT — see LICENSE.


Generated by ollama-scout
