Minimum Viable Language Model — find the smallest LLM that works for your task

Quickly find the smollest viable language model for your task, for faster and cheaper intelligence

The basic idea is to replay your OpenAI/Anthropic API queries against other, smaller models on the Hugging Face Inference API (or on a local server), so you can quickly find the smollest/cheapest/fastest model that works for your use case.

smollest dashboard screenshot

Install

pip install "smollest[openai]"       # for OpenAI
pip install "smollest[anthropic]"    # for Anthropic
pip install "smollest[all]"          # both (quotes keep shells like zsh from expanding the brackets)

Usage

Import openai from smollest and then write your code as normal!

from smollest import openai

client = openai.OpenAI(
    api_key="sk-...",
    project="my-classifier",  # organizes results by project
)

# By default, replays to 3 models of different sizes on HF Inference API
result = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Classify as positive/negative: I love this!"}],
)

Override candidates per-client or per-call:

# Per-client
client = openai.OpenAI(
    candidates=["mistralai/Mistral-7B-Instruct-v0.3", "http://localhost:1234/v1"],
)

# Per-call
result = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    candidates=["microsoft/Phi-3.5-mini-instruct"],
)

Works the same way with Anthropic:

from smollest import anthropic

client = anthropic.Anthropic(project="my-classifier")
result = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Classify: I love this!"}],
)

How it works

  1. Your API call goes to the baseline model as normal
  2. The same prompt is replayed to each candidate (Hugging Face serverless inference or a local OpenAI-compatible server)
  3. Structured outputs (JSON) are compared field-by-field via exact match
  4. Results are printed to console and logged to ~/.smollest/
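Step 3 can be pictured with a minimal sketch like the one below. This is not smollest's actual implementation: the function name compare_fields is hypothetical, and the sketch assumes the baseline output is a JSON object and that only top-level fields are compared.

```python
import json


def compare_fields(baseline_text, candidate_text):
    """Exact-match comparison of top-level JSON fields (hypothetical helper).

    Returns {field_name: True/False} for every field in the baseline output.
    Assumes the baseline text parses to a JSON object.
    """
    baseline = json.loads(baseline_text)
    try:
        candidate = json.loads(candidate_text)
    except json.JSONDecodeError:
        candidate = {}  # unparseable candidate output matches nothing
    if not isinstance(candidate, dict):
        candidate = {}  # e.g. the candidate returned a bare string or list
    return {
        key: key in candidate and candidate[key] == value
        for key, value in baseline.items()
    }


matches = compare_fields(
    '{"label": "positive", "score": 1}',
    '{"label": "positive", "score": 2}',
)
print(matches)
```

Under exact match, a candidate that returns the right label but a different score counts as a match on one field and a miss on the other, rather than a miss on the whole response.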

Remote candidates run in parallel; local candidates run sequentially.
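One way to picture this scheduling is the sketch below, which runs remote candidates in a thread pool and local ones in a plain loop. It is an illustration, not the library's code: run_candidates, is_local, and the call argument are hypothetical names, and treating any candidate that looks like a URL as local is an assumption based on the "http://localhost:1234/v1" example above.

```python
from concurrent.futures import ThreadPoolExecutor


def is_local(candidate):
    # Assumption: local candidates are given as URLs, remote ones as model IDs.
    return candidate.startswith(("http://", "https://"))


def run_candidates(candidates, call):
    """Run call(candidate) for every candidate and return {candidate: output}.

    Remote candidates are dispatched concurrently; local candidates run one
    at a time so a single local server is not hit by simultaneous requests.
    """
    remote = [c for c in candidates if not is_local(c)]
    local = [c for c in candidates if is_local(c)]
    results = {}
    if remote:  # ThreadPoolExecutor requires max_workers > 0
        with ThreadPoolExecutor(max_workers=len(remote)) as pool:
            for candidate, output in zip(remote, pool.map(call, remote)):
                results[candidate] = output
    for candidate in local:
        results[candidate] = call(candidate)
    return results


# Stand-in for a real model call, so the sketch runs without network access.
outputs = run_candidates(
    ["mistralai/Mistral-7B-Instruct-v0.3", "http://localhost:1234/v1"],
    call=lambda candidate: f"reply from {candidate}",
)
print(outputs)
```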

Dashboard

smollest show

Opens a web dashboard with projects in the sidebar, a results table (long outputs are truncated), latency and cost per model, and aggregate match rates. The image above shows the UI. To reproduce it, clone this repo and run: python examples/demo_dashboard.py
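As a rough illustration of what an aggregate match rate means, the sketch below computes the fraction of calls on which each model matched the baseline. The record shape, a list of (model, matched) pairs, is a hypothetical simplification and not the format smollest writes to ~/.smollest/.

```python
from collections import defaultdict


def aggregate_match_rates(records):
    """Return {model: fraction of calls that matched the baseline}.

    records is an iterable of (model, matched) pairs (hypothetical shape).
    """
    totals = defaultdict(lambda: [0, 0])  # model -> [matches, calls]
    for model, matched in records:
        totals[model][0] += int(matched)
        totals[model][1] += 1
    return {model: matches / calls for model, (matches, calls) in totals.items()}


rates = aggregate_match_rates([
    ("microsoft/Phi-3.5-mini-instruct", True),
    ("microsoft/Phi-3.5-mini-instruct", False),
    ("mistralai/Mistral-7B-Instruct-v0.3", True),
])
print(rates)
```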

Roadmap

  • Allow adding additional models directly through the UI
  • Add an LLM-as-a-judge option to score outputs that are not structured
  • Let developers easily fine-tune models on outputs

License

MIT
