A thin, unified LLM abstraction layer. Call any LLM with a single API.

These details have not been verified by PyPI

Project links

Project description

anyllm

Local-first LLM abstraction — one API for Ollama, llama.cpp, OpenAI, Anthropic, and HuggingFace.

PyPI Python License

anyllm is a lightweight abstraction layer over the most popular LLM providers. Unlike heavier alternatives, it is local-first: if Ollama is running on your machine, anyllm.chat("hello") just works — no API keys, no cloud. It also supports llama.cpp, OpenAI, Anthropic, and HuggingFace Transformers behind the same tiny API, with first-class support for tool/function calling, streaming, structured JSON outputs, multi-modal inputs, embeddings, and conversation memory.

Built by Viet-Anh Nguyen at NRL.ai.

Why anyllm?

One-liner API — anyllm.chat("Hello") auto-detects your best local provider
Plugin architecture — Add custom providers via @register_provider
Local-first — Defaults to Ollama if available, no API key required
Minimal core deps — Only httpx and pydantic; every provider is optional
Production-ready — Streaming, async, tool-calling, retries, structured outputs

Installation

pip install anyllm

For optional providers:

pip install anyllm[openai]          # OpenAI GPT-4, GPT-3.5
pip install anyllm[anthropic]       # Claude 3.5 Sonnet / Opus / Haiku
pip install anyllm[llamacpp]        # llama.cpp local quantized models
pip install anyllm[transformers]    # HuggingFace Transformers (local)
pip install anyllm[all]             # everything

Ollama needs no Python package — just have it running at http://localhost:11434.

Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

Quick Start

import anyllm

# 1. Simple chat (auto-selects Ollama if running, else first configured provider)
reply = anyllm.chat("Explain RAG in one sentence.")
print(reply)

# 2. Specify a provider + model explicitly
reply = anyllm.chat(
    "What is the capital of France?",
    provider="ollama",
    model="llama3.1:8b",
)

# 3. Streaming (yields tokens as they are generated)
for chunk in anyllm.stream("Write a haiku about Python"):
    print(chunk, end="", flush=True)

# 4. Structured output (JSON mode — validates against a Pydantic model)
from pydantic import BaseModel
class Recipe(BaseModel):
    name: str
    ingredients: list[str]
    steps: list[str]

recipe = anyllm.chat("Give me a pasta recipe", response_model=Recipe)
print(recipe.name, recipe.ingredients)

Models & Methods

Providers (local-first priority)

Priority	Provider	How it works	Install
1	Ollama	HTTP client to `http://localhost:11434` (default if reachable)	built-in
2	llama.cpp	Loads GGUF models via `llama-cpp-python`	`anyllm[llamacpp]`
3	OpenAI	REST API (`gpt-4o`, `gpt-4o-mini`, `gpt-3.5-turbo`)	`anyllm[openai]`
4	Anthropic	REST API (`claude-3-5-sonnet`, `claude-3-5-haiku`, `claude-3-opus`)	`anyllm[anthropic]`
5	HuggingFace Transformers	Loads any HF causal-LM model locally	`anyllm[transformers]`

Provider priority can be overridden via anyllm.set_priority([...]) or per-call with provider="...".

Features

Tool / function calling — Pass Python functions; parameter schemas are auto-extracted from type hints and docstrings. Dispatches to Ollama tools, OpenAI tools, or Anthropic tool use automatically.
Streaming — Unified token streaming for every provider (yields strings).
Async — anyllm.achat(...), anyllm.astream(...).
Structured outputs — response_model=MyPydanticModel uses native JSON mode on OpenAI/Anthropic/Ollama, falls back to regex extraction + retries elsewhere.
Multi-modal — Pass images via anyllm.chat([..., {"image": "cat.jpg"}], model="gpt-4o").
Embeddings — anyllm.embed("text", model="nomic-embed-text") with Ollama / OpenAI / sentence-transformers.
Conversation memory — Conversation() with sliding-window history and optional disk persistence.
Retries + timeouts — Configurable exponential backoff on transient errors.

API Reference

Function	Purpose
`anyllm.chat(messages, **opts)`	Chat completion -> `str` or `Pydantic` model
`anyllm.stream(messages, **opts)`	Generator yielding token chunks
`anyllm.achat / astream`	Async variants
`anyllm.embed(text, model=...)`	Returns `list[float]` embedding
`anyllm.tools(fns, prompt)`	Tool-calling loop with auto-dispatch
`anyllm.Conversation(system=...)`	Multi-turn memory
`anyllm.list_models(provider=...)`	Enumerate available models
`anyllm.register_provider(name, cls)`	Add a custom provider

CLI Usage

anyllm chat "Summarize this file" --file notes.txt
anyllm chat "Hi" --provider ollama --model llama3.1:8b
anyllm stream "Write a poem"
anyllm embed "hello world" --model nomic-embed-text
anyllm list-models --provider ollama

Examples

Tool calling with auto-extracted schemas

import anyllm

def get_weather(city: str, units: str = "celsius") -> dict:
    """Get the current weather for a city."""
    # ... call a weather API ...
    return {"city": city, "temp": 22, "units": units}

# anyllm inspects the signature + docstring, builds the JSON schema,
# runs the LLM, dispatches the tool call, and returns the final reply.
reply = anyllm.tools([get_weather], "What's the weather in Hanoi?")
print(reply)

Multi-turn conversation with memory

from anyllm import Conversation

conv = Conversation(system="You are a helpful Python tutor.", model="llama3.1:8b")
conv.send("What is a decorator?")
conv.send("Show me an example")          # remembers previous context
conv.save("chat.json")                   # persist to disk

Vision input with a multi-modal model

import anyllm

reply = anyllm.chat(
    [{"text": "What's in this image?"}, {"image": "cat.jpg"}],
    provider="openai",
    model="gpt-4o",
)

License

MIT (c) Viet-Anh Nguyen

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.2.4

Apr 9, 2026

This version

0.2.3

Apr 9, 2026

0.2.2

Apr 9, 2026

0.2.1

Apr 9, 2026

0.2.0

Apr 9, 2026

0.0.26

Jun 11, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

anyllm-0.2.3.tar.gz (40.8 kB view details)

Uploaded Apr 9, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

anyllm-0.2.3-py3-none-any.whl (33.7 kB view details)

Uploaded Apr 9, 2026 Python 3

File details

Details for the file anyllm-0.2.3.tar.gz.

File metadata

Download URL: anyllm-0.2.3.tar.gz
Upload date: Apr 9, 2026
Size: 40.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for anyllm-0.2.3.tar.gz
Algorithm	Hash digest
SHA256	`b104ef9a0a5dced264a72d4db7186d231e44b108aab5b4469dcd69318be8d3db`
MD5	`d4226541076558e74825cc98a006f773`
BLAKE2b-256	`f105d7d7f2a1099ae339e6f96704f69691063c8953691f570219a6b12287168c`

See more details on using hashes here.

File details

Details for the file anyllm-0.2.3-py3-none-any.whl.

File metadata

Download URL: anyllm-0.2.3-py3-none-any.whl
Upload date: Apr 9, 2026
Size: 33.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for anyllm-0.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8b8d49ecc994730328c32ad29db9a46576b27bef6c93c324c7d0a230d7923d81`
MD5	`8b04c0e626b918bdf50eda025aed446d`
BLAKE2b-256	`23d785f6392f38399dec1ffe71f6b1d56e6988454eff3a9d445927ca40a858c0`

See more details on using hashes here.

anyllm 0.2.3

Navigation

Verified details

Maintainers

Meta

Unverified details

Project links

Meta

Classifiers

Project description

anyllm

Why anyllm?

Installation

Quick Start

Models & Methods

Providers (local-first priority)

Features

API Reference

CLI Usage

Examples

Tool calling with auto-extracted schemas

Multi-turn conversation with memory

Vision input with a multi-modal model

License

Project details

Verified details

Maintainers

Meta

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes