
A very simple LLM manager for Python.

Project description

L2M2: A Simple Python LLM Manager 💬👍


L2M2 ("LLM Manager" → "LLMM" → "L2M2") is a tiny and very simple LLM manager for Python that exposes lots of models through a unified API.

Advantages

  • Simple: Completely unified interface – just swap out the model name.
  • Tiny: Only one external dependency (aiohttp). No BS dependency graph.
  • Private: Compatible with self-hosted models on your own infrastructure.
  • Fast: Fully asynchronous and non-blocking if concurrent calls are needed.

Features

  • 70+ regularly updated supported models from popular hosted providers.
  • Support for self-hosted models via Ollama.
  • Manageable chat memory – even across multiple models or with concurrent memory streams (a rough sketch follows this list).
  • JSON mode.
  • Prompt loading tools.
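
As a rough illustration of the memory feature, here is a hypothetical sketch – the ChatMemory class and the memory= argument below are assumptions for illustration only, not confirmed API; see the usage guide for the actual interface.

from l2m2.client import LLMClient
from l2m2.memory import ChatMemory  # hypothetical import path

# Hypothetical: attach a chat memory object so context carries across calls.
client = LLMClient(memory=ChatMemory())

client.call(model="gpt-5", prompt="My name is Pierce.")
response = client.call(model="gpt-5", prompt="What's my name?")
print(response)  # with memory enabled, the model can draw on the earlier exchange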

Supported API-based Models

L2M2 supports 71 models from OpenAI, Google, Anthropic, Cohere, Mistral, Groq, Replicate, Cerebras, and Moonshot AI. The full list of supported models can be found here.

Usage (Full Docs)

Requirements

  • Python >= 3.10
  • At least one valid API key for a supported provider, or a working Ollama installation (their docs).

Installation

pip install l2m2

Environment Setup

If you plan to use an API-based model, make sure at least one of the following environment variables is set so that L2M2 can automatically activate the corresponding provider.

Provider                   Environment Variable
OpenAI                     OPENAI_API_KEY
Anthropic                  ANTHROPIC_API_KEY
Cohere                     CO_API_KEY
Google                     GOOGLE_API_KEY
Groq                       GROQ_API_KEY
Replicate                  REPLICATE_API_TOKEN
Mistral (La Plateforme)    MISTRAL_API_KEY
Cerebras                   CEREBRAS_API_KEY
Moonshot AI                MOONSHOT_API_KEY

Otherwise, ensure Ollama is running – by default L2M2 looks for it at http://localhost:11434, but this can be configured.
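
For example, if a key isn't already exported in your shell, one way to provide it from Python is to set it before constructing the client (this assumes L2M2 reads the variable when the client is created; the key value below is a placeholder):

import os

os.environ["OPENAI_API_KEY"] = "sk-placeholder"  # placeholder value, not a real key

from l2m2.client import LLMClient

client = LLMClient()  # the OpenAI provider should now be activated automatically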

Basic Usage

from l2m2.client import LLMClient

client = LLMClient()

response = client.call(model="gpt-5", prompt="Hello world")
print(response)
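
Since the interface is unified, switching providers is just a matter of swapping the model string – for example (the model name below is illustrative; check the supported model list for exact names):

response = client.call(model="claude-3-5-haiku", prompt="Hello world")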

For the full usage guide, including memory, asynchronous usage, local models, JSON mode, and more, see Usage Guide.
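
As a taste of the asynchronous side, here is a sketch only – AsyncLLMClient and its call signature are assumed here to mirror the synchronous client; confirm against the usage guide:

import asyncio

from l2m2.client import AsyncLLMClient  # assumed async counterpart of LLMClient

async def main():
    client = AsyncLLMClient()
    # Run two calls concurrently instead of sequentially.
    responses = await asyncio.gather(
        client.call(model="gpt-5", prompt="Hello from task 1"),
        client.call(model="gpt-5", prompt="Hello from task 2"),
    )
    for r in responses:
        print(r)

asyncio.run(main())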

Planned Features

  • Streaming responses
  • Support for AWS Bedrock, Azure OpenAI, and Google Vertex APIs.
  • Support for structured outputs where available (OpenAI, Google, Cohere, Groq, Mistral, Cerebras)
  • Response format customization, e.g., returning token usage, cost, etc.
  • Support for other self-hosted providers (vLLM and GPT4All) beyond Ollama
  • Support for batch APIs where available (OpenAI, Anthropic, Google, Groq, Mistral)
  • Support for embeddings as well as inference
  • Port this project over to TypeScript
  • ...etc.

Contributing

Contributions are welcome! Please follow the contribution guide below.

  • Requirements
    • Python versions 3.10 through 3.14
    • uv >= 0.9.2
    • GNU Make
  • Setup
    • Clone this repository and create a Python virtual environment.
    • Install dependencies: make init.
    • Create a feature branch and an issue with a description of the feature or bug fix.
  • Develop
    • Run lint, typecheck and tests: make (make lint, make type, and make test can also be run individually).
    • Generate test coverage: make coverage.
    • If you've updated the supported models, run make update-docs to reflect those changes in the README.
    • Make sure to run make tox regularly to backtest your changes back to Python 3.10 (you'll need all versions of Python from 3.10 through 3.14 installed to do this locally; if you don't, this project's CI will still backtest on all of these versions once you push your changes).
  • Integration Test
    • Create a .env file at the project root with your API keys for all of the supported providers (OPENAI_API_KEY, etc.) – a sample is sketched after this section.
    • Integration test your local changes by running make itl ("integration test local").
    • Once your changes are ready to build, run make build (make sure you uninstall any existing distributions).
    • Run the integration tests against the distribution with make itest.
  • Contribute
    • Create a PR and ping me for a review.
    • Merge!
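
For the integration-test step above, a sample .env might look like the following (the variable names are the documented ones from the Environment Setup table; all values are placeholders):

OPENAI_API_KEY=placeholder
ANTHROPIC_API_KEY=placeholder
CO_API_KEY=placeholder
GOOGLE_API_KEY=placeholder
GROQ_API_KEY=placeholder
REPLICATE_API_TOKEN=placeholder
MISTRAL_API_KEY=placeholder
CEREBRAS_API_KEY=placeholder
MOONSHOT_API_KEY=placeholder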

Contact

If you have requests, suggestions, or any other questions about l2m2, please shoot me a note at pierce@kelaita.com, open an issue on GitHub, or DM me on Slack.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

l2m2-0.0.62.tar.gz (23.3 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

l2m2-0.0.62-py3-none-any.whl (25.0 kB)

Uploaded Python 3

File details

Details for the file l2m2-0.0.62.tar.gz.

File metadata

  • Download URL: l2m2-0.0.62.tar.gz
  • Upload date:
  • Size: 23.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for l2m2-0.0.62.tar.gz
Algorithm     Hash digest
SHA256        4ff8b1f3cfa26f038a3053b85caf01bd3e59d5de00030183d355a6270027de99
MD5           29d22415b74847852934e3049e3625af
BLAKE2b-256   7d3489b08da49efb501000944d71f49e5fee7ea31e6423f1b96bc9e397420194

See more details on using hashes here.
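
For instance, one way to check a downloaded archive against the SHA256 digest above, using only the standard library (the local file path is assumed to be the downloaded sdist):

import hashlib

# Expected SHA256 for l2m2-0.0.62.tar.gz, copied from the table above.
expected = "4ff8b1f3cfa26f038a3053b85caf01bd3e59d5de00030183d355a6270027de99"

with open("l2m2-0.0.62.tar.gz", "rb") as f:
    actual = hashlib.sha256(f.read()).hexdigest()

print("OK" if actual == expected else "Hash mismatch!")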

File details

Details for the file l2m2-0.0.62-py3-none-any.whl.

File metadata

  • Download URL: l2m2-0.0.62-py3-none-any.whl
  • Upload date:
  • Size: 25.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for l2m2-0.0.62-py3-none-any.whl
Algorithm     Hash digest
SHA256        0bb809eb05e6829850057cf78da01aa935c8b91c9b693998a02ed2b386754a8c
MD5           6ade8d3a399b29684da76001aa7c2bd0
BLAKE2b-256   2a4a014ba6fa0e99b90d676a328d250f0b1a85bb4a1244a17131fe9f80392b6b

See more details on using hashes here.
