Easiest way to access any AI model with a single subscription.
Project description
⚡ LitAI
Easiest way to access any AI model with a single subscription using Python.
Every AI model is better at some tasks than others, and we have to switch between them. This requires subscriptions to multiple LLM providers and is costly. LitAI lets you use any LLM provider (both proprietary and open-source) under a single subscription.
Easily switch between any AI model, save costs, and track usage through a unified dashboard.
✅ Access any AI model ✅ Usage dashboard ✅ Single subscription ✅ Bring your own model ✅ Easily switch across LLMs ✅ 20+ public models ✅ Track LLM token usage ✅ Easy setup ✅ No MLOps glue code
Lightning AI • Docs • Quick start
Quick Start
Install LitAI via pip (more options):
pip install litai
Run on a Studio
When running inside Lightning Studio, you can use any available LLM out of the box — no extra setup required.
from litai import LLM
llm = LLM(model="openai/gpt-4")
print(llm.chat("who are you?"))
# I'm an AI by OpenAI
Run locally (outside Studio)
To use LitAI outside of Lightning Studio, you'll need to explicitly provide your teamspace name.
The teamspace input format is: "owner-name/teamspace-name" (e.g. "username/my-team" or "org-name/team-name")
from litai import LLM
llm = LLM(model="openai/gpt-4", teamspace="owner-name/teamspace-name")
print(llm.chat("who are you?"))
# I'm an AI by OpenAI
Key benefits
A few key benefits:
- Supports 20+ public models
- Bring your own model
- Keeps chat logs
- Optional guardrails
- Usage dashboard
Features
✅ Concurrency with async
✅ Fallback and retry
✅ Switch models
✅ Multi-turn conversation logs
✅ Streaming
Advanced features
Concurrency with async
LitAI supports asynchronous execution, allowing you to handle multiple requests concurrently without blocking. This is especially useful in high-throughput applications like chatbots, APIs, or agent loops.
To enable async behavior, set enable_async=True when initializing the LLM class. Then use await llm.chat(...) inside an async function.
import asyncio
from litai import LLM
async def main():
llm = LLM(model="openai/gpt-4", teamspace="lightning-ai/litai", enable_async=True)
print(await llm.chat("who are you?"))
if __name__ == "__main__":
asyncio.run(main())
Streaming
Stream the model response as it's being generated.
from litai import LLM
llm = LLM(model="openai/gpt-4")
for chunk in llm.chat("hello", stream=True):
print(chunk, end="", flush=True)
Conversations
Keep chat history across multiple turns so the model remembers context. This is useful for assistants, summarizers, or research tools that need multi-turn chat history.
Each conversation is identified by a unique name. LitAI stores conversation history separately for each name.
from litai import LLM
llm = LLM(model="openai/gpt-4")
# Continue a conversation across multiple turns
llm.chat("What is Lightning AI?", conversation="intro")
llm.chat("What can it do?", conversation="intro")
print(llm.get_history("intro")) # View all messages from the 'intro' thread
llm.reset_conversation("intro") # Clear conversation history
Create multiple named conversations for different tasks.
from litai import LLM
llm = LLM(model="openai/gpt-4")
llm.chat("Summarize this text", conversation="summarizer")
llm.chat("What's a RAG pipeline?", conversation="research")
print(llm.list_conversations())
Switch models
Use the best model for each task. LitAI lets us dynamically switch models at request time.
We set a default model when initializing LLM and override it with the model parameter only when needed.
from litai import LLM
llm = LLM(model="openai/gpt-4")
# Uses the default model (openai/gpt-4)
print(llm.chat("Who created you?"))
# >> I am a large language model, trained by OpenAI.
# Override the default model for this request
print(llm.chat("Who created you?", model="google/gemini-2.5-flash"))
# >> I am a large language model, trained by Google.
# Uses the default model again
print(llm.chat("Who created you?"))
# >> I am a large language model, trained by OpenAI.
Fallbacks and retries
Ensure reliable responses even if a model is unavailable.
LitAI automatically retries requests and switches to fallback models in order.
- Fallback models are tried in the order provided.
- Each model gets up to
max_retriesattempts independently. - The first successful response is returned immediately.
- If all models fail after their retry limits, LitAI raises an error.
from litai import LLM
llm = LLM(
model="openai/gpt-4",
fallback_models=["google/gemini-2.5-flash", "anthropic/claude-3-5-sonnet-20240620"],
max_retries=4,
)
print(llm.chat("How do I fine-tune an LLM?"))
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file litai-0.0.1.tar.gz.
File metadata
- Download URL: litai-0.0.1.tar.gz
- Upload date:
- Size: 19.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c585807f4b410d79b0bf230d407141319defc4e6bb20c5661f4a5d93214c4065
|
|
| MD5 |
fa3082b3102f978359b67d6ea0db6699
|
|
| BLAKE2b-256 |
e6bb4270ffcbcd44e967683fe0d48c77a30f0326a077c30af86fba236b54e2f0
|
File details
Details for the file litai-0.0.1-py3-none-any.whl.
File metadata
- Download URL: litai-0.0.1-py3-none-any.whl
- Upload date:
- Size: 18.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
67994fb4538d663151f3f104a5f78365ef65bf80832e49c2088c0af95128d3ec
|
|
| MD5 |
fe32537cedd7473d98e09fd0caefa534
|
|
| BLAKE2b-256 |
7559619efb08b882151a0bfa11aa7894fe7b6ecec06401ded6a23d59817cf090
|