High performance LLM client
Bhumi (भूमि)
🌍 BHUMI - AI Client Setup and Usage Guide ⚡
Introduction
Bhumi (भूमि) is the Sanskrit word for Earth, symbolizing stability, grounding, and speed. Just as the Earth moves with unwavering momentum, Bhumi AI ensures that your inference speed is as fast as nature itself! 🚀
A fast, async Python client for LLM APIs with Rust under the hood.
Features
- Async support with Rust-powered concurrency
- Connection pooling and retry logic
- Streaming support
- Support for multiple providers:
  - OpenAI
  - Anthropic
  - Google Gemini
  - Groq
  - SambaNova
Installation
pip install bhumi
Quick Start
OpenAI Example
import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

# Read the API key from the environment rather than hard-coding it
api_key = os.getenv("OPENAI_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="openai/gpt-4o",  # format: "provider/model_name"
        debug=True
    )
    client = BaseLLMClient(config)

    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())
Gemini Example
import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("GEMINI_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="gemini/gemini-2.0-flash",
        debug=True
    )
    client = BaseLLMClient(config)

    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())
Groq Example
import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("GROQ_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="groq/llama-3.1-8b-it",
        debug=True
    )
    client = BaseLLMClient(config)

    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())
SambaNova Example
import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("SAMBANOVA_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="sambanova/Meta-Llama-3.3-70B-Instruct",
        debug=True
    )
    client = BaseLLMClient(config)

    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())
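The provider examples above differ only in the environment variable and the model string. As a rough sketch of how that switch could be centralized, the helper below maps the provider prefix of a model string to the environment variable used in the examples. The name `config_kwargs_for` is hypothetical and not part of Bhumi; Anthropic is left out because no example above shows its variable name.

```python
import os

# Environment variable names exactly as used in the examples above.
API_KEY_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "groq": "GROQ_API_KEY",
    "sambanova": "SAMBANOVA_API_KEY",
}


def config_kwargs_for(model, env=os.environ):
    """Hypothetical helper (not part of Bhumi): build LLMConfig keyword arguments."""
    # The model string is expected in "provider/model_name" form.
    provider, sep, model_name = model.partition("/")
    if not sep or not model_name:
        raise ValueError(f"expected 'provider/model_name', got {model!r}")
    if provider not in API_KEY_ENV_VARS:
        raise ValueError(f"no API key variable known for provider {provider!r}")
    env_var = API_KEY_ENV_VARS[provider]
    api_key = env.get(env_var)
    if not api_key:
        raise RuntimeError(f"{env_var} is not set")
    # LLMConfig receives the full "provider/model_name" string, as in the examples.
    return {"api_key": api_key, "model": model}
```

With Bhumi installed, the result can be passed straight through, for example `LLMConfig(**config_kwargs_for("openai/gpt-4o"), debug=True)`.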
Streaming Support
All providers support streaming responses:
async for chunk in await client.completion([
    {"role": "user", "content": "Write a story"}
], stream=True):
    print(chunk, end="", flush=True)
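When the full text is needed after streaming finishes, the chunks can be accumulated while they are printed. The sketch below assumes each chunk is a plain string, as the print call above suggests. `collect_stream` and `fake_stream` are hypothetical names, and `fake_stream` is only a stand-in so the example runs without an API key.

```python
import asyncio


async def collect_stream(chunks):
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    async for chunk in chunks:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line once the stream ends
    return "".join(parts)


async def fake_stream():
    """Stand-in for a streaming response; not part of Bhumi."""
    for piece in ["Once ", "upon ", "a time."]:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield piece


if __name__ == "__main__":
    story = asyncio.run(collect_stream(fake_stream()))
```

With a real client, the same helper would be called as `await collect_stream(await client.completion(messages, stream=True))`.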
📊 Benchmark Results
Our latest benchmarks show significant performance advantages across different metrics:
⚡ Response Time
- LiteLLM: 13.79s
- Native: 5.55s
- Bhumi: 4.26s
- Google GenAI: 6.76s
🚀 Throughput (Requests/Second)
- LiteLLM: 3.48
- Native: 8.65
- Bhumi: 11.27
- Google GenAI: 7.10
💾 Peak Memory Usage (MB)
- LiteLLM: 275.9MB
- Native: 279.6MB
- Bhumi: 284.3MB
- Google GenAI: 284.8MB
In this run, Bhumi had the lowest response time and the highest throughput, reaching up to 3.2x the throughput of LiteLLM. Peak memory usage was roughly equal across all four clients.
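The ratios behind these figures can be recomputed directly from the lists above. The short script below is only arithmetic on the published numbers; the function name `ratio_vs_bhumi` is illustrative.

```python
# Figures copied from the benchmark lists above.
response_time_s = {"LiteLLM": 13.79, "Native": 5.55, "Bhumi": 4.26, "Google GenAI": 6.76}
throughput_rps = {"LiteLLM": 3.48, "Native": 8.65, "Bhumi": 11.27, "Google GenAI": 7.10}
peak_memory_mb = {"LiteLLM": 275.9, "Native": 279.6, "Bhumi": 284.3, "Google GenAI": 284.8}


def ratio_vs_bhumi(metric, lower_is_better):
    """Return Bhumi's advantage over each other client as a ratio.

    Values above 1 mean Bhumi did better on that metric, below 1 worse.
    """
    bhumi = metric["Bhumi"]
    return {
        name: round(value / bhumi if lower_is_better else bhumi / value, 2)
        for name, value in metric.items()
        if name != "Bhumi"
    }


print("response time:", ratio_vs_bhumi(response_time_s, lower_is_better=True))
print("throughput:   ", ratio_vs_bhumi(throughput_rps, lower_is_better=False))
print("peak memory:  ", ratio_vs_bhumi(peak_memory_mb, lower_is_better=True))
```

The throughput ratio against LiteLLM comes out at about 3.24, which is the source of the "up to 3.2x" figure. The memory ratios all sit within a few percent of 1.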
Configuration Options
The LLMConfig class supports various options:
- api_key: API key for the provider
- model: Model name in format "provider/model_name"
- base_url: Optional custom base URL
- max_retries: Number of retries (default: 3)
- timeout: Request timeout in seconds (default: 30)
- max_tokens: Maximum tokens in response
- debug: Enable debug logging
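To make the meaning of max_retries and timeout concrete, here is a minimal sketch of what such options conventionally do: each attempt gets its own timeout, and failed attempts are retried with a growing delay. This is not Bhumi's implementation, which is internal to the library. The function `call_with_retries`, its `backoff` parameter, and the reading of max_retries as "extra attempts after the first" are all assumptions for illustration.

```python
import asyncio


async def call_with_retries(make_call, max_retries=3, timeout=30.0, backoff=0.5):
    """Run make_call() with a per-attempt timeout, retrying on failure.

    Assumption: max_retries counts extra attempts after the first one,
    so up to max_retries + 1 calls are made in total.
    """
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            # wait_for cancels the call if it exceeds the timeout
            return await asyncio.wait_for(make_call(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < max_retries:
                # exponential backoff: backoff, 2*backoff, 4*backoff, ...
                await asyncio.sleep(backoff * 2 ** attempt)
    raise last_error
```

A caller would wrap any coroutine factory, for example `await call_with_retries(lambda: client.completion(messages))`. With the library's own max_retries and timeout options set, a wrapper like this should not be needed.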
🎯 Why Use Bhumi?
✔ Open Source: Apache 2.0 licensed, free for commercial use
✔ Community Driven: Welcomes contributions from individuals and companies
✔ Blazing Fast: 1.3x to 3.2x faster than the alternatives in the benchmarks above
✔ Resource Efficient: Peak memory on par with comparable clients (about 284 MB in the benchmarks above)
✔ Multi-Model Support: Easily switch between providers
✔ Parallel Requests: Handles multiple concurrent requests effortlessly
✔ Flexibility: Debugging and customization options available
✔ Production Ready: Battle-tested in high-throughput environments
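As an illustration of the parallel-requests point, the sketch below sends several prompts concurrently with `asyncio.gather` and caps the number in flight with a semaphore. `run_all` and `fake_complete` are hypothetical names; `fake_complete` stands in for `client.completion` so the example runs without an API key, and the `{"text": ...}` response shape follows the Quick Start examples.

```python
import asyncio


async def run_all(prompts, complete, max_concurrency=5):
    """Send every prompt through complete(), at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(prompt):
        async with semaphore:
            return await complete([{"role": "user", "content": prompt}])

    # gather returns results in the same order as the input prompts
    return await asyncio.gather(*(one(p) for p in prompts))


async def fake_complete(messages):
    """Stand-in for client.completion; echoes the last message back."""
    await asyncio.sleep(0.01)
    return {"text": f"echo: {messages[-1]['content']}"}


if __name__ == "__main__":
    results = asyncio.run(run_all(["a", "b", "c"], fake_complete, max_concurrency=2))
    for result in results:
        print(result["text"])
```

With a real client, `complete` would be `client.completion`, called from inside an async function as `await run_all(prompts, client.completion)`.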
🤝 Contributing
We welcome contributions from the community! Whether you're an individual developer or representing a company like Google, OpenAI, or Anthropic, feel free to:
- Submit pull requests
- Report issues
- Suggest improvements
- Share benchmarks
- Integrate our optimizations into your libraries (with attribution)
📜 License
Apache 2.0
🌟 Join our community and help make AI inference faster for everyone! 🌟