High performance LLM client
Project description
Bhumi (भूमि)
🌍 BHUMI - AI Client Setup and Usage Guide ⚡
Introduction
Bhumi (भूमि) is the Sanskrit word for Earth, symbolizing stability, grounding, and speed. Just as the Earth moves with unwavering momentum, Bhumi AI ensures that your inference speed is as fast as nature itself! 🚀
Bhumi is an open-source project designed to optimize and accelerate AI inference while maintaining simplicity, flexibility, and multi-model support. Whether you're working with OpenAI, Anthropic, or Gemini, Bhumi makes switching between providers seamless. Our Rust-based implementation is freely available for anyone to use, study, modify, and integrate into their own libraries - we encourage collaboration and improvements from the community!
💡 Note to AI Companies: Feel free to incorporate our performance optimizations into your official libraries! We just ask for appropriate attribution under our Apache 2.0 license.
1️⃣ Installation
To install Bhumi, run the following commands:
pip install bhumi
rm -rf target/wheels/*
pip uninstall bhumi
maturin develop
Note: Ensure you have Rust and Python installed before proceeding.
2️⃣ Environment Setup
Before running Bhumi, set up your API keys in your terminal:
export OPENAI_API_KEY="your-openai-key"
export GEMINI_API_KEY="your-gemini-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
3️⃣ Python Usage
Here's a basic example to get started with Bhumi:
import os
from bhumi import Bhumi
# Get API keys
OPENAI_KEY = os.environ['OPENAI_API_KEY']
GEMINI_KEY = os.environ['GEMINI_API_KEY']
ANTHROPIC_KEY = os.environ['ANTHROPIC_API_KEY']
# Example prompt
prompt = "Explain what a neural network is in one sentence."
# OpenAI example
openai_client = Bhumi(
max_concurrent=10,
provider="openai",
model="gpt-4o",
debug=True
)
openai_response = openai_client.completion(
model="openai/gpt-4o",
messages=[{"role": "user", "content": prompt}],
api_key=OPENAI_KEY
)
print("
🌟 OpenAI Response:", openai_response.text)
# Gemini example
gemini_client = Bhumi(
max_concurrent=10,
provider="gemini",
model="gemini-1.5-ultra",
debug=True
)
gemini_response = gemini_client.completion(
model="gemini/gemini-1.5-ultra",
messages=[{"role": "user", "content": prompt}],
api_key=GEMINI_KEY
)
print("
💡 Gemini Response:", gemini_response.text)
# Anthropic example
anthropic_client = Bhumi(
max_concurrent=10,
provider="anthropic",
model="claude-3-opus",
debug=True
)
anthropic_response = anthropic_client.completion(
model="anthropic/claude-3-opus",
messages=[{"role": "user", "content": prompt}],
api_key=ANTHROPIC_KEY
)
print("
🤖 Anthropic Response:", anthropic_response.text)
4️⃣ Supported Models
Bhumi supports ALL models from OpenAI, Anthropic, and Gemini, giving you full flexibility!
🔵 OpenAI
gpt-4o,gpt-4o-mini,gpt-4-turbo,gpt-3.5-turbo, and more!
🟠 Anthropic
claude-3-opus,claude-3-sonnet,claude-3-haiku, and more!
🟢 Gemini
gemini-1.5-ultra,gemini-1.5-pro,gemini-1.5-flash, and more!
❌ Current Limitations
- 🚫 No Tool Use: Bhumi does not currently support function calling or tool use.
- 🚫 No Streaming: Responses are returned in a single batch; streaming is not yet available.(TODO)
🎯 Why Use Bhumi?
✔ Open Source: Apache 2.0 licensed, free for commercial use
✔ Community Driven: Welcomes contributions from individuals and companies
✔ Blazing Fast: 2-3x faster than alternative solutions
✔ Resource Efficient: Uses 60% less memory than comparable clients
✔ Multi-Model Support: Easily switch between OpenAI, Anthropic, and Gemini
✔ Parallel Requests: Handles multiple concurrent requests effortlessly
✔ Flexibility: Debugging and customization options available
✔ Production Ready: Battle-tested in high-throughput environments
📊 Real-world Performance
In production environments, Bhumi has demonstrated:
- 125+ requests/second sustained throughput
- 99.9% uptime with automatic error handling
- Sub-second latency for most requests
- Minimal resource footprint even under heavy load
📊 Benchmark Results
Our latest benchmarks show significant performance advantages across different metrics:
⚡ Response Time
- LiteLLM: 13.79s
- Native: 5.55s
- Bhumi: 4.26s
- Google GenAI: 6.76s
🚀 Throughput (Requests/Second)
- LiteLLM: 3.48
- Native: 8.65
- Bhumi: 11.27
- Google GenAI: 7.10
💾 Peak Memory Usage (MB)
- LiteLLM: 275.9MB
- Native: 279.6MB
- Bhumi: 284.3MB
- Google GenAI: 284.8MB
These benchmarks demonstrate Bhumi's superior performance, particularly in throughput where it outperforms other solutions by up to 3.2x. While memory usage remains competitive, the significant gains in response time and throughput make Bhumi an excellent choice for high-performance applications.
🤝 Contributing
We welcome contributions from the community! Whether you're an individual developer or representing a company like Google, OpenAI, or Anthropic, feel free to:
- Submit pull requests
- Report issues
- Suggest improvements
- Share benchmarks
- Integrate our optimizations into your libraries (with attribution)
Check out our CONTRIBUTING.md for guidelines.
📜 License
Bhumi is proudly open source under the Apache 2.0 license. This means you can:
- Use it commercially
- Modify it
- Distribute it
- Use it privately
- Use it for patents
All we ask is that you provide appropriate attribution and include the license notice.
🌟 Join our community and help make AI inference faster for everyone! 🌟
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distributions
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file bhumi-0.1.2.tar.gz.
File metadata
- Download URL: bhumi-0.1.2.tar.gz
- Upload date:
- Size: 6.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
777df4f4bea1ea95153b1168e29a9d1e0e7b6a596295a19e0bcc5d6509c6454e
|
|
| MD5 |
551573b3e18e4fbf3314b25872704e3a
|
|
| BLAKE2b-256 |
0b1d3e88d45707342fa5ae787d54c15ae017c287a81f0e3d0e984cc5af91a40b
|
File details
Details for the file bhumi-0.1.2-cp312-cp312-macosx_11_0_arm64.whl.
File metadata
- Download URL: bhumi-0.1.2-cp312-cp312-macosx_11_0_arm64.whl
- Upload date:
- Size: 1.4 MB
- Tags: CPython 3.12, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a34b8bed92075595d14bd59bde44201a2fe58f43a281aa42825882bb71fe165e
|
|
| MD5 |
49411079a82962f56bdb9ad9e351e766
|
|
| BLAKE2b-256 |
b4b2191fa54c4c18c0e695a7ab4e4548350b2b2fa1cb75c6ef3db85dbe16be59
|
File details
Details for the file bhumi-0.1.2-cp38-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: bhumi-0.1.2-cp38-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 1.5 MB
- Tags: CPython 3.8+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.8.1
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ed28b87bd169262cd237659f053afa0af5c1d3ba8dea10ad59c77bf42456d80f
|
|
| MD5 |
cafb48dd6704ae82bf4106968fba1f6b
|
|
| BLAKE2b-256 |
4819c90914becbe0d52f1d24f0976b0f1ded0fa4ded3b8cb4a692ceb64794f64
|