Skip to main content

High performance LLM client

Project description

Bhumi Logo

Bhumi (भूमि)

🌍 BHUMI - AI Client Setup and Usage Guide

Introduction

Bhumi (भूमि) is the Sanskrit word for Earth, symbolizing stability, grounding, and speed. Just as the Earth moves with unwavering momentum, Bhumi AI ensures that your inference speed is as fast as nature itself! 🚀

A fast, async Python client for LLM APIs with Rust under the hood.

Features

  • Async support with Rust-powered concurrency
  • Connection pooling and retry logic
  • Streaming support
  • Support for multiple providers:
    • OpenAI
    • Anthropic
    • Google Gemini
    • Groq
    • SambaNova

Installation

pip install bhumi

Quick Start

OpenAI Example

import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("OPENAI_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="openai/gpt-4o",
        debug=True
    )
    
    client = BaseLLMClient(config)
    
    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())

Gemini Example

import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("GEMINI_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="gemini/gemini-2.0-flash",
        debug=True
    )
    
    client = BaseLLMClient(config)
    
    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())

Groq Example

import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("GROQ_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="groq/llama-3.1-8b-it",
        debug=True
    )
    
    client = BaseLLMClient(config)
    
    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())

SambaNova Example

import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("SAMBANOVA_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="sambanova/Meta-Llama-3.3-70B-Instruct",
        debug=True
    )
    
    client = BaseLLMClient(config)
    
    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())

Streaming Support

All providers support streaming responses:

async for chunk in await client.completion([
    {"role": "user", "content": "Write a story"}
], stream=True):
    print(chunk, end="", flush=True)

📊 Benchmark Results

Our latest benchmarks show significant performance advantages across different metrics: alt text

⚡ Response Time

  • LiteLLM: 13.79s
  • Native: 5.55s
  • Bhumi: 4.26s
  • Google GenAI: 6.76s

🚀 Throughput (Requests/Second)

  • LiteLLM: 3.48
  • Native: 8.65
  • Bhumi: 11.27
  • Google GenAI: 7.10

💾 Peak Memory Usage (MB)

  • LiteLLM: 275.9MB
  • Native: 279.6MB
  • Bhumi: 284.3MB
  • Google GenAI: 284.8MB

These benchmarks demonstrate Bhumi's superior performance, particularly in throughput where it outperforms other solutions by up to 3.2x.

Configuration Options

The LLMConfig class supports various options:

  • api_key: API key for the provider
  • model: Model name in format "provider/model_name"
  • base_url: Optional custom base URL
  • max_retries: Number of retries (default: 3)
  • timeout: Request timeout in seconds (default: 30)
  • max_tokens: Maximum tokens in response
  • debug: Enable debug logging

🎯 Why Use Bhumi?

Open Source: Apache 2.0 licensed, free for commercial use
Community Driven: Welcomes contributions from individuals and companies
Blazing Fast: 2-3x faster than alternative solutions
Resource Efficient: Uses 60% less memory than comparable clients
Multi-Model Support: Easily switch between providers
Parallel Requests: Handles multiple concurrent requests effortlessly
Flexibility: Debugging and customization options available
Production Ready: Battle-tested in high-throughput environments

🤝 Contributing

We welcome contributions from the community! Whether you're an individual developer or representing a company like Google, OpenAI, or Anthropic, feel free to:

  • Submit pull requests
  • Report issues
  • Suggest improvements
  • Share benchmarks
  • Integrate our optimizations into your libraries (with attribution)

📜 License

Apache 2.0

🌟 Join our community and help make AI inference faster for everyone! 🌟

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bhumi-0.1.3.tar.gz (33.2 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

bhumi-0.1.3-cp312-cp312-macosx_11_0_arm64.whl (1.5 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

bhumi-0.1.3-cp38-abi3-macosx_11_0_arm64.whl (1.5 MB view details)

Uploaded CPython 3.8+macOS 11.0+ ARM64

File details

Details for the file bhumi-0.1.3.tar.gz.

File metadata

  • Download URL: bhumi-0.1.3.tar.gz
  • Upload date:
  • Size: 33.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.5

File hashes

Hashes for bhumi-0.1.3.tar.gz
Algorithm Hash digest
SHA256 67eb671c50ea034267bb36643e37fb0aa8c04b9f742826bd796e6795a4890b4f
MD5 e17b8600da16e7a468337b9692e82c96
BLAKE2b-256 95dfaad2b41b9d58b8317d0c9b6af6ba66aacb311a7f9a9df65299875cf305fb

See more details on using hashes here.

File details

Details for the file bhumi-0.1.3-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for bhumi-0.1.3-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 1c4c82fa690f7d0f92c535930fbd7a595ee738202d2717456ea78247fee21f6c
MD5 be2c86b6a68915ad53af6c9bfe68bf59
BLAKE2b-256 dfd25f13c92a527433ac40eeccb2bfd8170f1472bfab7922e641326de075618c

See more details on using hashes here.

File details

Details for the file bhumi-0.1.3-cp38-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for bhumi-0.1.3-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 6cb94054666f8f0d59729039511af0a16993b62b80dfecbbd116b749a24bb8b5
MD5 f45a683b5f5e6c3e2d690feb623dc2cf
BLAKE2b-256 e098f493114b29c25088c73e566782fde8e2c55aa41e4364fb7da1e403456ec1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page