High performance LLM client

Bhumi (भूमि)

🌍 BHUMI - AI Client Setup and Usage Guide

Introduction

Bhumi (भूमि) is the Sanskrit word for Earth, symbolizing stability, grounding, and speed. Just as the Earth moves with unwavering momentum, Bhumi AI ensures that your inference speed is as fast as nature itself! 🚀

A fast, async Python client for LLM APIs with Rust under the hood.

Features

  • Async support with Rust-powered concurrency
  • Connection pooling and retry logic
  • Streaming support
  • Support for multiple providers:
    • OpenAI
    • Anthropic
    • Google Gemini
    • Groq
    • SambaNova

Installation

pip install bhumi

Quick Start

OpenAI Example

import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("OPENAI_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="openai/gpt-4o",
        debug=True
    )
    
    client = BaseLLMClient(config)
    
    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())

Gemini Example

import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("GEMINI_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="gemini/gemini-2.0-flash",
        debug=True
    )
    
    client = BaseLLMClient(config)
    
    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())

Groq Example

import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("GROQ_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="groq/llama-3.1-8b-instant",
        debug=True
    )
    
    client = BaseLLMClient(config)
    
    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())

SambaNova Example

import asyncio
from bhumi.base_client import BaseLLMClient, LLMConfig
import os

api_key = os.getenv("SAMBANOVA_API_KEY")

async def main():
    config = LLMConfig(
        api_key=api_key,
        model="sambanova/Meta-Llama-3.3-70B-Instruct",
        debug=True
    )
    
    client = BaseLLMClient(config)
    
    response = await client.completion([
        {"role": "user", "content": "Tell me a joke"}
    ])
    print(f"Response: {response['text']}")

if __name__ == "__main__":
    asyncio.run(main())
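
The four examples above differ only in the model string and the environment variable that holds the key. A minimal sketch of a helper that derives the variable name from the model string follows. `PROVIDER_ENV_VARS`, `env_var_for_model`, and `api_key_for_model` are hypothetical names used for illustration, not part of Bhumi, and the table covers only the variables that appear in the examples above.

```python
import os
from typing import Optional

# Environment variable names taken from the Quick Start examples.
# This table and the helpers below are illustrative, not part of Bhumi.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "groq": "GROQ_API_KEY",
    "sambanova": "SAMBANOVA_API_KEY",
}


def env_var_for_model(model: str) -> str:
    """Return the API-key environment variable for a "provider/model_name" string."""
    provider, sep, name = model.partition("/")
    if not sep or not name:
        raise ValueError(f"expected 'provider/model_name', got {model!r}")
    if provider not in PROVIDER_ENV_VARS:
        raise ValueError(f"no environment variable known for provider {provider!r}")
    return PROVIDER_ENV_VARS[provider]


def api_key_for_model(model: str) -> Optional[str]:
    """Look up the key in the environment; returns None if the variable is unset."""
    return os.getenv(env_var_for_model(model))


print(env_var_for_model("gemini/gemini-2.0-flash"))
```

With a helper like this, switching providers comes down to changing one string, for example `LLMConfig(api_key=api_key_for_model(model), model=model)`.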

Streaming Support

All providers support streaming responses:

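async def stream_story(client):
    # With stream=True, the awaited call returns an async iterator of chunks.
    stream = await client.completion(
        [{"role": "user", "content": "Write a story"}],
        stream=True,
    )
    async for chunk in stream:
        print(chunk, end="", flush=True)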
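
When the full text is needed after streaming, the chunks can be accumulated as they arrive. The sketch below assumes, as in the snippet above, that awaiting `client.completion(..., stream=True)` yields an async iterator of string chunks. `collect_stream` and the `FakeStreamingClient` stand-in are hypothetical and exist only so the example runs without an API key.

```python
import asyncio


class FakeStreamingClient:
    """Offline stand-in for a real client (not part of Bhumi)."""

    async def completion(self, messages, stream=False):
        async def chunks():
            for word in ["Once ", "upon ", "a ", "time"]:
                yield word

        return chunks()


async def collect_stream(client, messages):
    """Print chunks as they arrive and return the concatenated text."""
    parts = []
    stream = await client.completion(messages, stream=True)
    async for chunk in stream:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line once the stream ends
    return "".join(parts)


if __name__ == "__main__":
    messages = [{"role": "user", "content": "Write a story"}]
    text = asyncio.run(collect_stream(FakeStreamingClient(), messages))
```

Replacing `FakeStreamingClient()` with a configured `BaseLLMClient` should work the same way, provided the chunks are plain strings.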

📊 Benchmark Results

Our latest benchmarks compare Bhumi with LiteLLM, Native, and Google GenAI on response time, throughput, and peak memory usage:

⚡ Response Time

  • LiteLLM: 13.79s
  • Native: 5.55s
  • Bhumi: 4.26s
  • Google GenAI: 6.76s

🚀 Throughput (Requests/Second)

  • LiteLLM: 3.48
  • Native: 8.65
  • Bhumi: 11.27
  • Google GenAI: 7.10

💾 Peak Memory Usage (MB)

  • LiteLLM: 275.9
  • Native: 279.6
  • Bhumi: 284.3
  • Google GenAI: 284.8

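In these benchmarks Bhumi has the lowest response time and the highest throughput, reaching up to 3.2x the throughput of the slowest alternative (11.27 vs. 3.48 requests/second for LiteLLM). Peak memory usage is roughly the same for all four clients.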
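
The ratios behind these figures can be recomputed from the numbers listed above. The short script below does only that arithmetic; the variable and function names are illustrative.

```python
# Figures copied from the benchmark lists above.
response_time_s = {"LiteLLM": 13.79, "Native": 5.55, "Bhumi": 4.26, "Google GenAI": 6.76}
throughput_rps = {"LiteLLM": 3.48, "Native": 8.65, "Bhumi": 11.27, "Google GenAI": 7.10}
peak_memory_mb = {"LiteLLM": 275.9, "Native": 279.6, "Bhumi": 284.3, "Google GenAI": 284.8}


def ratios_vs_bhumi(metric, higher_is_better):
    """Ratio of Bhumi to each other client, oriented so that > 1 favours Bhumi."""
    bhumi = metric["Bhumi"]
    return {
        name: round(bhumi / value if higher_is_better else value / bhumi, 2)
        for name, value in metric.items()
        if name != "Bhumi"
    }


throughput_gain = ratios_vs_bhumi(throughput_rps, higher_is_better=True)
latency_gain = ratios_vs_bhumi(response_time_s, higher_is_better=False)
memory_ratio = ratios_vs_bhumi(peak_memory_mb, higher_is_better=False)

print("throughput:", throughput_gain)
print("response time:", latency_gain)
print("peak memory:", memory_ratio)
```

The throughput ratio against LiteLLM comes out at about 3.24, which is where the "up to 3.2x" figure comes from. The memory ratios all sit close to 1, so peak memory is essentially the same across clients.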

Configuration Options

The LLMConfig class supports various options:

  • api_key: API key for the provider
  • model: Model name in format "provider/model_name"
  • base_url: Optional custom base URL
  • max_retries: Number of retries (default: 3)
  • timeout: Request timeout in seconds (default: 30)
  • max_tokens: Maximum tokens in response
  • debug: Enable debug logging
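
As a sketch of how these options fit together, the helper below assembles keyword arguments under the option names listed above and checks the "provider/model_name" format before anything is sent. `build_config_kwargs` is a hypothetical helper, not part of Bhumi; the API key is a placeholder, and `debug=False` as a default is an assumption. The defaults for `max_retries` and `timeout` are the ones stated in the list.

```python
def build_config_kwargs(api_key, model, base_url=None, max_retries=3,
                        timeout=30, max_tokens=None, debug=False):
    """Validate options and collect them under the documented LLMConfig names."""
    provider, sep, model_name = model.partition("/")
    if not (provider and sep and model_name):
        raise ValueError(f"model must look like 'provider/model_name', got {model!r}")
    if max_retries < 0 or timeout <= 0:
        raise ValueError("max_retries must be >= 0 and timeout must be > 0")

    kwargs = {
        "api_key": api_key,
        "model": model,
        "max_retries": max_retries,
        "timeout": timeout,
        "debug": debug,
    }
    # Optional settings are included only when explicitly set.
    if base_url is not None:
        kwargs["base_url"] = base_url
    if max_tokens is not None:
        kwargs["max_tokens"] = max_tokens
    return kwargs


kwargs = build_config_kwargs(
    api_key="sk-placeholder",  # placeholder, not a real key
    model="openai/gpt-4o",
    timeout=60,
    max_tokens=256,
)
print(kwargs)
# With bhumi installed, the result can be passed on:
#   from bhumi.base_client import LLMConfig
#   config = LLMConfig(**kwargs)
```

Validating the model string up front turns a malformed name into an immediate `ValueError` rather than a failed request.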

🎯 Why Use Bhumi?

  • Open Source: Apache 2.0 licensed, free for commercial use
  • Community Driven: Welcomes contributions from individuals and companies
  • Fast: 1.3x to 3.2x lower response time than the alternatives in the benchmarks above
  • Resource Efficient: Peak memory usage within about 3% of the other clients in the benchmarks above
  • Multi-Model Support: Easily switch between providers
  • Parallel Requests: Handles multiple concurrent requests
  • Flexibility: Debugging and customization options available
  • Production Ready: Battle-tested in high-throughput environments
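
For the parallel-requests point, a common pattern is to issue several `completion` calls at once with `asyncio.gather`. This is a sketch under assumptions: it relies only on the awaitable `client.completion(messages)` call and the `response['text']` field shown in the Quick Start. `EchoClient` is a hypothetical stand-in so the example runs without an API key, and the `max_in_flight` limit is an illustrative choice, not a Bhumi setting.

```python
import asyncio


class EchoClient:
    """Offline stand-in mimicking the response shape used in the Quick Start."""

    async def completion(self, messages):
        await asyncio.sleep(0.01)  # simulate network latency
        return {"text": f"echo: {messages[-1]['content']}"}


async def complete_many(client, prompts, max_in_flight=5):
    """Send one request per prompt concurrently; results keep the prompt order."""
    limit = asyncio.Semaphore(max_in_flight)

    async def one(prompt):
        async with limit:  # cap the number of simultaneous requests
            response = await client.completion([{"role": "user", "content": prompt}])
            return response["text"]

    return await asyncio.gather(*(one(p) for p in prompts))


if __name__ == "__main__":
    answers = asyncio.run(complete_many(EchoClient(), ["one", "two", "three"]))
    print(answers)
```

`asyncio.gather` returns results in the order the coroutines were passed in, so each answer lines up with its prompt even when requests finish out of order.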

🤝 Contributing

We welcome contributions from the community! Whether you're an individual developer or representing a company like Google, OpenAI, or Anthropic, feel free to:

  • Submit pull requests
  • Report issues
  • Suggest improvements
  • Share benchmarks
  • Integrate our optimizations into your libraries (with attribution)

📜 License

Apache 2.0

🌟 Join our community and help make AI inference faster for everyone! 🌟
