Professional Multimodal AI Engine for Onyx platform
💎 ONYX AI Gemma 4 Engine (E2B Edition)
A high-performance, professional FastAPI wrapper for Gemma Multimodal models with built-in 4-bit quantization and streaming support. Developed by ONYX (RUI Company).
🚀 Features
- Zero Config Integration: Deploy a multimodal AI server in seconds.
- Optimized Performance: Native 4-bit quantization using bitsandbytes for low VRAM/RAM usage.
- Real-time Streaming: Built-in SSE (Server-Sent Events) for smooth, token-by-token generation.
- Hardware Friendly: Optimized for both GPU and high-performance CPU inference.
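To give a feel for why 4-bit quantization lowers memory use, here is a back-of-the-envelope sketch. It only counts the bytes needed to store the weights (parameters × bits ÷ 8) and ignores activations, the KV cache, and quantization metadata, so real usage will be higher. The parameter count below is a hypothetical round number chosen for illustration, not a figure published for this engine or model.

```python
def estimate_weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Rough size of the model weights alone, in GiB (no activations or cache)."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# Hypothetical parameter count, used only to make the arithmetic concrete.
EXAMPLE_PARAMS = 2_000_000_000

if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = estimate_weight_memory_gib(EXAMPLE_PARAMS, bits)
        print(f"{bits:>2}-bit weights: about {size:.2f} GiB")
```

Under these assumptions, moving from 16-bit to 4-bit weights cuts weight storage by a factor of four.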
📦 Installation
Option 1: Via pip
You can install the engine directly from PyPI:
pip install onyx-AI-Gemma4
Option 2: Via requirements.txt
For production environments or Hugging Face Spaces, use the following dependencies:
Plaintext
fastapi
uvicorn
transformers>=4.48.0
torch
accelerate
bitsandbytes
Pillow
torchvision
onyx-AI-Gemma4
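After installing, it can help to confirm that each dependency is actually present in the active environment. The sketch below uses only the standard library (`importlib.metadata`) and the package names from the list above; the helper name `installed_versions` is made up for this example and is not part of the engine.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the requirements list above.
REQUIRED = [
    "fastapi", "uvicorn", "transformers", "torch", "accelerate",
    "bitsandbytes", "Pillow", "torchvision", "onyx-AI-Gemma4",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(REQUIRED).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Note that this reports versions only; it does not check the `transformers>=4.48.0` constraint.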
💻 Python Usage Guide
To use the library, import the engine using the specific module name ONYXAI_Gemma4E2B. The engine handles model loading, quantization, and server routing automatically.
Standard Script Usage
Python
from ONYXAI_Gemma4E2B import OnyxEngine
# 1. Initialize the Engine
engine = OnyxEngine(model_id="google/gemma-4-E2B-it")
# 2. Run the server
if __name__ == "__main__":
    engine.run(host="0.0.0.0", port=7860)
Production/Space Usage (main.py)
Ideal for Hugging Face Spaces or deployments using an external runner:
Python
from ONYXAI_Gemma4E2B import OnyxEngine
import uvicorn
import os
# Initialize the Engine
engine = OnyxEngine(model_id="google/gemma-4-E2B-it")
# Access the underlying FastAPI app instance
app = engine.app
@app.get("/")
def home():
    return {"message": "ONYX Engine is running on Hugging Face Spaces!"}

if __name__ == "__main__":
    port = int(os.environ.get("PORT", 7860))
    uvicorn.run(app, host="0.0.0.0", port=port)
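One thing to be aware of in the snippet above: `int(os.environ.get("PORT", 7860))` raises `ValueError` if `PORT` is set to something non-numeric. A slightly more defensive variant might look like the following. `resolve_port` is a hypothetical helper written for this example, not something the library provides.

```python
import os

DEFAULT_PORT = 7860


def resolve_port(environ=os.environ, default=DEFAULT_PORT):
    """Return a usable TCP port from PORT, falling back to the default."""
    raw = environ.get("PORT")
    if raw is None or raw.strip() == "":
        return default
    try:
        port = int(raw)
    except ValueError:
        # PORT was set but is not a number; fall back rather than crash.
        return default
    # Reject values outside the valid TCP port range.
    return port if 1 <= port <= 65535 else default
```

It could then be used as `uvicorn.run(app, host="0.0.0.0", port=resolve_port())`.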
🛠 API Interaction
The endpoint /predict is built into the library and supports full streaming.
Endpoint: POST /predict
Example Request (JSON):
JSON
{
  "messages": [
    {
      "role": "user",
      "content": "Explain the importance of AI in modern software engineering."
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1024
}
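A minimal client sketch for the request above, using only the Python standard library. Several things here are assumptions rather than documented behavior: the server address (`http://localhost:7860`), that the stream follows the standard SSE `data:` line format, and that each event's payload is text suitable for printing directly. The function names are invented for this example.

```python
import json
import urllib.request


def build_request(base_url, prompt, temperature=0.7, max_tokens=1024):
    """Build a POST /predict request with the JSON body shown above."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )


def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of byte lines."""
    buffer = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # SSE strips a single leading space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line terminates the current event.
            yield "\n".join(buffer)
            buffer = []
    if buffer:
        # Lenient choice: emit a trailing event even without a final blank line.
        yield "\n".join(buffer)


def stream_predict(base_url, prompt):
    """Print streamed chunks as they arrive (assumes plain-text payloads)."""
    with urllib.request.urlopen(build_request(base_url, prompt)) as response:
        for chunk in iter_sse_data(response):
            print(chunk, end="", flush=True)


if __name__ == "__main__":
    # Assumed local address; adjust to wherever the engine is running.
    stream_predict("http://localhost:7860", "Explain the importance of AI in modern software engineering.")
```

If the server wraps each chunk in JSON rather than plain text, the `print` call would need a `json.loads` step first.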
🔗 Project Links
Organization: ONYX / RUI Company
Portfolio: Eng. Rawan Jassim
LinkedIn: Professional Profile
© 2026 ONYX. All rights reserved.
Source Distribution
onyx_ai_gemma4-0.1.4.tar.gz (3.9 kB)
Built Distribution
onyx_ai_gemma4-0.1.4-py3-none-any.whl (4.4 kB)
File details
Details for the file onyx_ai_gemma4-0.1.4.tar.gz.
File metadata
- Download URL: onyx_ai_gemma4-0.1.4.tar.gz
- Upload date:
- Size: 3.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3ed1127ebf1f87ae57d8e628c5af4e18f1585fe3511b5e54e09b7195972a33e8 |
| MD5 | e1105bb6f3e5e6b13da6fc014a8519bc |
| BLAKE2b-256 | d667d73b76534f47dac15a77042dc8ed35e52cc49edf59314f6e3b81e4a39afc |
File details
Details for the file onyx_ai_gemma4-0.1.4-py3-none-any.whl.
File metadata
- Download URL: onyx_ai_gemma4-0.1.4-py3-none-any.whl
- Upload date:
- Size: 4.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f93e57af3a2476f419f1e422b8b1c37a0fd96278a2ab396e53dc39a9135d22f7 |
| MD5 | f33cd8a0fcc82ff1862b19e2771b2590 |
| BLAKE2b-256 | a0fdc5e627645b34253bfbb2efcb4fe5335bab98985afdf142fec655285c7025 |