Nthuku-Fast: A blazing-fast multimodal AI model with vision and language understanding
Nthuku-Fast
Efficient Multimodal Vision-Language Model with Mixture of Experts (MoE) Architecture
Features
✨ High Performance
- Flash Attention for 2-4x speedup
- Extended 8K context window (32x larger)
- Optimized MoE routing (20-30% faster)
💰 Cost Effective
- Prompt caching (10x cost reduction)
- ~8B active parameters (efficient)
- 90%+ cache hit rates
🧠 Advanced Capabilities
- Vision understanding
- Text generation
- Speculative decoding (2-3x faster)
- Thinking traces / chain-of-thought
Installation
From PyPI
pip install nthuku-fast
From source
git clone https://github.com/elijahnzeli1/Nthuku-fast_v2.git
cd Nthuku-fast_v2/nthuku-fast-package
pip install -e .
Local installation (development, from an existing checkout)
cd nthuku-fast-package
pip install -e .
Quick Start
from nthuku_fast import create_nthuku_fast_model
import torch
# Create model (all optimizations enabled by default)
model = create_nthuku_fast_model(
    hidden_dim=512,
    num_experts=8,
    top_k_experts=2,
)
# Or use presets for different sizes
model = create_nthuku_fast_model(preset="150M")  # 150M parameters
# Generate text from an image (a random tensor stands in for real pixel data)
pixel_values = torch.randn(1, 3, 224, 224)
text = model.generate_text(
    pixel_values,
    max_length=100,
    use_cache=True,       # Enable prompt caching
    show_thinking=False,  # Set to True to show reasoning traces
)
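The `num_experts=8` and `top_k_experts=2` arguments above suggest that each token is sent to 2 of 8 experts. The sketch below illustrates that idea in plain Python, assuming softmax gating with renormalized top-k weights (a common MoE design, not confirmed to be what this package does). The names `softmax`, `route_top_k`, and the gate scores are hypothetical and are not part of `nthuku_fast`.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores: List[float], top_k: int = 2) -> List[Tuple[int, float]]:
    """Pick the top_k experts for one token and renormalize their weights.

    Returns (expert_index, weight) pairs whose weights sum to 1.
    """
    probs = softmax(gate_scores)
    # Indices of the top_k highest-probability experts.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


# Hypothetical gate scores for one token across 8 experts (illustration only).
gate_scores = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2]
routes = route_top_k(gate_scores, top_k=2)
print(routes)  # experts 1 and 4 are selected; their weights sum to 1
```

Only the selected experts run for a given token, which is why the active parameter count of an MoE model is smaller than its total parameter count.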
Model Presets
# 50M parameters (default)
model = create_nthuku_fast_model(preset="50M")
# 150M parameters (recommended)
model = create_nthuku_fast_model(preset="150M")
# 500M parameters (high capacity)
model = create_nthuku_fast_model(preset="500M")
# 1B parameters (maximum)
model = create_nthuku_fast_model(preset="1B")
Advanced Features
Prompt Caching
# Get cache statistics
stats = model.get_cache_stats()
print(f"Cache hit rate: {stats['hit_rate']:.2%}")
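To make the `hit_rate` statistic concrete, here is a minimal sketch of how a prompt cache might count hits and misses. It is a toy written for illustration and assumes exact-match caching keyed by a hash of the prompt; `PromptCache` and `get_or_compute` are hypothetical names, and the package's real cache may work differently.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """Toy cache keyed by a hash of the prompt; tracks hits and misses."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, prompt: str, compute: Callable[[str], str]) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: run the expensive call once and remember the result.
        self.misses += 1
        result = compute(prompt)
        self._store[key] = result
        return result

    def stats(self) -> Dict[str, float]:
        total = self.hits + self.misses
        return {
            "hits": self.hits,
            "misses": self.misses,
            "hit_rate": self.hits / total if total else 0.0,
        }


cache = PromptCache()
prompts = ["describe the image", "describe the image", "count the objects", "describe the image"]
for prompt in prompts:
    # str.upper is a stand-in for an expensive model call.
    cache.get_or_compute(prompt, lambda p: p.upper())

cache_stats = cache.stats()
print(f"Cache hit rate: {cache_stats['hit_rate']:.2%}")  # Cache hit rate: 50.00%
```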
Speculative Decoding
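from nthuku_fast import SpeculativeDecoder
# input_ids and vision_features are not defined in this snippet; they are
# presumably the tokenized prompt and the encoded image, prepared beforehand.
spec_decoder = SpeculativeDecoder(model, num_speculative_tokens=4)
generated, stats = spec_decoder.generate(
    input_ids,
    vision_features,
    max_new_tokens=100,
    show_stats=True,
)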
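For readers new to the technique, the following is a minimal sketch of greedy speculative decoding with toy next-token functions. It is not the `SpeculativeDecoder` implementation: `speculative_generate`, `target_next`, and `draft_next` are hypothetical names, and the "models" are simple integer rules chosen so the behavior is easy to follow. The draft proposes a few tokens, the target checks them, and the longest agreeing prefix is kept.

```python
from typing import Callable, Dict, List, Tuple

NextToken = Callable[[List[int]], int]


def speculative_generate(
    prefix: List[int],
    target_next: NextToken,
    draft_next: NextToken,
    max_new_tokens: int,
    num_speculative_tokens: int = 4,
) -> Tuple[List[int], Dict[str, int]]:
    """Greedy speculative decoding; output matches target-only greedy decoding."""
    seq = list(prefix)
    goal = len(prefix) + max_new_tokens
    stats = {"proposed": 0, "accepted": 0, "target_passes": 0}
    while len(seq) < goal:
        # 1. The cheap draft model proposes a short continuation.
        draft: List[int] = []
        for _ in range(min(num_speculative_tokens, goal - len(seq))):
            draft.append(draft_next(seq + draft))
        stats["proposed"] += len(draft)
        # 2. The target checks every proposed position. A real model would score
        #    all positions in one batched forward pass, counted here as one pass.
        stats["target_passes"] += 1
        for i, token in enumerate(draft):
            expected = target_next(seq + draft[:i])
            if token != expected:
                seq.extend(draft[:i])
                seq.append(expected)  # replace the first mismatch with the target's token
                stats["accepted"] += i
                break
        else:
            seq.extend(draft)
            stats["accepted"] += len(draft)
            if len(seq) < goal:
                seq.append(target_next(seq))  # bonus token from the same verification pass
    return seq, stats


def target_next(seq: List[int]) -> int:
    """Toy target model: count upward modulo 10."""
    return (seq[-1] + 1) % 10


def draft_next(seq: List[int]) -> int:
    """Toy draft model: agrees with the target except right after a 2."""
    return 0 if seq[-1] == 2 else (seq[-1] + 1) % 10


generated_tokens, decode_stats = speculative_generate([0], target_next, draft_next, max_new_tokens=12)
print(generated_tokens)
print(decode_stats)
```

In this toy run the 12 new tokens need only 3 target verification passes instead of 12 sequential target calls, which is where the speedup comes from when the draft model is much cheaper than the target.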
Thinking Traces
# Enable visible reasoning
text = model.generate_text(
    pixel_values,
    show_thinking=True,  # Shows step-by-step reasoning
)
Training
from nthuku_fast import train_nthuku_fast, MultiDatasetManager
# Load datasets
dataset_manager = MultiDatasetManager()
data_sources = dataset_manager.load_all_datasets()
# Train
results = train_nthuku_fast(
    model=model,
    data_sources=data_sources,
    batch_size=8,
    num_epochs=10,
    learning_rate=2e-4,
)
Performance
| Feature | Improvement |
|---|---|
| Flash Attention | 2-4x faster |
| Extended Context | 32x longer (8K tokens) |
| Optimized MoE | 20-30% faster |
| Prompt Caching | 10x cost reduction |
| Speculative Decoding | 2-3x faster generation |
Combined: 5-7x faster, 81% cheaper!
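The README does not state how the 81% figure is derived. One reading that reproduces it, offered here as an assumption only, combines the 90% cache hit rate with the 10x cost reduction on cached requests: each hit saves 90% of its cost, and 90% of requests are hits. The function name `effective_savings` is hypothetical.

```python
def effective_savings(hit_rate: float, cost_reduction_factor: float) -> float:
    """Fraction of total cost saved when cache hits are cheaper by the given factor."""
    saving_per_hit = 1.0 - 1.0 / cost_reduction_factor  # 10x cheaper means 90% saved
    return hit_rate * saving_per_hit


# Assumed inputs taken from the feature list: 90% hit rate, 10x cheaper on a hit.
savings = effective_savings(hit_rate=0.90, cost_reduction_factor=10.0)
print(f"Estimated saving: {savings:.0%}")  # Estimated saving: 81%
```

Under this reading the saving scales with the hit rate, so workloads with few repeated prompts would see a smaller reduction.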
Requirements
- Python ≥ 3.8
- PyTorch ≥ 2.0.0 (for Flash Attention)
- transformers ≥ 4.30.0
- Other dependencies (auto-installed)
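A quick way to check an environment against the minimums listed above is sketched below. It uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper names `parse_version` and `check_requirements` are hypothetical, and the version parsing is deliberately simple rather than a full implementation of Python's version rules.

```python
import re
import sys
from importlib import metadata
from typing import Dict, Optional, Tuple

# Minimum versions copied from the requirements list above.
MINIMUMS = {"torch": (2, 0, 0), "transformers": (4, 30, 0)}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.1.0+cu118' into (2, 1, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_requirements() -> Dict[str, Optional[bool]]:
    """Return True/False per requirement, or None if the package is not installed."""
    report: Dict[str, Optional[bool]] = {"python": sys.version_info[:2] >= (3, 8)}
    for package, minimum in MINIMUMS.items():
        try:
            installed = parse_version(metadata.version(package))
        except metadata.PackageNotFoundError:
            report[package] = None
            continue
        report[package] = installed >= minimum
    return report


report = check_requirements()
print(report)
```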
License
MIT License
Citation
@software{nthuku_fast,
title={Nthuku-Fast: Efficient Multimodal Vision-Language Model},
author={Nthuku Team},
year={2025},
url={https://github.com/elijahnzeli1/Nthuku-fast_v2}
}
Links
- GitHub: https://github.com/elijahnzeli1/Nthuku-fast_v2
- Documentation: [Coming soon]
- HuggingFace: https://huggingface.co/Qybera/nthuku-fast-1.5
Download files
Source Distribution
nthuku_fast-0.1.2.tar.gz (39.2 kB)
Built Distribution
File details
Details for the file nthuku_fast-0.1.2.tar.gz.
File metadata
- Download URL: nthuku_fast-0.1.2.tar.gz
- Upload date:
- Size: 39.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 218d92d1330a76ab323c3eb7610e7c7dd9927481fea617dc1b1ab82cddcc37a6 |
| MD5 | 42f3f3e3f36ea3adeb3c9cd394a9621d |
| BLAKE2b-256 | 88e6d362ae3c78e9cecc733df80dae3423dbe8417abfd4f8df143dc6f320d5e8 |
File details
Details for the file nthuku_fast-0.1.2-py3-none-any.whl.
File metadata
- Download URL: nthuku_fast-0.1.2-py3-none-any.whl
- Upload date:
- Size: 39.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cd229b1ae28d2e59088570c6015f9553aadf95f257208d14470d497867b0b6a1 |
| MD5 | f3f24980acc0a7dde58b2b165e50b6cb |
| BLAKE2b-256 | 7cf0b7613878c36e6102052b98720cba9c8e92d418c4fa824dd087e77215da06 |
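The published digests can be used to verify a downloaded file. The sketch below computes a SHA256 digest with the standard library and compares it to the wheel's value listed above; the helper names `sha256_of` and `matches_published` are hypothetical, and the local path is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest to a published value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest of the wheel, copied from the file hashes listed above.
EXPECTED_WHEEL_SHA256 = "cd229b1ae28d2e59088570c6015f9553aadf95f257208d14470d497867b0b6a1"

# Assumed download location; adjust to wherever the file was saved.
wheel = Path("nthuku_fast-0.1.2-py3-none-any.whl")
if wheel.exists():
    print("OK" if matches_published(wheel, EXPECTED_WHEEL_SHA256) else "MISMATCH")
```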