
1.58-bit Quantization + Test-Time Training (TTT) Implementation in Pure Rust


Bit-TTT Engine: High-Performance Brain Core


1.58-bit Quantization + Test-Time Training (TTT) Implementation in Pure Rust.

✨ Features

  1. Fast: ~40 tokens/second on an RTX 4060 Ti GPU.
  2. Adaptive (TTT): updates fast weights during inference, so the model adapts to the text it sees.
  3. Pure Rust: high performance with minimal dependencies.
  4. Easy: load GGUF models directly.
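For context, the "1.58-bit" figure comes from ternary weights: each weight takes one of three values {-1, 0, +1}, and log2(3) ≈ 1.58 bits. Below is a minimal illustrative sketch of absmean ternary quantization in the BitNet b1.58 style; it is not this crate's actual kernel, just the idea.

```python
def quantize_ternary(weights):
    """Absmean ternary quantization: map floats to {-1, 0, +1} plus a scale.

    The scale is the mean absolute value of the tensor; each weight is
    divided by the scale, rounded, and clipped to the ternary range.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from ternary values and the scale."""
    return [v * scale for v in q]

w = [0.9, -0.05, -1.2, 0.4]
q, s = quantize_ternary(w)
print(q)  # [1, 0, -1, 1]
```

Small-magnitude weights collapse to 0, which is what pushes the effective information per weight down to ~1.58 bits.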

🚀 Installation

pip install bit-ttt-engine

💻 Quick Start (GGUF Models)

from cortex_rust import GgufModel

# Load model
model = GgufModel("model.gguf", tokenizer="tokenizer.json")

# Generate text
output = model.generate(
    "Hello, how are you?",
    max_tokens=50,
    temperature=0.7
)
print(output)

# Streaming output
model.generate_with_callback(
    "Tell me a story",
    lambda t: print(t, end="", flush=True),
    max_tokens=100
)
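GgufModel takes a path to a GGUF file. If you want to fail fast on a truncated or mislabeled download before handing the file to the engine, a small pre-flight helper of our own (not part of this package's API) can check the header: GGUF files begin with the 4-byte ASCII magic b"GGUF" followed by a little-endian uint32 format version.

```python
import struct

def looks_like_gguf(path):
    """Cheap sanity check: GGUF files start with the magic b"GGUF"
    and a little-endian uint32 version number."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != b"GGUF":
        return False
    (version,) = struct.unpack("<I", header[4:8])
    return version >= 1
```

This only validates the header, not the tensor payload, but it catches the common case of an HTML error page saved as `model.gguf`.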

🧠 TTT (Test-Time Training)

TTT lets the model adapt during inference: a small set of fast weights is updated as tokens are processed, so repeated passes over the same prompt can produce different outputs.

from cortex_rust import GgufModel

model = GgufModel("model.gguf", tokenizer="tokenizer.json")

# Enable TTT
model.enable_ttt(layers=4, learning_rate=0.1)

# Without TTT: Pass 1 == Pass 2 (same output)
# With TTT:    Pass 1 != Pass 2 (model is learning!)

out1 = model.generate("My name is Alice.", max_tokens=20)
out2 = model.generate("My name is Alice.", max_tokens=20)
print(f"Different: {out1 != out2}")  # True!

# TTT controls
model.disable_ttt()
model.reset_ttt_state()
print(model.ttt_enabled)  # False
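The pass-to-pass difference above is the core TTT mechanism: a small fast-weight module takes a gradient step on a self-supervised loss for every input it processes, and that state carries over between calls. Here is a minimal dependency-free sketch of the idea (illustrative only, not this engine's implementation; the layer shape and loss are our own choices):

```python
class TTTLayer:
    """Tiny fast-weight layer that learns at inference time.

    Every forward pass also takes one gradient step on the self-supervised
    reconstruction loss ||x @ W - x||^2, so the weights (and therefore the
    output for the same input) drift from call to call.
    """

    def __init__(self, dim, lr=0.1):
        # fast weights: identity plus a small fixed perturbation, so the
        # initial loss is nonzero and there is something to learn
        self.W = [[(1.0 if i == j else 0.0) + 0.05 * ((i + 2 * j) % 3 - 1)
                   for j in range(dim)] for i in range(dim)]
        self.lr = lr

    def forward(self, x):
        dim = len(x)
        y = [sum(x[i] * self.W[i][j] for i in range(dim)) for j in range(dim)]
        # dL/dW[i][j] = 2 * x[i] * (y[j] - x[j])  for L = ||x @ W - x||^2
        for i in range(dim):
            for j in range(dim):
                self.W[i][j] -= self.lr * 2.0 * x[i] * (y[j] - x[j])
        return y

layer = TTTLayer(dim=4, lr=0.05)
x = [1.0, 0.5, -0.5, 0.2]
out1 = layer.forward(x)
out2 = layer.forward(x)  # same input, but the weights have moved
print(out1 != out2)  # True: the layer adapted between the two passes
```

Re-initializing the layer restores deterministic behavior, which is conceptually what `reset_ttt_state()` does for the engine above.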

🏗️ Legacy API (BitLlama)

import cortex_rust

config = cortex_rust.BitLlamaConfig(
    vocab_size=32000,
    hidden_dim=512,
    num_layers=12,
    inner_lr=0.001
)

model = cortex_rust.BitLlama(
    config=config,
    checkpoint_path="path/to/model.safetensors",
    device="cpu",
    tokenizer_path="path/to/tokenizer.json"
)

output = model.generate(prompt="Hello!", max_tokens=50)

📖 Documentation

For more details, please visit the GitHub repository.

🙏 Acknowledgments

This project incorporates techniques inspired by the DroPE method published by Sakana AI.

💖 License

MIT License

Download files

Source distribution: bit_ttt_engine-0.6.2.tar.gz (239.5 kB), uploaded via maturin/1.11.3 (Trusted Publishing: no).

  • SHA256: 61ed2a1464ea46cd0ee5be278c1b85582658def9baade68e97c4f43387070728
  • MD5: ece99a0fe395e726a1a6d8772487f233
  • BLAKE2b-256: 5747e8dc1d79b54232ec1a114042c87cb4f7df83edac56e3cc00728e20d52435

Built distribution: bit_ttt_engine-0.6.2-cp310-cp310-win_amd64.whl (2.9 MB), CPython 3.10, Windows x86-64.

  • SHA256: b40e5aab032744ad4496f6d594d431d199c0476504721a42bf82fd21acfd51e4
  • MD5: 522358daa033fba53b5d10952f548a85
  • BLAKE2b-256: bd986730298bc69afed69f777a48a07a9771efac2a7d614825a36d22f8448b6a
