
TenMiNaTor: an ultralight Deep Learning framework with advanced Q1-Q8 quantization, NVFP4, ANT4, TurboQuant, and GGUF/Chip/Unikernel export


TenMiNaTor v2.0

Ultralight Deep Learning framework with advanced quantization

Python 3.9+ | License: MIT


Installation

# Core (numpy only)
pip install tenminator

# With training support (PyTorch + Transformers)
pip install tenminator[training]

# With Unsloth integration
pip install tenminator[unsloth]

# With the REST API
pip install tenminator[api]

# Everything included
pip install tenminator[all]

Quantization: supported formats

| Format | Bits | Technique | VRAM (7B) | Quality | Recommended use |
|---|---|---|---|---|---|
| Q8 | 8 | Symmetric INT8 | ~8 GB | ★★★★★ | Production, maximum quality |
| Q6 | 6 | Per-group INT6 | ~6 GB | ★★★★★ | High quality, moderate savings |
| Q5 | 5 | Asymmetric INT5 | ~5 GB | ★★★★☆ | Quality/size balance |
| Q4_K_M | 4 | Asymmetric INT4 | ~4 GB | ★★★★☆ | Recommended: general use |
| Q3_TurboQuant | 3 | Hadamard + INT3 | ~3 GB | ★★★☆☆ | GPUs with 8 GB of VRAM |
| Q2_KIVI | 2 | KV cache + outliers | ~2 GB | ★★☆☆☆ | Edge devices |
| Q1_BitNet | 1 | Ternary {-1, 0, 1} | ~1 GB | ★☆☆☆☆ | Research, specialized chips |
| NVFP4 | 4 | Blackwell Float4 | ~4 GB | ★★★★☆ | NVIDIA RTX 50xx / Vera Rubin |
| ANT4 | 4 | Adaptive Numerical | ~4 GB | ★★★★★ | NVIDIA Vera Rubin (ANT hardware) |
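The "Technique" column refers to standard quantization schemes. For intuition, the symmetric INT8 scheme behind Q8 can be sketched in plain numpy; this is an illustrative reference, not TenMiNaTor's actual implementation:

```python
import numpy as np

def quantize_int8_symmetric(w: np.ndarray):
    """Symmetric INT8: map [-max|w|, +max|w|] linearly onto [-127, 127]."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct approximate float32 weights from INT8 codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8_symmetric(w)
w_hat = dequantize_int8(q, scale)
print("compression:", w.nbytes / q.nbytes)       # 4.0x vs float32
print("max abs error:", np.abs(w - w_hat).max()) # bounded by scale / 2
```

Because rounding is to the nearest code, the per-element error is bounded by half a quantization step, which is why 8-bit weights lose so little quality.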

Quick start

Quantize a model

from tenminator import Quantizer, QuantConfig
import numpy as np

# Simulate model weights
weights = {"layer1.weight": np.random.randn(4096, 4096).astype(np.float32)}

# Quantize to Q4 (recommended)
q = Quantizer(QuantConfig(bits=4))
quantized = q.quantize_model(weights)

# View statistics
for name, qt in quantized.items():
    print(f"{name}: {qt.format_name}, {qt.data.nbytes / 1024:.1f} KB")

# Measure the error
error = q.measure_error(weights["layer1.weight"], quantized["layer1.weight"])
print(f"MSE: {error['mse']:.6f} | Cosine: {error['cosine_similarity']:.4f}")
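The internals of `quantize_model` are not shown here; as a reference for what the Q4_K_M row of the table calls "Asymmetric INT4", a per-group scheme can be sketched in numpy. The function names below are illustrative, not part of the library:

```python
import numpy as np

def quantize_int4_asymmetric(w: np.ndarray, group_size: int = 32):
    """Per-group asymmetric INT4: each group gets its own scale and
    zero-point, mapping the group's [min, max] onto codes [0, 15]."""
    groups = w.reshape(-1, group_size)
    lo = groups.min(axis=1, keepdims=True)
    hi = groups.max(axis=1, keepdims=True)
    scale = (hi - lo) / 15.0
    scale[scale == 0] = 1.0  # guard against constant groups
    q = np.clip(np.round((groups - lo) / scale), 0, 15).astype(np.uint8)
    return q, scale, lo

def dequantize_int4(q, scale, lo, shape):
    """Reconstruct approximate float32 weights from the 4-bit codes."""
    return (q.astype(np.float32) * scale + lo).reshape(shape)

w = np.random.randn(128, 128).astype(np.float32)
q, scale, lo = quantize_int4_asymmetric(w)
w_hat = dequantize_int4(q, scale, lo, w.shape)
print("max error:", np.abs(w - w_hat).max())  # at most half a step per group
```

Small groups keep the quantization step small even when a few weights in a row are outliers, which is the usual reason group-wise schemes beat per-tensor ones at 4 bits.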

TurboQuant Q3 (Google, ICLR 2026)

from tenminator.quantization.formats import quantize_q3_turbo, dequantize
import numpy as np

weights = np.random.randn(2048, 2048).astype(np.float32)

# Quantize with a Hadamard rotation
qt = quantize_q3_turbo(weights, group_size=32)
print(f"Compression: {weights.nbytes / qt.data.nbytes:.1f}x")

# Reconstruct
reconstructed = dequantize(qt)
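The exact rotation used by `quantize_q3_turbo` is not documented here, but the general idea of Hadamard-based quantization is that an orthonormal rotation spreads an outlier's energy evenly across coordinates, flattening the value distribution before low-bit rounding. A minimal sketch, assuming the standard Sylvester construction:

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2),
    normalized so that h @ h.T == I (an orthonormal rotation)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(n)

# A vector with one large outlier is hard to quantize directly
x = np.zeros(64, dtype=np.float32)
x[0] = 100.0

h = hadamard(64)
y = h @ x  # the rotation spreads the outlier across all 64 coordinates
print("max |x|:", np.abs(x).max())  # 100.0
print("max |y|:", np.abs(y).max())  # 12.5 (= 100 / sqrt(64))
# The rotation is exactly invertible: x == h.T @ y
```

After the rotation, the dynamic range is far smaller, so a 3-bit grid covers the values with much less clipping error; the inverse rotation is applied after dequantization.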

Q1 BitNet (1-bit)

from tenminator.quantization.formats import quantize_q1_bitnet, dequantize
import numpy as np

weights = np.random.randn(2048, 2048).astype(np.float32)

# Ternary quantization: {-1, 0, 1}
qt = quantize_q1_bitnet(weights)
print(f"Format: {qt.format_name}")
print(f"Unique values: {np.unique(qt.data)}")  # [-1, 0, 1]

Export to GGUF

from tenminator.quantization.export import GGUFExporter

exporter = GGUFExporter()
path = exporter.export(quantized_model, "/tmp/modelo.gguf", arch="llama")
print(f"GGUF saved: {path}")

Export for a chip (Taalas / unikernel)

from tenminator.quantization.export import ChipExporter, UnikernelExporter

# For the Taalas chip
chip = ChipExporter()
path = chip.export(quantized_model, "/tmp/modelo.chip.bin", target_chip="taalas-v1")
area = chip.estimate_silicon_area(quantized_model)
print(f"Estimated area: {area['estimated_area_mm2']:.2f} mm²")

# For a unikernel (NanoVMs / Unikraft / Firecracker)
uni = UnikernelExporter()
path = uni.export(quantized_model, "/tmp/modelo.uni.bin", runtime="nanovms")

Training

from tenminator import TrainingController, TrainingConfig

config = TrainingConfig(
    model_name="mi-modelo",
    learning_rate=1e-4,
    batch_size=4,
    max_steps=1000,
)

controller = TrainingController(config)
controller.start()

Integration with the yoqer ecosystem

TERMINATORI (inference)

from tenminator import TerminatoriBridge

bridge = TerminatoriBridge(base_url="http://localhost:8000")
response = bridge.chat("Explain Q4 quantization")
print(response["content"])

TerminaTodo (storage)

from tenminator import TerminaTodoBridge

storage = TerminaTodoBridge(base_url="http://localhost:8001")
url = storage.upload_model("/tmp/modelo.gguf", "modelos/mi-modelo-q4.gguf")

Unsloth (efficient fine-tuning)

from tenminator import UnslothBridge

bridge = UnslothBridge()
bridge.finetune(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    dataset="mi_dataset.jsonl",
    output_dir="./modelo_finetuned",
    max_steps=500,
)

LangChain

from tenminator import LangChainRunnable

llm = LangChainRunnable(base_url="http://localhost:8000")
result = llm.invoke("What is TenMiNaTor?")

CLI

# System information
tenminator info

# Recommend a quantization format
tenminator recommend --vram 16 --model-size 7 --quality high

# Quantize a model
tenminator quantize --model modelo.safetensors --bits 4 --output modelo_q4.gguf

# Export for a unikernel
tenminator export --model modelo_q4.bin --format unikernel --runtime nanovms
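The logic behind `tenminator recommend` is not documented here, but it can be approximated from the formats table: pick the highest-quality format whose weights fit in the available VRAM, leaving headroom for activations and the KV cache. A hypothetical sketch (the bytes-per-parameter figures follow directly from the bit widths):

```python
# Format name -> approximate bytes per parameter (bits / 8)
FORMATS = [
    ("Q8", 1.0), ("Q6", 0.75), ("Q5", 0.625), ("Q4_K_M", 0.5),
    ("Q3_TurboQuant", 0.375), ("Q2_KIVI", 0.25), ("Q1_BitNet", 0.125),
]

def recommend(vram_gb: float, model_size_b: float, headroom: float = 1.2):
    """Return the highest-quality format whose weights (model size in
    billions of parameters times bytes per parameter, plus headroom for
    activations and the KV cache) fit in the given VRAM."""
    for name, bytes_per_param in FORMATS:
        if model_size_b * bytes_per_param * headroom <= vram_gb:
            return name
    return None

print(recommend(vram_gb=16, model_size_b=7))  # 'Q8' (7 GB * 1.2 fits in 16 GB)
print(recommend(vram_gb=4, model_size_b=7))   # 'Q3_TurboQuant'
```

The 1.2x headroom factor is an assumption for illustration; the CLI's actual heuristic may weigh quality and context length differently.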

yoqer ecosystem

| Package | PyPI | Description |
|---|---|---|
| tenminator | `pip install tenminator` | This library: training and quantization |
| terminatori | `pip install terminatori` | Inference engine with a web dashboard |
| terminatodo | `pip install terminatodo` | Multi-cloud storage management |
| terminator | `pip install terminator-yoqer` | Advanced AI framework |
| teminaTor | `pip install teminaTor` | Lightweight AI framework |

License

MIT © yoqer
