# CapibaraModel CLI

A State Space and Mamba-based language model framework.

CapibaraModel is a command-line tool for training, evaluating, and deploying language models based on State Space and Mamba architectures, optimized for TPUs and featuring advanced hyperparameter optimization.

## 🚀 Key Features
- **Advanced Architectures:**
  - BitNet + Liquid Architecture
  - Aleph-TILDE Module Integration
  - Mamba SSM Architecture
  - Capibara JAX SSM Implementation
- **Core Capabilities:**
  - Model training and evaluation
  - Native TPU/GPU support
  - Automatic hyperparameter optimization
  - Integrated deployment system
  - Performance measurement
  - Docker containers (optional)
  - Weights & Biases integration
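The State Space (SSM) layers named above are built on a linear recurrence over the sequence: `h_t = A·h_{t-1} + B·x_t`, `y_t = C·h_t`. A minimal scalar sketch of that scan, for illustration only (the actual layers use learned matrices and JAX scans):

```python
# Illustrative discrete state-space scan with a scalar state.
# Real SSM/Mamba layers use learned matrices A, B, C and run the
# scan in JAX; this only shows the recurrence itself.
def ssm_scan(xs, A=0.9, B=0.1, C=1.0):
    h, ys = 0.0, []
    for x in xs:
        h = A * h + B * x  # state update
        ys.append(C * h)   # readout
    return ys

# An impulse input decays geometrically through the state:
ys = ssm_scan([1.0, 0.0, 0.0])  # ~[0.1, 0.09, 0.081]
```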
## 📋 Requirements
- Python 3.9+
- JAX 0.4.13+
- CUDA 11.8+ (for GPU)
- TensorFlow 2.13+
- Weights & Biases
- Docker (optional)
## 🛠️ Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/anachroni-io/CapibaraModel-cli.git
   cd CapibaraModel-cli
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Set up Weights & Biases:

   ```bash
   wandb login
   ```
## 📖 Documentation

Full documentation is available on Read the Docs:
- Quick start guide
- Complete tutorial
- API reference
- Usage examples
- Contribution guide
## 💻 Usage

```bash
capibara [options]

# Basic training
capibara --train

# Evaluation with a specific layer
capibara --evaluate --new-layer BitNetLiquid

# Hyperparameter optimization with a sub-model
capibara --optimize --sub-model AlephTilde
```
### Available Options

| Option | Description |
|---|---|
| `--log-level` | Logging level (`DEBUG`, `INFO`, `WARNING`, `ERROR`) |
| `--train` | Train the model |
| `--evaluate` | Evaluate the model |
| `--optimize` | Run hyperparameter optimization |
| `--deploy` | Deploy the model |
| `--measure-performance` | Measure performance |
| `--model` | Path to the model YAML file |
| `--new-layer` | Activate new layers |
| `--sub-model` | Specify sub-models |
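As a rough sketch, flags like these are commonly wired up with `argparse`; the snippet below is illustrative only and does not reflect CapibaraModel's actual entry point.

```python
# Hypothetical sketch of a CLI parser for the flags listed above.
# The real capibara entry point may be structured differently.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="capibara")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    parser.add_argument("--train", action="store_true", help="Train the model")
    parser.add_argument("--evaluate", action="store_true", help="Evaluate the model")
    parser.add_argument("--optimize", action="store_true",
                        help="Run hyperparameter optimization")
    parser.add_argument("--model", help="Path to the model YAML file")
    parser.add_argument("--new-layer", help="Activate a new layer by name")
    parser.add_argument("--sub-model", help="Specify a sub-model by name")
    return parser

args = build_parser().parse_args(["--evaluate", "--new-layer", "BitNetLiquid"])
# args.evaluate is True, args.new_layer == "BitNetLiquid"
```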
## ⚙️ Configuration

Example model configuration (YAML):

```yaml
model:
  name: "capibara-ent"
  version: "2.0"
  layers:
    - type: "BitNetLiquid"
      config:
        hidden_size: 768
        num_heads: 12
    - type: "AlephTilde"
      config:
        rule_format: "prolog"
        min_confidence: 0.8
```
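A minimal sketch of reading a layer list out of a config with this shape. The structure below mirrors the YAML example; the project's actual schema and loader may differ, and `LayerSpec`/`parse_model_config` are illustrative names.

```python
# Hypothetical validation of the configuration structure shown above.
from dataclasses import dataclass

@dataclass
class LayerSpec:
    type: str
    config: dict

def parse_model_config(raw: dict) -> list:
    """Extract layer specs from a parsed config dict, requiring a 'type'."""
    layers = []
    for entry in raw["model"]["layers"]:
        if "type" not in entry:
            raise ValueError("each layer needs a 'type'")
        layers.append(LayerSpec(entry["type"], entry.get("config", {})))
    return layers

# The dict below matches the YAML example (as e.g. pyyaml would parse it).
raw = {"model": {"name": "capibara-ent", "version": "2.0", "layers": [
    {"type": "BitNetLiquid", "config": {"hidden_size": 768, "num_heads": 12}},
    {"type": "AlephTilde", "config": {"rule_format": "prolog", "min_confidence": 0.8}},
]}}
layers = parse_model_config(raw)
# [l.type for l in layers] == ["BitNetLiquid", "AlephTilde"]
```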
## 🧪 Testing

```bash
# Unit tests
pytest tests/

# Integration tests
pytest tests/integration/

# Verify documentation examples
sphinx-build -b doctest docs/source/ docs/build/
```
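For reference, a test placed under `tests/` follows the standard pytest convention of `test_*` functions with bare assertions. The function under test here is purely illustrative, not part of the CapibaraModel API:

```python
# Hypothetical unit test in the style pytest collects from tests/.
def clamp_confidence(value: float) -> float:
    """Clamp a rule-confidence score into the valid [0.0, 1.0] range."""
    return max(0.0, min(1.0, value))

def test_clamp_confidence():
    assert clamp_confidence(0.8) == 0.8
    assert clamp_confidence(1.7) == 1.0
    assert clamp_confidence(-0.2) == 0.0

test_clamp_confidence()  # pytest would discover and run this by name
```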
## 📝 Citation

```bibtex
@software{capibara2024,
  author    = {Durán, Marco},
  title     = {CapibaraModel: A Large Language Model Framework},
  year      = {2024},
  publisher = {GitHub},
  url       = {https://github.com/anachroni-io/CapibaraModel-cli}
}
```
## 📄 License
Distributed under the MIT License. See LICENSE for more information.
## 📫 Contact
Marco Durán - marco@anachroni.co
## File details

Details for the file `capibara_model-1.2.2.tar.gz`.

### File metadata

- Download URL: capibara_model-1.2.2.tar.gz
- Upload date:
- Size: 98.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.12

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `104576b33a4891887c1b24816e6dea8e4fe6f2953eebacd283b6f30b814e73a3` |
| MD5 | `509851d1ee3739554c71d367f98836e2` |
| BLAKE2b-256 | `502805cf6d905164c3fca2f86a01a3af4e435cee6bc84f3e055da8d447cf03c0` |
## File details

Details for the file `capibara_model-1.2.2-py3-none-any.whl`.

### File metadata

- Download URL: capibara_model-1.2.2-py3-none-any.whl
- Upload date:
- Size: 6.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.12

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `993453f4fe809729d21ea4acce09ccf02f9b4501c8fdfeb8de9e1fc1cb7ac37e` |
| MD5 | `0367672af98943fce2bc5c93332f82f7` |
| BLAKE2b-256 | `bbbe5e3e04e47a0e2ba545ae55b690cdaccfcd9e5ac2909ff7e8ae4bba29fc27` |