# CapibaraModel CLI

A State Space and Mamba-based language model framework.

CapibaraModel is a command-line tool for training, evaluating, and deploying language models based on State Space and Mamba architectures. It is optimized for TPUs and includes advanced hyperparameter optimization.
## 🚀 Key Features

- **Advanced Architectures:**
  - BitNet + Liquid Architecture
  - Aleph-TILDE Module Integration
  - Mamba SSM Architecture
  - Capibara JAX SSM Implementation
- **Core Capabilities:**
  - Model training and evaluation
  - Native TPU/GPU support
  - Automatic hyperparameter optimization
  - Integrated deployment system
  - Performance measurement
  - Docker containers (optional)
  - Weights & Biases integration
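The SSM layers listed above are built on the standard linear state-space recurrence h_t = A·h_{t-1} + B·x_t, y_t = C·h_t. The following NumPy sketch illustrates that recurrence in its simplest (non-selective) form; the function and parameter names are illustrative, not CapibaraModel's actual API:

```python
import numpy as np

def ssm_scan(A, B, C, x):
    """Minimal linear state-space recurrence:
    h_t = A @ h_{t-1} + B * x_t,  y_t = C @ h_t.
    Illustrative sketch only -- real Mamba layers use input-dependent
    (selective) parameters and a parallel scan."""
    d_state = A.shape[0]
    h = np.zeros(d_state)
    ys = []
    for x_t in x:
        h = A @ h + B * x_t   # state update
        ys.append(C @ h)      # readout
    return np.array(ys)

# tiny example: 2-dimensional state, 3-step scalar input sequence
A = np.eye(2) * 0.9
B = np.ones(2)
C = np.ones(2)
y = ssm_scan(A, B, C, np.array([1.0, 0.0, 0.0]))
```

Because the recurrence is linear in the state, production implementations replace the Python loop with an associative scan that parallelizes over the sequence dimension.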
## 📋 Requirements

- Python 3.9+
- JAX 0.4.13+
- CUDA 11.8+ (for GPU support)
- TensorFlow 2.13+
- Weights & Biases
- Docker (optional)
## 🛠️ Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/anachroni-io/CapibaraModel-cli.git
   cd CapibaraModel-cli
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Set up Weights & Biases:

   ```bash
   wandb login
   ```
## 📖 Documentation

Full documentation is available on Read the Docs:

- Quick start guide
- Complete tutorial
- API reference
- Usage examples
- Contribution guide
## 💻 Usage

```bash
capibara [options]

# Basic training
capibara --train

# Evaluation with a specific layer
capibara --evaluate --new-layer BitNetLiquid

# Optimization with a sub-model
capibara --optimize --sub-model AlephTilde
```
### Available Options

- `--log-level`: Logging level (DEBUG, INFO, WARNING, ERROR)
- `--train`: Train the model
- `--evaluate`: Evaluate the model
- `--optimize`: Run hyperparameter optimization
- `--deploy`: Deploy the model
- `--measure-performance`: Measure performance
- `--model`: Path to the model YAML file
- `--new-layer`: Activate new layers
- `--sub-model`: Specify sub-models
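For reference, the documented flags map naturally onto an `argparse` parser. This is an illustrative mirror of the option surface above, not the CLI's actual source:

```python
import argparse

# Illustrative parser mirroring the documented `capibara` flags.
parser = argparse.ArgumentParser(prog="capibara")
parser.add_argument("--log-level",
                    choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                    default="INFO", help="Logging level")
parser.add_argument("--train", action="store_true", help="Train the model")
parser.add_argument("--evaluate", action="store_true", help="Evaluate the model")
parser.add_argument("--optimize", action="store_true",
                    help="Run hyperparameter optimization")
parser.add_argument("--deploy", action="store_true", help="Deploy the model")
parser.add_argument("--measure-performance", action="store_true",
                    help="Measure performance")
parser.add_argument("--model", help="Path to the model YAML file")
parser.add_argument("--new-layer", help="Activate new layers")
parser.add_argument("--sub-model", help="Specify sub-models")

# Equivalent of: capibara --evaluate --new-layer BitNetLiquid
args = parser.parse_args(["--evaluate", "--new-layer", "BitNetLiquid"])
```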
⚙️ Configuration
model:
name: "capibara-ent"
version: "2.0"
layers:
- type: "BitNetLiquid"
config:
hidden_size: 768
num_heads: 12
- type: "AlephTilde"
config:
rule_format: "prolog"
min_confidence: 0.8
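A configuration like the one above would typically be sanity-checked before training. This sketch uses plain dicts standing in for the parsed YAML and a hypothetical helper name; it is not part of CapibaraModel's API:

```python
def validate_config(config: dict) -> None:
    """Sanity-check a parsed model config (hypothetical helper)."""
    for layer in config["model"]["layers"]:
        cfg = layer["config"]
        if layer["type"] == "BitNetLiquid":
            # attention-style layers need hidden_size divisible by num_heads
            if cfg["hidden_size"] % cfg["num_heads"] != 0:
                raise ValueError("hidden_size must be divisible by num_heads")
        elif layer["type"] == "AlephTilde":
            # a rule-confidence threshold must be a probability
            if not 0.0 <= cfg["min_confidence"] <= 1.0:
                raise ValueError("min_confidence must lie in [0, 1]")

# plain-dict equivalent of the YAML above
config = {
    "model": {
        "name": "capibara-ent",
        "version": "2.0",
        "layers": [
            {"type": "BitNetLiquid",
             "config": {"hidden_size": 768, "num_heads": 12}},
            {"type": "AlephTilde",
             "config": {"rule_format": "prolog", "min_confidence": 0.8}},
        ],
    }
}
validate_config(config)  # 768 % 12 == 0 and 0.8 is in range, so this passes
```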
## 🧪 Testing

```bash
# Unit tests
pytest tests/

# Integration tests
pytest tests/integration/

# Verify documentation examples
sphinx-build -b doctest docs/source/ docs/build/
```
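A unit test under `tests/` might look like the following sketch; the function under test is a stand-in, not part of CapibaraModel's real API:

```python
# tests/test_confidence.py -- hypothetical example of a test picked up by `pytest tests/`

def clamp_confidence(value: float) -> float:
    """Clamp a rule-confidence score into [0.0, 1.0] (stand-in function)."""
    return max(0.0, min(1.0, value))

def test_clamp_confidence():
    assert clamp_confidence(0.8) == 0.8   # in-range values pass through
    assert clamp_confidence(1.5) == 1.0   # values above 1 are clamped down
    assert clamp_confidence(-0.2) == 0.0  # negative values are clamped up
```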
📝 Citation
@software{capibara2024,
author = {Durán, Marco},
title = {CapibaraModel: A Large Language Model Framework},
year = {2024},
publisher = {GitHub},
url = {https://github.com/anachroni-io/CapibaraModel-cli}
}
## 📄 License

Distributed under the MIT License. See `LICENSE` for more information.
## 📫 Contact

Marco Durán - marco@anachroni.co