A State Space and Mamba-based language model framework

CapibaraModel CLI

Capibara SSBD Model

CapibaraModel is a command-line tool for training, evaluating, and deploying language models based on State Space and Mamba architectures. It is optimized for TPUs, also runs on GPUs, and includes automatic hyperparameter optimization.

🚀 Key Features

  • Advanced Architectures:

    • BitNet + Liquid Architecture
    • Aleph-TILDE Module Integration
    • Mamba SSM Architecture
    • Capibara JAX SSM Implementation
  • Core Capabilities:

    • Model training and evaluation
    • Native TPU/GPU support
    • Automatic hyperparameter optimization
    • Integrated deployment system
    • Performance measurement
    • Docker containers (optional)
    • Weights & Biases integration

📋 Requirements

  • Python 3.9+
  • JAX 0.4.13+
  • CUDA 11.8+ (for GPU)
  • TensorFlow 2.13+
  • Weights & Biases
  • Docker (optional)
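
The version requirements above can be checked before installation with a short script. This is a minimal sketch, not part of CapibaraModel: the helper names (`version_tuple`, `check_requirements`) are hypothetical, and it assumes the packages are installed under the distribution names `jax` and `tensorflow`.

```python
import sys
from importlib import metadata
from itertools import takewhile

# Minimum versions copied from the requirements list above.
# The distribution names used as keys are assumptions.
MINIMUMS = {"jax": (0, 4, 13), "tensorflow": (2, 13)}


def version_tuple(text: str) -> tuple:
    """Turn '0.4.13' into (0, 4, 13); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements() -> list:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 9):
        problems.append("Python 3.9+ is required")
    for name, minimum in MINIMUMS.items():
        try:
            installed = version_tuple(metadata.version(name))
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if installed < minimum:
            wanted = ".".join(map(str, minimum))
            problems.append(f"{name} is older than {wanted}")
    return problems


for problem in check_requirements():
    print(problem)
```

The script prints nothing when every check passes. CUDA and Docker are not covered because they are not Python packages.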

🛠️ Installation

  1. Clone this repository:

    git clone https://github.com/anachroni-io/CapibaraModel-cli.git
    cd CapibaraModel-cli
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Set up Weights & Biases:

    wandb login
    

📖 Documentation

Full documentation is available on Read the Docs:

  • Quick start guide
  • Complete tutorial
  • API reference
  • Usage examples
  • Contribution guide

💻 Usage

capibara [options]

# Basic training
capibara --train

# Evaluation with a specific layer
capibara --evaluate --new-layer BitNetLiquid

# Optimization with a specific sub-model
capibara --optimize --sub-model AlephTilde

Available Options

  • --log-level: Logging level (DEBUG, INFO, WARNING, or ERROR)
  • --train: Train the model
  • --evaluate: Evaluate the model
  • --optimize: Run hyperparameter optimization
  • --deploy: Deploy the model
  • --measure-performance: Measure model performance
  • --model: Path to the model YAML file
  • --new-layer: Layer to activate (for example, BitNetLiquid)
  • --sub-model: Sub-model to use (for example, AlephTilde)
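
To illustrate how these options fit together, here is a minimal `argparse` sketch that mirrors the documented flags. It is not the project's actual parser: the `build_parser` function, the `INFO` default for `--log-level`, and the choice to treat the action flags as simple booleans are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="capibara")
    parser.add_argument(
        "--log-level",
        default="INFO",  # assumed default; the README does not state one
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
        help="Logging level",
    )
    # Action flags: assumed to be plain on/off switches.
    actions = {
        "--train": "Train the model",
        "--evaluate": "Evaluate the model",
        "--optimize": "Run hyperparameter optimization",
        "--deploy": "Deploy the model",
        "--measure-performance": "Measure model performance",
    }
    for flag, help_text in actions.items():
        parser.add_argument(flag, action="store_true", help=help_text)
    # Options that take a value, as in the usage examples.
    parser.add_argument("--model", help="Path to the model YAML file")
    parser.add_argument("--new-layer", help="Layer to activate")
    parser.add_argument("--sub-model", help="Sub-model to use")
    return parser


# Parse the second usage example from the README.
args = build_parser().parse_args(["--evaluate", "--new-layer", "BitNetLiquid"])
print(args.evaluate, args.new_layer)  # prints: True BitNetLiquid
```

Note that `argparse` exposes hyphenated flags with underscores, so `--new-layer` becomes `args.new_layer`.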

⚙️ Configuration

model:
  name: "capibara-ent"
  version: "2.0"
  layers:
    - type: "BitNetLiquid"
      config:
        hidden_size: 768
        num_heads: 12
    - type: "AlephTilde"
      config:
        rule_format: "prolog"
        min_confidence: 0.8
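
A configuration like this can be sanity-checked before it is passed to the tool. The sketch below works on the Python mapping a YAML parser would return for the file above. The `check_config` function and both rules it enforces (`hidden_size` divisible by `num_heads`, `min_confidence` between 0 and 1) are assumptions for illustration, not documented CapibaraModel behavior.

```python
from typing import Any

# Python mapping equivalent to the YAML above
# (what a YAML parser would return after loading the file).
config = {
    "model": {
        "name": "capibara-ent",
        "version": "2.0",
        "layers": [
            {"type": "BitNetLiquid",
             "config": {"hidden_size": 768, "num_heads": 12}},
            {"type": "AlephTilde",
             "config": {"rule_format": "prolog", "min_confidence": 0.8}},
        ],
    }
}


def check_config(cfg: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    model = cfg.get("model", {})
    for key in ("name", "version", "layers"):
        if key not in model:
            problems.append(f"missing model.{key}")
    for index, layer in enumerate(model.get("layers", [])):
        params = layer.get("config", {})
        if layer.get("type") == "BitNetLiquid":
            hidden = params.get("hidden_size")
            heads = params.get("num_heads")
            # Assumed rule: each head gets an equal share of hidden_size.
            valid = (isinstance(hidden, int) and isinstance(heads, int)
                     and heads > 0 and hidden % heads == 0)
            if not valid:
                problems.append(f"layer {index}: hidden_size must divide by num_heads")
        elif layer.get("type") == "AlephTilde":
            confidence = params.get("min_confidence")
            # Assumed rule: confidence is a probability-like value.
            valid = (isinstance(confidence, (int, float))
                     and 0.0 <= confidence <= 1.0)
            if not valid:
                problems.append(f"layer {index}: min_confidence must be in [0, 1]")
    return problems


print(check_config(config))  # prints: []
```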

🧪 Testing

# Unit tests
pytest tests/

# Integration tests
pytest tests/integration/

# Run the doctests embedded in the documentation
sphinx-build -b doctest docs/source/ docs/build/

📝 Citation

@software{capibara2024,
  author = {Durán, Marco},
  title = {CapibaraModel: A Large Language Model Framework},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/anachroni-io/CapibaraModel-cli}
}

📄 License

Distributed under the MIT License. See LICENSE for more information.

📫 Contact

Marco Durán - marco@anachroni.co

Website | GitHub
