
CapibaraModel CLI

A State Space and Mamba-based language model framework

CapibaraModel is a command-line tool for training, evaluating, and deploying language models based on State Space and Mamba architectures. It is optimized for TPUs and includes advanced hyperparameter optimization.

🚀 Key Features

  • Advanced Architectures:

    • BitNet + Liquid Architecture
    • Aleph-TILDE Module Integration
    • Mamba SSM Architecture
    • Capibara JAX SSM Implementation
  • Core Capabilities:

    • Model training and evaluation
    • Native TPU/GPU support
    • Automatic hyperparameter optimization
    • Integrated deployment system
    • Performance measurement
    • Docker containers (optional)
    • Weights & Biases integration
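As a rough illustration of the State Space idea behind the Mamba-style layers listed above, here is a minimal scalar SSM recurrence in plain Python. The coefficients and the scan are illustrative only, not the project's implementation (real SSM layers use learned matrices and hardware-aware scans):

```python
# Minimal discrete state-space model (SSM) recurrence:
#   h_t = A * h_{t-1} + B * x_t
#   y_t = C * h_t
# Scalar state for clarity; Mamba-style layers generalize this to
# learned matrices with input-dependent (selective) parameters.

def ssm_scan(xs, A=0.9, B=1.0, C=0.5):
    """Run the SSM recurrence over an input sequence and return the outputs."""
    h = 0.0
    ys = []
    for x in xs:
        h = A * h + B * x   # state update
        ys.append(C * h)    # readout
    return ys
```

Feeding an impulse, e.g. `ssm_scan([1.0, 0.0, 0.0])`, shows the exponentially decaying state that gives SSMs their long-range memory.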

📋 Requirements

  • Python 3.9+
  • JAX 0.4.13+
  • CUDA 11.8+ (for GPU)
  • TensorFlow 2.13+
  • Weights & Biases
  • Docker (optional)

🛠️ Installation

  1. Clone this repository:

    git clone https://github.com/anachroni-io/CapibaraModel-cli.git
    cd CapibaraModel-cli
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Set up Weights & Biases:

    wandb login
    

📖 Documentation

Full documentation is available on Read the Docs:

  • Quick start guide
  • Complete tutorial
  • API reference
  • Usage examples
  • Contribution guide

💻 Usage

capibara [options]

# Basic training
capibara --train

# Evaluation with specific layer
capibara --evaluate --new-layer BitNetLiquid

# Optimization with sub-model
capibara --optimize --sub-model AlephTilde

Available Options

  • --log-level: Logging level (DEBUG, INFO, WARNING, ERROR)
  • --train: Train model
  • --evaluate: Evaluate model
  • --optimize: Hyperparameter optimization
  • --deploy: Deploy model
  • --measure-performance: Measure performance
  • --model: Path to model YAML file
  • --new-layer: Activate a new layer by name (e.g. BitNetLiquid)
  • --sub-model: Specify a sub-model by name (e.g. AlephTilde)
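As a sketch of how these options fit together, the following argparse parser mirrors the documented flags. Only the flag names come from the list above; defaults, choices, and help strings are illustrative assumptions, not the tool's actual implementation:

```python
import argparse

def build_parser():
    """Build a parser mirroring the documented capibara CLI options (sketch)."""
    p = argparse.ArgumentParser(prog="capibara")
    p.add_argument("--log-level", choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                   default="INFO", help="Logging level")
    p.add_argument("--train", action="store_true", help="Train model")
    p.add_argument("--evaluate", action="store_true", help="Evaluate model")
    p.add_argument("--optimize", action="store_true", help="Hyperparameter optimization")
    p.add_argument("--deploy", action="store_true", help="Deploy model")
    p.add_argument("--measure-performance", action="store_true", help="Measure performance")
    p.add_argument("--model", help="Path to model YAML file")
    p.add_argument("--new-layer", help="Activate a new layer by name")
    p.add_argument("--sub-model", help="Specify a sub-model by name")
    return p

# Example: parse the evaluation command shown above.
args = build_parser().parse_args(["--evaluate", "--new-layer", "BitNetLiquid"])
```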

⚙️ Configuration

Example model configuration YAML (passed via --model):

model:
  name: "capibara-ent"
  version: "2.0"
  layers:
    - type: "BitNetLiquid"
      config:
        hidden_size: 768
        num_heads: 12
    - type: "AlephTilde"
      config:
        rule_format: "prolog"
        min_confidence: 0.8
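To show the shape this configuration takes once loaded (e.g. what yaml.safe_load would produce), here is a small validation sketch over the equivalent Python dict. The key names mirror the YAML sample above; the validate_config helper is hypothetical, not part of the tool:

```python
def validate_config(cfg):
    """Check the minimal structure of a model configuration dict (illustrative)."""
    assert "model" in cfg, "missing top-level 'model' key"
    model = cfg["model"]
    for key in ("name", "version", "layers"):
        assert key in model, f"missing model.{key}"
    for layer in model["layers"]:
        assert "type" in layer and "config" in layer, "each layer needs type and config"
    return True

# The sample YAML above, as the dict a YAML loader would produce.
config = {
    "model": {
        "name": "capibara-ent",
        "version": "2.0",
        "layers": [
            {"type": "BitNetLiquid",
             "config": {"hidden_size": 768, "num_heads": 12}},
            {"type": "AlephTilde",
             "config": {"rule_format": "prolog", "min_confidence": 0.8}},
        ],
    }
}
```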

🧪 Testing

# Unit tests
pytest tests/

# Integration tests
pytest tests/integration/

# Verify documentation
sphinx-build -b doctest docs/source/ docs/build/

📝 Citation

@software{capibara2024,
  author = {Durán, Marco},
  title = {CapibaraModel: A Large Language Model Framework},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/anachroni-io/CapibaraModel-cli}
}

📄 License

Distributed under the MIT License. See LICENSE for more information.

📫 Contact

Marco Durán - marco@anachroni.co

Website | GitHub

Project details

Download files

Source distribution: capibara_model-1.2.2.tar.gz (98.4 kB)
Built distribution: capibara_model-1.2.2-py3-none-any.whl (6.1 kB)

