Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"

Introduction

This is an unofficial implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models".
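
For readers unfamiliar with the paper, the core idea can be sketched in a few lines of plain Python. This is an illustrative toy, not the code this package uses: the function names (`route_top_k`, `mod_layer`), the linear router, and the per-token `block` are assumptions made for the example. In the paper, a router assigns each token a scalar score at a given layer, only the top-k tokens (the layer's capacity) are processed by the transformer block, and the remaining tokens skip it through the residual connection.

```python
def route_top_k(scores, capacity):
    """Return the sorted indices of the `capacity` highest-scoring tokens."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:capacity])


def mod_layer(tokens, router_weights, block, capacity):
    """Toy Mixture-of-Depths layer (hypothetical helper, for illustration only).

    tokens:         list of token vectors (lists of floats)
    router_weights: weights of a linear router producing one score per token
    block:          function mapping a token vector to an update vector
    capacity:       number of tokens allowed through the block
    """
    # One scalar router score per token (dot product with the router weights).
    scores = [sum(w * x for w, x in zip(router_weights, tok)) for tok in tokens]
    selected = set(route_top_k(scores, capacity))

    out = []
    for i, tok in enumerate(tokens):
        if i in selected:
            # Routed token: residual plus the block output scaled by its score.
            update = block(tok)
            out.append([x + scores[i] * u for x, u in zip(tok, update)])
        else:
            # Skipped token: passes through unchanged via the residual path.
            out.append(list(tok))
    return out, sorted(selected)


tokens = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.5, 0.5]]
router_weights = [1.0, -1.0]          # toy router, scores: 1.0, -2.0, 2.0, 0.0
block = lambda tok: [2.0 * x for x in tok]  # stand-in for attention + MLP

out, selected = mod_layer(tokens, router_weights, block, capacity=2)
print(selected)  # indices of the tokens that went through the block
```

In a real model the block is self-attention plus an MLP applied jointly to the selected tokens, and scaling the block output by the router score is what lets gradients reach the router.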

Currently supported models

Model Supported?
Mistral
Mixtral
LLama2
Gemma
Solar

💾 Installation

pip install mixture-of-depth

Linux, Windows, and macOS are supported.

🏁 Quick Start

High-level API (transformers-compatible)

from transformers import AutoModelForCausalLM
from MoD import apply_mod_to_hf

# Initialize your model from an available Hugging Face model
model = AutoModelForCausalLM.from_pretrained("some-repo/some-model")
# Convert the model to include the Mixture-of-Depths layers
model = apply_mod_to_hf(model)
# Train the model
# ...
# Save the converted model
model.save_pretrained('some_local_directory')

Loading the converted Model

To use the converted model, load it with the package's auto class, AutoMoDModelForCausalLM. The example below loads a model from a local directory:

from MoD import AutoMoDModelForCausalLM

# Replace 'path_to_your_model' with the actual path to your model's directory
model = AutoMoDModelForCausalLM.from_pretrained('path_to_your_model')
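
Once a model is loaded, a small helper like the one below could be used to produce text from it. This is a sketch under assumptions that this README does not confirm: that the loaded model exposes the standard transformers `generate` method, and that the tokenizer comes from the original repository the model was converted from. The helper name `generate_text` is hypothetical and not part of this package.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=32):
    """Tokenize `prompt`, call `model.generate`, and decode the first sequence.

    Assumes `model` and `tokenizer` follow the usual transformers interfaces.
    """
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Intended usage (not executed here; requires the package and a converted model):
#   from transformers import AutoTokenizer
#   from MoD import AutoMoDModelForCausalLM
#
#   model = AutoMoDModelForCausalLM.from_pretrained('path_to_your_model')
#   tokenizer = AutoTokenizer.from_pretrained("some-repo/some-model")
#   model.eval()  # switch to inference mode before generating
#   print(generate_text(model, tokenizer, "Hello"))
```

Keeping the helper independent of any specific model class means the same function works for the original and the converted model, which makes side-by-side comparisons straightforward.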

🫱🏼‍🫲🏽 Contributing

We welcome contributions from the community, whether it's adding new features, improving documentation, or reporting bugs. Please refer to our contribution guidelines before making a pull request.

📜 License

This repo is open-sourced under the Apache-2.0 license.

Citation

If you use our code in your research, please cite it using the following BibTeX entry:

@article{MoD2024,
  title={Unofficial implementation for the paper "Mixture-of-Depths"},
  author={AstraMind AI},
  journal={https://github.com/astramind-ai/Mixture-of-depths},
  year={2024}
}

Support

For questions, issues, or support, please open an issue on our GitHub repository.


