Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"

Introduction

This is an unofficial implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models".
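To make the idea in the paper's title concrete, here is a minimal pure-Python sketch of capacity-limited token routing as the paper describes it: a router assigns each token a scalar weight, only the top-`capacity` tokens go through the block, and the remaining tokens skip it on the residual path. This is an illustration only and is not taken from this package's code; `mod_block`, `block_fn`, and `capacity` are hypothetical names, and `block_fn` stands in for a full transformer block.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mod_block(
    tokens: Sequence[Vector],
    router_weights: Sequence[float],
    block_fn: Callable[[List[Vector]], List[Vector]],
    capacity: int,
) -> List[Vector]:
    """Route only the `capacity` highest-weight tokens through block_fn.

    Selected tokens become x + r * block_fn(x); every other token is
    returned unchanged (it only travels along the residual connection).
    """
    if len(tokens) != len(router_weights):
        raise ValueError("one router weight per token is required")
    capacity = max(0, min(capacity, len(tokens)))

    # Rank token positions by router weight, highest first.
    ranked = sorted(range(len(tokens)), key=lambda i: router_weights[i], reverse=True)
    # Keep the chosen positions in their original sequence order.
    selected = sorted(ranked[:capacity])

    # The block only ever sees the selected tokens.
    updates = block_fn([list(tokens[i]) for i in selected])

    outputs = [list(x) for x in tokens]
    for i, update in zip(selected, updates):
        r = router_weights[i]
        outputs[i] = [xi + r * ui for xi, ui in zip(tokens[i], update)]
    return outputs


if __name__ == "__main__":
    # Hypothetical stand-in for a transformer block: doubles every value.
    def double(xs: List[Vector]) -> List[Vector]:
        return [[2.0 * v for v in x] for x in xs]

    demo_tokens = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
    demo_weights = [0.1, 0.5, 0.25, 0.2]
    print(mod_block(demo_tokens, demo_weights, double, capacity=2))
```

With `capacity=2`, only the second and third tokens are processed by the block; the first and last come back unchanged. A real implementation would learn the router weights and apply the routing inside each converted layer.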

Currently supported models

Model Supported?
Mistral
Mixtral
LLama2
Gemma
Solar

💾 Installation

pip install mixture-of-depth

Linux, Windows, and macOS are supported.
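To confirm that the install succeeded, you can query the installed distribution's version with the standard library. This is a small sketch; `installed_version` is a hypothetical helper name, and the distribution name is the one used in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("mixture-of-depth")
    if version is None:
        print("mixture-of-depth is not installed in this environment")
    else:
        print(f"mixture-of-depth {version} is installed")
```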

🏁 Quick Start

High-level API (transformers-compatible)

from transformers import AutoModelForCausalLM
from MoD import apply_mod_to_hf

# Initialize your model from an available Hugging Face model
model = AutoModelForCausalLM.from_pretrained("some-repo/some-model")
# Convert the model to include the Mixture-of-Depths layers
model = apply_mod_to_hf(model)
# Train the model
# ...
# Save the model
model.save_pretrained('some_local_directory')

Loading the converted Model

To use a converted model, load it through the AutoMoDModelForCausalLM class rather than the standard transformers Auto classes. The example below loads a model from a local directory:

from MoD import AutoMoDModelForCausalLM

# Replace 'path_to_your_model' with the actual path to your model's directory
model = AutoMoDModelForCausalLM.from_pretrained('path_to_your_model')
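If you load models from paths supplied at runtime, it can help to fail early with a clear message before calling `from_pretrained`. The sketch below wraps the call shown above; `load_mod_model` is a hypothetical helper, and the `config.json` check assumes the directory was written by the standard transformers `save_pretrained`, which may not hold for every setup.

```python
from pathlib import Path


def load_mod_model(model_dir: str):
    """Load a converted model, failing early if the directory looks incomplete."""
    path = Path(model_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"model directory not found: {path}")
    # Assumption: save_pretrained wrote a config.json next to the weights.
    if not (path / "config.json").is_file():
        raise FileNotFoundError(f"no config.json in {path}")

    # Imported lazily so the path checks above run even without the package.
    from MoD import AutoMoDModelForCausalLM

    return AutoMoDModelForCausalLM.from_pretrained(str(path))
```

Usage is the same as the direct call: `model = load_mod_model('path_to_your_model')`.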

🫱🏼‍🫲🏽 Contributing

We welcome contributions from the community, whether it's adding new features, improving documentation, or reporting bugs. Please refer to our contribution guidelines before making a pull request.

📜 License

This repo is open-sourced under the Apache-2.0 license.

Citation

If you use our code in your research, please cite it using the following BibTeX entry:

@article{MoD2024,
  title={Unofficial implementation for the paper "Mixture-of-Depths"},
  author={AstraMind AI},
  journal={https://github.com/astramind-ai/Mixture-of-depths},
  year={2024}
}

Support

For questions, issues, or support, please open an issue on our GitHub repository.
