A collection of tricks to speed up LLMs. See our transformer-tricks papers on arXiv.

Setup

pip3 install transformer-tricks

To run Llama and other LLMs that require accepting a license agreement (SmolLM does not), first run the following command, which prompts for your Hugging Face token (hf_token):

huggingface-cli login
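
In scripts or notebooks where an interactive prompt is inconvenient, the same login can be done from Python. The sketch below assumes the token is stored in an environment variable named HF_TOKEN and that the huggingface_hub package is installed. The helper name login_from_env is hypothetical and not part of transformer_tricks.

```python
import os


def login_from_env(var_name="HF_TOKEN"):
    """Log in to the Hugging Face Hub if a token is found in the environment.

    Returns True if a login was attempted, False if no token was set.
    The variable name HF_TOKEN is an assumption of this sketch.
    """
    token = os.environ.get(var_name)
    if not token:
        # Nothing to do: ungated models such as SmolLM work without a token.
        return False
    # Imported lazily so the helper is harmless when no token is set.
    from huggingface_hub import login
    login(token=token)
    return True
```

Call login_from_env() once before loading a gated model. It does nothing when the variable is unset.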

Example

The example below converts SmolLM-135M to FlashNorm and measures perplexity of the original and the modified model.

import transformer_tricks as tt

# convert model and store the new model in ./SmolLM-135M_flashNorm
tt.flashify_repo('HuggingFaceTB/SmolLM-135M')

# run example inference of original and modified model
tt.hello_world('HuggingFaceTB/SmolLM-135M')
tt.hello_world('SmolLM-135M_flashNorm')

# measure perplexity of original and modified model
tt.perplexity('HuggingFaceTB/SmolLM-135M', speedup=16)
tt.perplexity('SmolLM-135M_flashNorm', speedup=16)

Results:

Once upon a time there was a curious little girl
Once upon a time there was a curious little girl
perplexity = 16.083
perplexity = 16.083

You can run the example in your browser by clicking on this notebook: Colab. Click "cancel" when it says "Notebook does not have secret access"; SmolLM does not need an HF_TOKEN.
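
For readers unfamiliar with the metric reported in the results above: perplexity is the exponential of the average negative log-likelihood per token, so lower is better. The sketch below illustrates that definition only. It is not the implementation of tt.perplexity, whose dataset and speedup handling are not described here, and the function name perplexity_from_log_probs is hypothetical.

```python
import math


def perplexity_from_log_probs(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    perplexity = exp(-(1/N) * sum(log p(token_i)))
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that assigns probability 1/4 to every token is as uncertain
# as a uniform choice among 4 tokens, so its perplexity is about 4.
uniform_log_probs = [math.log(0.25)] * 8
print(perplexity_from_log_probs(uniform_log_probs))
```

Two models that produce the same probabilities for every token have the same perplexity, which is why equal values for the original and the converted model indicate that the conversion did not change the model's predictions.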

Test FlashNorm

# setup
git clone https://github.com/OpenMachine-ai/transformer-tricks.git
cd transformer-tricks/python
pip3 install --quiet -r requirements.txt

# run tests
python3 flashNorm_test.py

Results:

Once upon a time there was a curious little girl
Once upon a time there was a curious little girl
Once upon a time there was a little girl named
Once upon a time there was a little girl named
perplexity = 16.083
perplexity = 16.083
perplexity = 12.086
perplexity = 12.086
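
The identical outputs and perplexities above are what one would expect if the conversion is mathematically exact. The sketch below assumes that the core idea of FlashNorm is folding the RMSNorm weights into the linear layer that follows the normalization. It is a toy illustration in pure Python, not code from transformer_tricks.py, and all function names and values are hypothetical.

```python
import math


def rms_norm(x, g):
    """RMSNorm: divide x by its root mean square, then scale by weights g."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return [v / rms * gi for v, gi in zip(x, g)]


def linear(x, W):
    """y_j = sum_i x_i * W[i][j], with W of shape (n_in, n_out)."""
    return [sum(x[i] * W[i][j] for i in range(len(x)))
            for j in range(len(W[0]))]


def fold_norm_weights(g, W):
    """Merge the norm weights into the next layer: W_folded[i][j] = g[i] * W[i][j]."""
    return [[gi * w for w in row] for gi, row in zip(g, W)]


x = [0.5, -1.0, 2.0]                        # toy activation vector
g = [1.5, 0.5, -2.0]                        # toy normalization weights
W = [[0.1, 0.2], [0.3, -0.4], [-0.5, 0.6]]  # toy linear layer

# Original: weighted RMSNorm followed by the linear layer.
y_original = linear(rms_norm(x, g), W)

# Folded: weight-free normalization followed by the merged linear layer.
ones = [1.0] * len(x)
y_folded = linear(rms_norm(x, ones), fold_norm_weights(g, W))

max_abs_diff = max(abs(a - b) for a, b in zip(y_original, y_folded))
print(max_abs_diff)
```

Both paths compute the same function up to floating-point rounding, so the folded model stores one fewer weight vector per normalization layer without changing its output.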

Contributing

Before making a change to this repo, please do the following:

  • Format your code by running autopep8 *.py, which uses the config in pyproject.toml.
  • Whenever you change transformer_tricks.py, publish a new version of the package as follows:
    • First, update the version number in pyproject.toml and in requirements.txt
    • Then, push the package to PyPI by typing ./push_pypi.sh
  • Whenever you modify flashNorm_example.py, generate the corresponding notebook as follows:
    jupytext --to ipynb flashNorm_example.py -o ../notebooks/flashNorm_example.ipynb
    sed -i -e 's/import transformer_tricks/%pip install --quiet transformer_tricks\\n", "import transformer_tricks/g' \
      ../notebooks/flashNorm_example.ipynb
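
Because the version bump described above touches two files, it is easy to update one and forget the other. The sketch below is a hypothetical pre-publish check, not a script from this repo. It assumes pyproject.toml contains a line of the form version = "..." and that requirements.txt pins the package with ==. Both are assumptions about the file layout.

```python
import re


def pyproject_version(pyproject_text):
    """Return the first top-level version = "..." value, or None."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text,
                      flags=re.MULTILINE)
    return match.group(1) if match else None


def pinned_version(requirements_text, package):
    """Return the version pinned with '==' for package, or None."""
    wanted = package.replace("-", "_").lower()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Treat '-' and '_' in package names as equivalent.
        if name.strip().replace("-", "_").lower() == wanted:
            return version.strip()
    return None


def versions_match(pyproject_text, requirements_text,
                   package="transformer_tricks"):
    """True only if both files name the same version."""
    declared = pyproject_version(pyproject_text)
    return declared is not None and declared == pinned_version(
        requirements_text, package)
```

To use it, read both files with open(...).read() and stop the release if versions_match returns False.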
    

Notes on the Python package

  • Link to package here
  • Link to stats here
  • Source of this README file here

Please give us a ⭐ if you like this repo, thanks!
