MLX Transformers is a machine learning framework with a similar interface to HuggingFace Transformers.
MLX Transformers
MLX Transformers is a library that provides model implementations in MLX. It uses a similar model interface to HuggingFace Transformers and provides a way to load and run models on Apple Silicon devices. Implemented models have the same modules and module keys as the original implementations in transformers.
MLX Transformers is currently available only for inference on Apple Silicon devices. Training support will be added in the future.
Installation
This library is available on PyPI and can be installed using pip:
```bash
pip install mlx-transformers
```
It is also recommended to install asitop, which is super useful for monitoring GPU and CPU usage on Apple Silicon devices.
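For example (asitop is a separate PyPI package; it wraps Apple's powermetrics, which is why it needs sudo):

```bash
pip install asitop
sudo asitop  # live CPU/GPU usage dashboard for Apple Silicon
```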
Models Supported
- Phi Family of Models (Phi3, Phi2, Phi)
- Llama
- Fuyu and Persimmon
- Machine Translation Models (NLLB, M2M-100)
- Encoder Models (Bert, RoBERTa, XLMRoberta, Sentence Transformers)
Chat Interface
MLX Transformers provides a Streamlit chat interface that can be used to interact with the models. The template was adapted from https://github.com/da-z/mlx-ui.
The chat interface is available in the mlx_transformers/chat module and can be started as follows:

```bash
cd chat
bash start.sh
```
Quick Tour
A list of the available models can be found in the mlx_transformers.models module; they are also listed in the section above. The following example demonstrates how to load a model and use it for inference:
You can load a model using MLX Transformers in a few lines of code:

```python
import mlx.core as mx
from transformers import BertConfig, BertTokenizer

from mlx_transformers.models import BertForMaskedLM as MLXBertForMaskedLM

# Load the tokenizer and config from HuggingFace, then the weights into MLX.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
config = BertConfig.from_pretrained("bert-base-uncased")

model = MLXBertForMaskedLM(config)
model.from_pretrained("bert-base-uncased")

sample_input = "Hello, world!"
inputs = tokenizer(sample_input, return_tensors="np")
inputs = {key: mx.array(v) for key, v in inputs.items()}

outputs = model(**inputs)
```
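As a quick sanity check you can decode the most likely token at each position. This follow-up is a sketch that assumes the output object mirrors HuggingFace's MaskedLMOutput and exposes a `.logits` field:

```python
# Hedged sketch: assumes `outputs.logits` exists, mirroring HuggingFace.
predicted_ids = mx.argmax(outputs.logits, axis=-1)  # most likely token per position
print(tokenizer.decode(predicted_ids[0].tolist()))
```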
Sentence Transformer Example
```python
import mlx.core as mx
from transformers import AutoConfig, AutoTokenizer

from mlx_transformers.models import BertModel as MLXBertModel


def _mean_pooling(last_hidden_state: mx.array, attention_mask: mx.array):
    # Mean-pool token embeddings, masking out padding positions.
    token_embeddings = last_hidden_state
    input_mask_expanded = mx.expand_dims(attention_mask, -1)
    input_mask_expanded = mx.broadcast_to(
        input_mask_expanded, token_embeddings.shape
    ).astype(mx.float32)
    sum_embeddings = mx.sum(token_embeddings * input_mask_expanded, 1)
    sum_mask = mx.clip(input_mask_expanded.sum(axis=1), 1e-9, None)
    return sum_embeddings / sum_mask


sentences = ["This is an example sentence", "Each sentence is converted"]

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
config = AutoConfig.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

model = MLXBertModel(config)
model.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

inputs = tokenizer(sentences, return_tensors="np", padding=True, truncation=True)
inputs = {key: mx.array(v) for key, v in inputs.items()}
outputs = model(**inputs)

# `inputs` is a plain dict, so index it rather than using attribute access.
sentence_embeddings = _mean_pooling(outputs.last_hidden_state, inputs["attention_mask"])
```
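From here the embeddings can be L2-normalized and compared with a dot product. This follow-up uses only plain MLX ops and is not part of the library:

```python
# L2-normalize so the dot product below is cosine similarity.
norms = mx.linalg.norm(sentence_embeddings, axis=1, keepdims=True)
normalized = sentence_embeddings / norms
similarity = normalized @ normalized.T  # (2, 2) cosine-similarity matrix
print(similarity)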
Other Examples
The examples directory contains a few examples that demonstrate how to use the models in MLX Transformers.
Llama text generation:

```bash
python3 examples/llama_generation.py --model-name "meta-llama/Llama-2-7b-hf"
```

Machine translation with NLLB:

```bash
python3 examples/translation/nllb_translation.py --model_name facebook/nllb-200-distilled-600M \
    --source_language English --target_language Yoruba \
    --text_to_translate "Let us translate text to Yoruba"
```

Output:

```
['Ẹ jẹ́ ká tú àwọn ẹsẹ Bíbélì sí èdè Yoruba']
```

Phi-3 text generation:

```bash
python3 examples/text_generation/phi3_generation.py --temp 1.0
```
Benchmarks
Coming soon...
Contributions
Contributions to MLX Transformers are welcome; we would like to have as many model implementations as possible. See the contributing documentation for instructions on setting up a development environment.
File details
Details for the file mlx_transformers-0.1.5.tar.gz.
File metadata
- Download URL: mlx_transformers-0.1.5.tar.gz
- Upload date:
- Size: 55.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9fc46922e6bb3156047fbdc4cfb5878617b826a97e7bc4214b918ba6493f0344 |
| MD5 | b72486784c6a986ae7702f2b1c9352f8 |
| BLAKE2b-256 | 509e6d13a9a1bd6af98c9ecd0ab656910d161096efdf913dff62ee568668b9a9 |
Provenance
The following attestation bundles were made for mlx_transformers-0.1.5.tar.gz:
Publisher: publish.yml on ToluClassics/mlx-transformers

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mlx_transformers-0.1.5.tar.gz
- Subject digest: 9fc46922e6bb3156047fbdc4cfb5878617b826a97e7bc4214b918ba6493f0344
- Sigstore transparency entry: 149946650
- Sigstore integration time:
- Permalink: ToluClassics/mlx-transformers@873e08f40b930c305e8f90175286dd4675e3419c
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/ToluClassics
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@873e08f40b930c305e8f90175286dd4675e3419c
- Trigger Event: release
File details
Details for the file mlx_transformers-0.1.5-py3-none-any.whl.
File metadata
- Download URL: mlx_transformers-0.1.5-py3-none-any.whl
- Upload date:
- Size: 64.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 905c2253bf6b96011e688b8b050e0a0373b99717bd16fd3b311202b0fc545130 |
| MD5 | c59608c94ddc8b0df04e91ed1a4de06a |
| BLAKE2b-256 | 5814746ac5b316a76ef47f9c0004d1dfff078498f953fee633ec307f5e30a28a |
Provenance
The following attestation bundles were made for mlx_transformers-0.1.5-py3-none-any.whl:
Publisher: publish.yml on ToluClassics/mlx-transformers

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mlx_transformers-0.1.5-py3-none-any.whl
- Subject digest: 905c2253bf6b96011e688b8b050e0a0373b99717bd16fd3b311202b0fc545130
- Sigstore transparency entry: 149946652
- Sigstore integration time:
- Permalink: ToluClassics/mlx-transformers@873e08f40b930c305e8f90175286dd4675e3419c
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/ToluClassics
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@873e08f40b930c305e8f90175286dd4675e3419c
- Trigger Event: release