
MLX Transformers is a machine learning framework with an interface similar to HuggingFace Transformers.


MLX Transformers


MLX Transformers is a library that provides model implementations in MLX. It uses a model interface similar to HuggingFace Transformers and provides a way to load and run models on Apple Silicon devices. Implemented models have the same modules and module keys as the original implementations in transformers.

MLX Transformers currently supports inference only, on Apple Silicon devices. Training support will be added in the future.

Installation

This library is available on PyPI and can be installed using pip:

pip install mlx-transformers

It is also recommended to install asitop, which is useful for monitoring GPU and CPU usage on Apple Silicon devices.
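
For example, to install and launch it (asitop wraps the macOS powermetrics utility, so it may prompt for your sudo password):

pip install asitop
asitop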

Models Supported

  • Phi Family of Models (Phi3, Phi2, Phi)
  • Llama
  • Fuyu and Persimmon
  • Machine Translation Models (NLLB, M2M-100)
  • Encoder Models (BERT, RoBERTa, XLM-RoBERTa, Sentence Transformers)
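
Each family follows the same loading pattern shown in the quick tour below: construct the model from a transformers config, then load the pretrained weights. A minimal sketch for Llama, assuming the library mirrors the transformers class name LlamaForCausalLM (the checkpoint name is taken from the Llama example further down):

from transformers import AutoConfig
from mlx_transformers.models import LlamaForCausalLM  # class name assumed to mirror transformers

# Build the MLX model from the config, then load the pretrained weights
config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
model = LlamaForCausalLM(config)
model.from_pretrained("meta-llama/Llama-2-7b-hf")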

Chat Interface

MLX Transformers provides a Streamlit chat interface that can be used to interact with the models. The template was adapted from https://github.com/da-z/mlx-ui.


The chat interface is available in the mlx_transformers/chat module and can be used as follows:

cd chat
bash start.sh

Quick Tour

A list of the available models can be found in the mlx_transformers.models module; the supported families are also listed in the section above. The following example demonstrates how to load a model and use it for inference:

  • You can load a model using MLX Transformers in a few lines of code:

    import mlx.core as mx
    from transformers import BertConfig, BertTokenizer
    from mlx_transformers.models import BertForMaskedLM as MLXBertForMaskedLM
    
    # The tokenizer and config come straight from HuggingFace transformers
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    config = BertConfig.from_pretrained("bert-base-uncased")
    
    # Build the MLX model from the config, then load the pretrained weights
    model = MLXBertForMaskedLM(config)
    model.from_pretrained("bert-base-uncased")
    
    sample_input = "Hello, world!"
    inputs = tokenizer(sample_input, return_tensors="np")
    # Convert the numpy arrays to MLX arrays before the forward pass
    inputs = {key: mx.array(v) for key, v in inputs.items()}
    
    outputs = model(**inputs)
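
To turn the output into an actual prediction, you can decode the logits at the masked position. A minimal sketch, assuming the output object exposes logits the same way the transformers implementation does:

    # Predict the token behind [MASK]; .logits is assumed to mirror transformers
    masked = tokenizer("The capital of France is [MASK].", return_tensors="np")
    masked = {key: mx.array(v) for key, v in masked.items()}
    logits = model(**masked).logits
    mask_pos = masked["input_ids"].tolist()[0].index(tokenizer.mask_token_id)
    predicted_id = mx.argmax(logits[0, mask_pos]).item()
    print(tokenizer.decode([predicted_id]))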
    

Sentence Transformer Example

import mlx.core as mx

from transformers import AutoConfig, AutoTokenizer
from mlx_transformers.models import BertModel as MLXBertModel


def _mean_pooling(last_hidden_state: mx.array, attention_mask: mx.array) -> mx.array:
    # Average token embeddings, ignoring padded positions via the attention mask
    token_embeddings = last_hidden_state
    input_mask_expanded = mx.expand_dims(attention_mask, -1)
    input_mask_expanded = mx.broadcast_to(input_mask_expanded, token_embeddings.shape).astype(mx.float32)
    sum_embeddings = mx.sum(token_embeddings * input_mask_expanded, 1)
    # Clamp the denominator to avoid division by zero on all-padding rows
    sum_mask = mx.clip(input_mask_expanded.sum(axis=1), 1e-9, None)
    return sum_embeddings / sum_mask

sentences = ['This is an example sentence', 'Each sentence is converted']

tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')
config = AutoConfig.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')

model = MLXBertModel(config)
model.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

inputs = tokenizer(sentences, return_tensors="np", padding=True, truncation=True)
inputs = {key: mx.array(v) for key, v in inputs.items()}

outputs = model(**inputs)

sentence_embeddings = _mean_pooling(outputs.last_hidden_state, inputs["attention_mask"])
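
To compare the two sentences, normalize the pooled embeddings and take their dot product; this sketch uses only mlx.core:

# Cosine similarity between the two sentence embeddings
norms = mx.sqrt(mx.sum(sentence_embeddings ** 2, axis=1, keepdims=True))
normalized = sentence_embeddings / norms
similarity = mx.sum(normalized[0] * normalized[1])
print(similarity.item())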

Other Examples

The examples directory contains a few examples that demonstrate how to use the models in MLX Transformers.

  1. Llama Example

    python3 examples/llama_generation.py --model-name "meta-llama/Llama-2-7b-hf"  
    
  2. NLLB Translation Example

    python3 examples/translation/nllb_translation.py --model_name facebook/nllb-200-distilled-600M --source_language English --target_language Yoruba --text_to_translate "Let us translate text to Yoruba"
    
    Output: ==> ['Ẹ jẹ́ ká tú àwọn ẹsẹ Bíbélì sí èdè Yoruba']
    
  3. Phi Generation Example

    python3 examples/text_generation/phi3_generation.py --temp 1.0
    

Benchmarks

Coming soon...

Contributions

Contributions to MLX Transformers are welcome. We would like to have as many model implementations as possible. See the contributing documentation for instructions on setting up a development environment.
