Phi-3-Vision VLM for Apple MLX: An All-in-One Port

This project ports the Phi-3-Vision vision-language model (VLM) to Apple's MLX framework for text and image processing tasks. The implementation is deliberately minimal: generating quantized model weights, quantizing the KV cache during inference, LoRA/QLoRA training, and model benchmarking are all contained in a single file.

Key Features

Su-scaled RoPE: Implements Su-scaled Rotary Position Embeddings to handle sequences of up to 128K tokens.
Model Quantization: Reduces model size for faster loading and deployment (2.3 GB quantized vs 8.5 GB original).
KV Cache Quantization: Quantizes the KV cache for long-context inference with little overhead (text generation in 5.3s quantized vs 5.1s original).
LoRA Training: Customizes the model for specific tasks or datasets using LoRA.
Benchmarking: Assesses model performance on a dataset (WIP).
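
To make the Su-scaled RoPE feature more concrete, here is a minimal pure-Python sketch of how this scheme is commonly implemented for Phi-3 long-context models: each rotary frequency is divided by a per-dimension rescale factor, and the cos/sin values are multiplied by a magnitude correction that depends on how far the context is extended. The function name, the default context lengths (4K original, 128K extended), and the rescale factors passed in are assumptions for illustration and are not taken from this project's code.

```python
import math

def su_rope_angles(position, head_dim, rescale_factors, base=10000.0,
                   original_max_len=4096, extended_max_len=131072):
    """Return (cos, sin) for one position, one value per rotated pair (sketch)."""
    half = head_dim // 2
    if len(rescale_factors) != half:
        raise ValueError("need one rescale factor per rotated pair")
    # Standard RoPE inverse frequencies, each divided by its rescale factor.
    inv_freq = [1.0 / (f * base ** (2 * i / head_dim))
                for i, f in enumerate(rescale_factors)]
    # Magnitude correction applied when the context window is extended.
    scale = extended_max_len / original_max_len
    if scale <= 1.0:
        mscale = 1.0
    else:
        mscale = math.sqrt(1.0 + math.log(scale) / math.log(original_max_len))
    angles = [position * w for w in inv_freq]
    cos = [mscale * math.cos(a) for a in angles]
    sin = [mscale * math.sin(a) for a in angles]
    return cos, sin

# Hypothetical factors: all 1.0 reduces to plain RoPE times the correction.
cos, sin = su_rope_angles(position=0, head_dim=8, rescale_factors=[1.0] * 4)
```

With rescale factors larger than 1.0, the corresponding dimensions rotate more slowly, which is what lets positions far beyond the original training length map to angles the model has already seen.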

Usage

from PIL import Image
import requests
prompt = "<|user|>\n<|image_1|>\nWhat is shown in this image?<|end|>\n<|assistant|>\n"
images = [Image.open(requests.get("https://assets-c4akfrf5b4d3f4b7.z01.azurefd.net/assets/2024/04/BMDataViz_661fb89f3845e.png", stream=True).raw)]
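
The prompt string above follows a fixed chat template with numbered image placeholders. As a convenience, a small helper along the following lines could assemble it. The name build_prompt is hypothetical and not part of this package, and the numbering of multiple images simply continues the `<|image_1|>` pattern shown above.

```python
def build_prompt(question, num_images=0):
    """Assemble a single-turn prompt with one placeholder line per image."""
    # Placeholders are numbered from 1, matching <|image_1|> in the example.
    placeholders = "".join(f"<|image_{i}|>\n" for i in range(1, num_images + 1))
    return f"<|user|>\n{placeholders}{question}<|end|>\n<|assistant|>\n"

prompt = build_prompt("What is shown in this image?", num_images=1)
```

Note that the text-only examples in this README omit the newline after `<|user|>`; this sketch always keeps it.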

Image Captioning

model, processor = load()
generate(model, processor, prompt, images)

The image displays a bar chart showing the percentage of
4.43s user 3.17s system 71% cpu 10.711 total

Cache Quantization

model, processor = load(use_quantized_cache=True)
print(generate(model, processor, "<|user|>Write an exciting sci-fi.<|end|>\n<|assistant|>\n"))

Title: The Last Frontier\n\nIn the
2.49s user 4.52s system 131% cpu 5.325 total
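
A back-of-the-envelope estimate shows why cache quantization matters at long context lengths. The numbers below are assumptions, not values read from this project: a language model with 32 layers and hidden size 3072 using full multi-head attention, a 16-bit unquantized cache, and a hypothetical 4-bit cache with groups of 64 values that each store a 16-bit scale and a 16-bit bias.

```python
def kv_cache_bytes(num_tokens, num_layers=32, hidden_size=3072, bits_per_value=16.0):
    """Estimate KV cache size in bytes (assumed model shape, see lead-in)."""
    # Keys and values: two tensors per layer, each num_tokens x hidden_size.
    values = 2 * num_layers * num_tokens * hidden_size
    return values * bits_per_value / 8

def quantized_bits(q_bits=4, group_size=64, overhead_bits=32):
    """Effective bits per value once per-group scale and bias are counted."""
    return q_bits + overhead_bits / group_size

full = kv_cache_bytes(128 * 1024)
quant = kv_cache_bytes(128 * 1024, bits_per_value=quantized_bits())
print(f"16-bit cache: {full / 2**30:.1f} GiB")   # 48.0 GiB
print(f"4-bit cache:  {quant / 2**30:.1f} GiB")  # 13.5 GiB
```

Under these assumptions a full 128K-token context would not fit in memory on most machines without quantizing the cache, which is the trade-off the small timing overhead above buys.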

Model Quantization

quantize(from_path='phi3v', to_path='quantized_phi3v', q_group_size=64, q_bits=4)

4.30s user 3.31s system 119% cpu 6.368 total

model, processor = load(model_path='quantized_phi3v')
print(generate(model, processor, "<|user|>Write an exciting sci-fi.<|end|>\n<|assistant|>\n"))

Title: The Quantum Leap\n\nIn
3.78s user 0.87s system 205% cpu 2.264 total
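
The q_group_size and q_bits arguments suggest group-wise affine quantization, which is the scheme MLX's quantization routines use: weights are split into groups, and each group is stored as small integer codes plus one scale and one bias. The following pure-Python sketch illustrates the idea on a flat list of floats. The function names are hypothetical and this is not the project's implementation.

```python
def quantize_group(weights, q_bits=4):
    """Map one group of floats to integer codes in [0, 2**q_bits - 1]."""
    levels = 2 ** q_bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels
    if scale == 0.0:
        scale = 1.0  # constant group: any scale works, all codes are 0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def quantize_vector(weights, q_group_size=64, q_bits=4):
    """Quantize a flat list group by group; returns (codes, scale, bias) tuples."""
    return [quantize_group(weights[i:i + q_group_size], q_bits)
            for i in range(0, len(weights), q_group_size)]

def dequantize_vector(groups):
    """Reconstruct approximate floats from the quantized groups."""
    return [c * scale + bias for codes, scale, bias in groups for c in codes]

weights = [i / 100 for i in range(128)]
groups = quantize_vector(weights, q_group_size=64, q_bits=4)
restored = dequantize_vector(groups)
```

Each reconstructed value differs from the original by at most half a quantization step. If scale and bias are each stored in 16 bits, a group of 64 four-bit codes costs 4.5 bits per weight, so a 16-bit model would shrink to roughly 4.5/16 of its size, which is in the same range as the 8.5 GB to 2.3 GB reduction reported above.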

LoRA Training

train_lora()

22.50s user 27.58s system 22% cpu 3:41.58 total
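
For readers unfamiliar with LoRA, the sketch below shows the core idea in pure Python: the original weight matrix stays frozen, and a low-rank product B·A, scaled by alpha / rank, is added to its output. Only A and B would be trained. The class name, rank, alpha, and initialization values are illustrative assumptions and are not taken from train_lora().

```python
import random

def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (sketch)."""

    def __init__(self, weight, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # frozen base weights, shape out_dim x in_dim
        # A starts small and random, B starts at zero, so the adapter
        # initially leaves the layer's output unchanged.
        self.a = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.b = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def __call__(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.b, matvec(self.a, x))
        return [y + self.scale * u for y, u in zip(base, update)]

layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]])
print(layer([1.0, 1.0]))  # [3.0, 7.0] before any training, since B is zero
```

Because A and B together hold far fewer values than the full weight matrix when the rank is small, the adapter is cheap to train and store. In QLoRA the frozen base weights are additionally kept in quantized form.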

Benchmarking (WIP)

recall()

10.65s user 10.98s system 37% cpu 57.669 total

Installation

git clone https://github.com/JosefAlbers/Phi-3-Vision-MLX.git

Benchmarks

Task	Vanilla Model	Quantized Model	Quantized KV Cache	LoRA Adapter
Image Captioning	10.71s	8.51s	12.79s	11.70s
Text Generation	5.07s	2.24s	5.27s	5.10s
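
To compare the configurations at a glance, the timings in the table can be expressed relative to the vanilla model. The snippet below only rearranges the numbers from the table; the dictionary layout and function name are illustrative.

```python
# Wall-clock seconds copied from the benchmark table above.
timings = {
    "Image Captioning": {"Vanilla Model": 10.71, "Quantized Model": 8.51,
                         "Quantized KV Cache": 12.79, "LoRA Adapter": 11.70},
    "Text Generation": {"Vanilla Model": 5.07, "Quantized Model": 2.24,
                        "Quantized KV Cache": 5.27, "LoRA Adapter": 5.10},
}

def relative_to_vanilla(row):
    """Return each configuration's time as a fraction of the vanilla time."""
    base = row["Vanilla Model"]
    return {name: seconds / base for name, seconds in row.items()}

for task, row in timings.items():
    ratios = relative_to_vanilla(row)
    print(task, {name: f"{ratio:.2f}x" for name, ratio in ratios.items()})
```

Ratios below 1.00x mean the configuration finished faster than the vanilla model. For text generation, the quantized model takes about 0.44x of the vanilla time, while the quantized KV cache costs about 4% extra. These appear to be single timed runs, so small differences should be read with caution.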

License

This project is licensed under the MIT License.

Release history

0.0.6

Jun 23, 2024

0.0.6b4 pre-release

Jun 23, 2024

0.0.6b3 pre-release

Jun 23, 2024

0.0.6b2 pre-release

Jun 23, 2024

0.0.6b1 pre-release

Jun 23, 2024

0.0.6b0 pre-release

Jun 23, 2024

0.0.6a0 pre-release

Jun 22, 2024

0.0.5

Jun 16, 2024

0.0.4

Jun 16, 2024

0.0.3

Jun 16, 2024

0.0.3rc1 pre-release

Jun 16, 2024

0.0.3b0 pre-release

Jun 9, 2024

0.0.3a0 pre-release

Jun 8, 2024

This version

0.0.2

May 31, 2024

0.0.2rc2 pre-release

Jun 3, 2024

0.0.2rc1 pre-release

Jun 3, 2024

0.0.2b0 pre-release

Jun 2, 2024

0.0.1

May 31, 2024

Download files

Source Distribution

phi_3_vision_mlx-0.0.2.tar.gz (11.8 kB)

Uploaded May 31, 2024 Source

Built Distribution

phi_3_vision_mlx-0.0.2-py3-none-any.whl (11.7 kB)

Uploaded May 31, 2024 Python 3

Hashes for phi_3_vision_mlx-0.0.2.tar.gz

Algorithm	Hash digest
SHA256	`e1fbeb82e4931ddbdbb5a543f99674eaaac9182fee2c34942bfba209ef32816a`
MD5	`b965ebf95305e9ac16db4f983231a836`
BLAKE2b-256	`e193259c8307f4ad0cef86c333089e602aa76c5f391ec0f7bb332b8b7ab87815`

Hashes for phi_3_vision_mlx-0.0.2-py3-none-any.whl

Algorithm	Hash digest
SHA256	`7d152050b2acd7e4a8bc322dc4637d5c16aec3ce3eb32ba23ff0ebfde3c3fb20`
MD5	`4cb6e8c33f1dc591c9d27547fe62586f`
BLAKE2b-256	`047bb89dffe4f733ce0472332cc08d46a8fc327f0218849c062eef8d004d6437`