Unofficial Python bindings for llm-rs. 🐍❤️🦀

Project description

llm-rs-python: Python Bindings for Rust's llm Library

Welcome to llm-rs, an unofficial Python interface for the Rust-based llm library, made possible through PyO3. Our package combines the convenience of Python with the performance of Rust to offer an efficient tool for your machine learning projects. 🐍❤️🦀

With llm-rs, you can run a variety of Large Language Models (LLMs), including LLaMA and GPT-NeoX, directly on your CPU.

For a detailed overview of all the supported architectures, visit the llm project page.

Installation

Simply install it via pip: pip install llm-rs

Usage

Running local GGML models:

Models can be loaded via the AutoModel interface.

from llm_rs import AutoModel, KnownModels

# load the model
model = AutoModel.from_pretrained("path/to/model.bin", model_type=KnownModels.Llama)

# generate
print(model.generate("The meaning of life is"))

Streaming Text

Text can be yielded from a generator via the stream function:

from llm_rs import AutoModel, KnownModels

# load the model
model = AutoModel.from_pretrained("path/to/model.bin", model_type=KnownModels.Llama)

# generate
for token in model.stream("The meaning of life is"):
    print(token)
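
Because stream yields the completion token by token, the full text can be reassembled by concatenating the pieces. The snippet below illustrates the pattern with a stand-in generator so it runs without a model; in practice the tokens come from model.stream:

```python
# Stand-in for model.stream(...): any iterable of text chunks behaves the same.
def fake_stream():
    yield from ["The meaning", " of life", " is"]

pieces = []
for token in fake_stream():
    pieces.append(token)      # collect tokens as they arrive

completion = "".join(pieces)  # reassemble the full completion
print(completion)
```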

Running GGML models from the Hugging Face Hub

GGML-converted models can be downloaded and run directly from the Hub.

from llm_rs import AutoModel

model = AutoModel.from_pretrained("rustformers/mpt-7b-ggml", model_file="mpt-7b-q4_0-ggjt.bin")

If a repository contains multiple models, the model_file has to be specified. If you want to load repositories that were not created through this library, you also have to specify the model_type parameter, as the metadata files needed to infer the architecture are missing.
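
When a repository hosts several quantized variants, a small helper can select the model_file to pass to from_pretrained. Note that this helper and the filename convention it assumes are illustrative only; they are not part of the library.

```python
def pick_model_file(repo_files, quantization="q4_0"):
    """Return the first .bin file whose name contains the requested
    quantization tag, or None if there is no match. Hypothetical helper;
    adapt the naming convention to the repository you are loading."""
    for name in repo_files:
        if name.endswith(".bin") and quantization in name:
            return name
    return None

files = ["README.md", "mpt-7b-q4_0-ggjt.bin", "mpt-7b-q5_1-ggjt.bin"]
print(pick_model_file(files))            # picks the q4_0 build
print(pick_model_file(files, "q5_1"))    # picks the q5_1 build
```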

Running PyTorch Transformer models from the Hugging Face Hub

llm-rs supports automatic conversion of all supported transformer architectures on the Hugging Face Hub.

To run conversions, additional dependencies are needed, which can be installed via pip install llm-rs[convert].

The models can then be loaded and automatically converted via the from_pretrained function.

from llm_rs import AutoModel

model = AutoModel.from_pretrained("mosaicml/mpt-7b")

Convert Hugging Face Hub Models

The following example shows how a Pythia model can be converted, quantized, and run.

from llm_rs.convert import AutoConverter
from llm_rs import AutoModel, AutoQuantizer
import sys

# define the model to convert and an output directory
export_directory = "path/to/directory"
base_model = "EleutherAI/pythia-410m"

# convert the model
converted_model = AutoConverter.convert(base_model, export_directory)

# quantize the model (this step is optional)
quantized_model = AutoQuantizer.quantize(converted_model)

# load the quantized model
model = AutoModel.load(quantized_model, verbose=True)

# generate text
def callback(text):
    print(text, end="")
    sys.stdout.flush()

model.generate("The meaning of life is", callback=callback)
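
The callback is invoked once per generated token, so it can do more than print: for example, it can accumulate the pieces into a transcript. The collector below is plain Python and runs independently of the model; pass its callback to model.generate in the same way as above.

```python
def make_collector():
    """Return a (callback, pieces) pair; the callback appends each token
    it receives to pieces. Usable anywhere a token callback is expected."""
    pieces = []
    def callback(text):
        pieces.append(text)
    return callback, pieces

# Demonstration with manual calls standing in for the model's token stream.
callback, pieces = make_collector()
for chunk in ["The answer", " is", " 42."]:
    callback(chunk)
print("".join(pieces))
```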

Documentation

For in-depth information on customizing the loading and generation processes, refer to our detailed documentation.
