Fast and easy LLM serving.

mistral.rs PyO3 Bindings: `mistralrs`

`mistralrs` is a Python package that provides an API for mistral.rs. It is built with the maturin build manager.

Installation from PyPI

Install Rust: https://rustup.rs/

Example on Ubuntu:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env

mistralrs depends on the OpenSSL library and on pkg-config.

Example on Ubuntu:

sudo apt install libssl-dev
sudo apt install pkg-config

Install it!

- CUDA: `pip install mistralrs-cuda -v`
- Metal: `pip install mistralrs-metal -v`
- Apple Accelerate: `pip install mistralrs-accelerate -v`
- Intel MKL: `pip install mistralrs-mkl -v`
- Without accelerators: `pip install mistralrs -v`

Every variant installs the same `mistralrs` Python package. The suffix on the pip package name only controls which acceleration features are enabled.
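
The naming scheme can be sketched as a small lookup. The helper `pip_package_for` and the accelerator keys are illustrative assumptions and are not part of mistralrs; only the package names are taken from the list above.

```python
# Hypothetical helper (not part of mistralrs): maps an accelerator
# to the pip package name listed in the installation section.
PIP_PACKAGES = {
    "cuda": "mistralrs-cuda",
    "metal": "mistralrs-metal",
    "accelerate": "mistralrs-accelerate",
    "mkl": "mistralrs-mkl",
    None: "mistralrs",  # no accelerator
}

# Whichever pip package is installed, the module imported in Python is the same.
IMPORT_NAME = "mistralrs"


def pip_package_for(accelerator=None):
    """Return the pip package name for the given accelerator key."""
    try:
        return PIP_PACKAGES[accelerator]
    except KeyError:
        raise ValueError(f"unknown accelerator: {accelerator!r}") from None


if __name__ == "__main__":
    print(f"pip install {pip_package_for('cuda')} -v")
```

In every case the code that follows installation still uses `import mistralrs`.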

Installation from source

Install required packages
- openssl (Example on Ubuntu: sudo apt install libssl-dev)
- pkg-config (Example on Ubuntu: sudo apt install pkg-config)

Install Rust: https://rustup.rs/

Example on Ubuntu:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env

Set your Hugging Face token. Skip this step if the token is already set, if your model is not gated, or if you want to use the token_source parameters in Python or on the command line.

Example on Ubuntu:
```
mkdir -p ~/.cache/huggingface
echo "<HF_TOKEN_HERE>" > ~/.cache/huggingface/token
```
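
The same step can be done from Python. This is a minimal sketch that assumes the token file location used in the shell example; the function name `write_hf_token` is hypothetical and not part of mistralrs.

```python
from pathlib import Path


def write_hf_token(token, cache_dir=None):
    """Write the token to <cache_dir>/token, creating the directory if needed.

    cache_dir defaults to ~/.cache/huggingface, the path used in the
    shell example above.
    """
    if cache_dir is None:
        cache_dir = Path.home() / ".cache" / "huggingface"
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    token_path = cache_dir / "token"
    # Trailing newline mirrors what `echo` writes in the shell version.
    token_path.write_text(token.strip() + "\n")
    # Restrict the file to the current user (effective on POSIX systems).
    token_path.chmod(0o600)
    return token_path
```

Calling `write_hf_token("<HF_TOKEN_HERE>")` with your real token writes the file in the default location.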

Download the code

git clone https://github.com/EricLBuehler/mistral.rs.git
cd mistral.rs

Change into the correct directory for building mistralrs: `cd mistralrs-pyo3`

Install maturin, our Rust + Python build system. Maturin requires a Python virtual environment such as venv or conda to be active. The mistralrs package will be installed into that environment.
```
pip install "maturin[patchelf]"
```
Install mistralrs by running the following in this directory. Features such as cuda or flash-attn may be specified with the --features argument, just as they would be for cargo run.

The base build command is:
```
maturin develop -r
```
- To build for CUDA:
```
maturin develop -r --features cuda
```
- To build for CUDA with flash attention:
```
maturin develop -r --features "cuda flash-attn"
```
- To build for Metal:
```
maturin develop -r --features metal
```
- To build for Accelerate:
```
maturin develop -r --features accelerate
```
- To build for MKL:
```
maturin develop -r --features mkl
```
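
The build commands above all follow one pattern: the base command plus an optional space-separated feature list. A short sketch of that pattern follows; the helper `maturin_command` is hypothetical, and it only assembles the argument list without running anything.

```python
import shlex


def maturin_command(features=()):
    """Build the argument list for `maturin develop -r` with optional features.

    Multiple features are passed as one space-separated value, matching
    the `--features "cuda flash-attn"` form shown above.
    """
    cmd = ["maturin", "develop", "-r"]
    features = list(features)
    if features:
        cmd += ["--features", " ".join(features)]
    return cmd


if __name__ == "__main__":
    # Print the command as it could be pasted into a shell.
    print(shlex.join(maturin_command(["cuda", "flash-attn"])))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from inside the mistralrs-pyo3 directory with the virtual environment active.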

Please find API docs here and the type stubs here, which are another great form of documentation.

We also provide a cookbook here!

Example

from mistralrs import Runner, Which, ChatCompletionRequest

runner = Runner(
    which=Which.GGUF(
        tok_model_id="mistralai/Mistral-7B-Instruct-v0.1",
        quantized_model_id="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
        quantized_filename="mistral-7b-instruct-v0.1.Q4_K_M.gguf",
    )
)

res = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="mistral",
        messages=[
            {"role": "user", "content": "Tell me a story about the Rust type system."}
        ],
        max_tokens=256,
        presence_penalty=1.0,
        top_p=0.1,
        temperature=0.1,
    )
)
print(res.choices[0].message.content)
print(res.usage)
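
For repeated prompts, the request in the example can be wrapped in a small function. This is a sketch, not part of the mistralrs API: the helper `ask` is hypothetical, and it uses only the calls, arguments, and attributes that appear in the example above. The runner and request class are passed in so the function itself does not need to import mistralrs.

```python
def ask(runner, request_cls, prompt, model="mistral", max_tokens=256):
    """Send a single-turn chat request and return the reply text.

    runner      -- an object exposing send_chat_completion_request, e.g. Runner
    request_cls -- the request class, e.g. ChatCompletionRequest
    """
    request = request_cls(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        # Sampling settings copied from the example above.
        presence_penalty=1.0,
        top_p=0.1,
        temperature=0.1,
    )
    response = runner.send_chat_completion_request(request)
    return response.choices[0].message.content
```

With the `runner` from the example, usage would be `print(ask(runner, ChatCompletionRequest, "Tell me a story about the Rust type system."))`.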

Release history

- 0.3.2: Oct 28, 2024
- 0.3.1: Sep 29, 2024
- 0.3.0: Sep 2, 2024 (this version)
- 0.2.5: Aug 16, 2024
- 0.2.4: Aug 1, 2024
- 0.2.3: Jul 28, 2024
- 0.2.2: Jul 24, 2024
- 0.2.1: Jul 23, 2024
- 0.2.0: Jul 19, 2024
- 0.1.24: Jun 30, 2024
- 0.1.23: Jun 29, 2024
- 0.1.22: Jun 24, 2024
- 0.1.21: Jun 23, 2024
- 0.1.20: Jun 20, 2024
- 0.1.19: Jun 15, 2024
- 0.1.18: Jun 12, 2024
- 0.1.17: Jun 11, 2024
- 0.1.16: Jun 8, 2024
- 0.1.15: Jun 5, 2024
- 0.1.14: Jun 5, 2024
- 0.1.13: Jun 2, 2024
- 0.1.11: May 28, 2024
- 0.1.10: May 22, 2024
- 0.1.8: May 16, 2024
- 0.1.7: May 14, 2024
- 0.1.5: May 13, 2024
- 0.1.4: May 8, 2024
- 0.1.3: May 2, 2024
- 0.1.2: Apr 30, 2024
- 0.1.1: Apr 27, 2024
- 0.1.0: Apr 27, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mistralrs_cuda-0.3.0.tar.gz (385.9 kB)

Uploaded Sep 2, 2024 Source

File details

Details for the file mistralrs_cuda-0.3.0.tar.gz.

File metadata

Download URL: mistralrs_cuda-0.3.0.tar.gz
Upload date: Sep 2, 2024
Size: 385.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.8.18

File hashes

Hashes for mistralrs_cuda-0.3.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `281c30bdb8b637b34644a6999563ca271803af97b18d8ce8c3e5ed71d0caa443` |
| MD5 | `ae11cd3f4275fea4610b375d6ceb0444` |
| BLAKE2b-256 | `cd1ba1b63039391ffebdb430918652232ff6074aa6fb4f0c811c9fd5892300b6` |
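
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch; the function names and the local file path in the usage note are assumptions for illustration.

```python
import hashlib

# SHA256 digest listed above for mistralrs_cuda-0.3.0.tar.gz.
EXPECTED_SHA256 = "281c30bdb8b637b34644a6999563ca271803af97b18d8ce8c3e5ed71d0caa443"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, `verify("mistralrs_cuda-0.3.0.tar.gz")` returns `True` only if the file in the current directory matches the published digest.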

