Fast and easy LLM serving.

mistral.rs

mistralrs is a Python package that provides an easy-to-use API for mistral.rs.

Example

More examples can be found here!

from mistralrs import Runner, Which, ChatCompletionRequest

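# Load the model by its Hugging Face model ID and apply
# in-situ quantization (Q4K) as the weights are loaded.
runner = Runner(
    which=Which.Plain(
        model_id="microsoft/Phi-3.5-mini-instruct",
    ),
    in_situ_quant="Q4K",
)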

res = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="mistral",
        messages=[
            {"role": "user", "content": "Tell me a story about the Rust type system."}
        ],
        max_tokens=256,
        presence_penalty=1.0,
        top_p=0.1,
        temperature=0.1,
    )
)
print(res.choices[0].message.content)
print(res.usage)
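
The request above can be wrapped in a small helper so that the message list is built in one place. This is a minimal sketch rather than part of mistralrs: build_messages and ask are hypothetical names, and the "system" role is an assumption about what the model's chat template accepts. Only ChatCompletionRequest and send_chat_completion_request are taken from the example above.

```python
def build_messages(prompt, system=None):
    """Return an OpenAI-style message list for a single user turn."""
    messages = []
    if system is not None:
        # Assumption: the model's chat template accepts a "system" role.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages


def ask(runner, prompt, system=None, max_tokens=256, temperature=0.1):
    """Send one chat request through an existing Runner and return the reply text."""
    # Imported lazily so build_messages works without mistralrs installed.
    from mistralrs import ChatCompletionRequest

    res = runner.send_chat_completion_request(
        ChatCompletionRequest(
            model="mistral",
            messages=build_messages(prompt, system),
            max_tokens=max_tokens,
            temperature=temperature,
        )
    )
    return res.choices[0].message.content
```

With the runner from the example, ask(runner, "Tell me a story about the Rust type system.") should return the same kind of text that the example prints.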

Multimodal (audio + image) example

mistralrs also supports multimodal vision models that can reason over both images and audio clips, passed in the OpenAI-style audio_url / image_url format. The example below queries the Phi-4-Multimodal model with a single image and an audio recording. The text prompt refers to them with the <|audio_1|> and <|image_1|> tokens (indexing starts at 1):

from mistralrs import Runner, Which, ChatCompletionRequest, VisionArchitecture

runner = Runner(
    which=Which.VisionPlain(
        model_id="microsoft/Phi-4-multimodal-instruct",
        arch=VisionArchitecture.Phi4MM,
    ),
)

IMAGE_URL = "https://www.allaboutbirds.org/guide/assets/og/528129121-1200px.jpg"
AUDIO_URL = "https://upload.wikimedia.org/wikipedia/commons/4/42/Bird_singing.ogg"

response = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="phi4mm",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "audio_url", "audio_url": {"url": AUDIO_URL}},
                    {"type": "image_url", "image_url": {"url": IMAGE_URL}},
                    {
                        "type": "text",
                        "text": "<|audio_1|><|image_1|> Describe in detail what is happening, referencing both what you hear and what you see.",
                    },
                ],
            }
        ],
        max_tokens=256,
        temperature=0.2,
        top_p=0.9,
    )
)

print(response.choices[0].message.content)

See examples/python/phi4mm_audio.py for a ready-to-run version.
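
Keeping the placeholder tokens in step with the attached media is easy to get wrong by hand, so the content list can be generated instead. The following is a minimal sketch: build_multimodal_content is a hypothetical helper, not part of mistralrs. It assumes that a second clip or image would be numbered <|audio_2|> or <|image_2|>, which extends the 1-based indexing shown above but is not confirmed by the example.

```python
def build_multimodal_content(text, audio_urls=(), image_urls=()):
    """Build an OpenAI-style content list with 1-based placeholder tokens."""
    parts = []
    tokens = []
    # Assumption: each modality is numbered independently, starting at 1.
    for i, url in enumerate(audio_urls, start=1):
        parts.append({"type": "audio_url", "audio_url": {"url": url}})
        tokens.append(f"<|audio_{i}|>")
    for i, url in enumerate(image_urls, start=1):
        parts.append({"type": "image_url", "image_url": {"url": url}})
        tokens.append(f"<|image_{i}|>")
    prefix = "".join(tokens)
    # The placeholder tokens go at the front of the text prompt.
    parts.append({"type": "text", "text": f"{prefix} {text}" if prefix else text})
    return parts


content = build_multimodal_content(
    "Describe in detail what is happening.",
    audio_urls=["https://upload.wikimedia.org/wikipedia/commons/4/42/Bird_singing.ogg"],
    image_urls=["https://www.allaboutbirds.org/guide/assets/og/528129121-1200px.jpg"],
)
print(content[-1]["text"])
```

The returned list can be passed as the "content" value of the user message in the request above.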

Please find API docs here and the type stubs here, which are another great form of documentation.

We also provide a cookbook here!

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

mistralrs_cuda-0.6.0-cp312-none-win_amd64.whl (27.0 MB): CPython 3.12, Windows x86-64

mistralrs_cuda-0.6.0-cp312-cp312-manylinux_2_39_x86_64.whl (33.2 MB): CPython 3.12, manylinux (glibc 2.39+), x86-64

mistralrs_cuda-0.6.0-cp311-none-win_amd64.whl (27.0 MB): CPython 3.11, Windows x86-64

mistralrs_cuda-0.6.0-cp311-cp311-manylinux_2_39_x86_64.whl (33.2 MB): CPython 3.11, manylinux (glibc 2.39+), x86-64

mistralrs_cuda-0.6.0-cp310-none-win_amd64.whl (27.0 MB): CPython 3.10, Windows x86-64

mistralrs_cuda-0.6.0-cp310-cp310-manylinux_2_39_x86_64.whl (33.2 MB): CPython 3.10, manylinux (glibc 2.39+), x86-64
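
The wheel file names above follow the standard name-version-python-abi-platform layout, so the matching file for an interpreter and operating system can be read off the name. The sketch below splits a file name into those fields. parse_wheel_filename is a hypothetical helper written for illustration, and it uses only the standard library.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated tag fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five fields, or six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


info = parse_wheel_filename("mistralrs_cuda-0.6.0-cp312-cp312-manylinux_2_39_x86_64.whl")
print(info["python"], info["platform"])
```

Here the python field (cp312) identifies CPython 3.12, and the platform field (manylinux_2_39_x86_64) identifies x86-64 Linux with glibc 2.39 or newer.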

