
picoLLM Inference Engine Python Binding

Made in Vancouver, Canada by Picovoice

picoLLM Inference Engine

picoLLM Inference Engine is a highly accurate and cross-platform SDK optimized for running compressed large language models. picoLLM Inference Engine is:

  • Accurate; picoLLM Compression improves GPTQ by significant margins.
  • Private; LLM inference runs 100% locally.
  • Cross-platform.
  • Runs on CPU and GPU.
  • Free for open-weight models.

Compatibility

  • Python 3.9+
  • Runs on Linux (x86_64), macOS (arm64, x86_64), Windows (x86_64, arm64), and Raspberry Pi (3, 4, 5).

Installation

pip3 install picollm

Models

picoLLM Inference Engine supports the following open-weight models. The model files are available for download on Picovoice Console.

  • DeepSeek-OCR-2
    • deepseek-ocr-2
  • EmbeddingGemma
    • embeddinggemma-300m
  • Gemma
    • gemma-2b
    • gemma-2b-it
    • gemma-7b
    • gemma-7b-it
  • Gemma3
    • gemma-3-270m
    • gemma-3-270m-it
  • Llama-2
    • llama-2-7b
    • llama-2-7b-chat
    • llama-2-13b
    • llama-2-13b-chat
    • llama-2-70b
    • llama-2-70b-chat
  • Llama-3
    • llama-3-8b
    • llama-3-8b-instruct
    • llama-3-70b
    • llama-3-70b-instruct
  • Llama-3.2
    • llama3.2-1b-instruct
    • llama3.2-3b-instruct
  • Mistral
    • mistral-7b-v0.1
    • mistral-7b-instruct-v0.1
    • mistral-7b-instruct-v0.2
  • Mixtral
    • mixtral-8x7b-v0.1
    • mixtral-8x7b-instruct-v0.1
  • Phi-2
    • phi2
  • Phi-3
    • phi3
  • Phi-3.5
    • phi3.5
  • Qwen3-VL
    • qwen3-vl-2b-it

AccessKey

AccessKey is your authentication and authorization token for deploying Picovoice SDKs, including picoLLM. Anyone using Picovoice needs a valid AccessKey, and you must keep it secret. Even though LLM inference runs 100% offline and is completely free for open-weight models, internet connectivity is required to validate your AccessKey with Picovoice license servers. Everyone who signs up for Picovoice Console receives a unique AccessKey.

Usage

Text models

Create an instance of the engine and generate a prompt completion:

import picollm

pllm = picollm.create(
    access_key='${ACCESS_KEY}',
    model_path='${MODEL_PATH}')

res = pllm.generate(prompt='${PROMPT}')
print(res.completion)

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console, ${MODEL_PATH} with the path to a model file downloaded from Picovoice Console, and ${PROMPT} with a prompt string.

Instruction-tuned models (e.g., llama-3-8b-instruct, llama-2-7b-chat, and gemma-2b-it) have a specific chat template. You can either directly format the prompt or use a dialog helper:

dialog = pllm.get_dialog()
dialog.add_human_request(prompt)

res = pllm.generate(prompt=dialog.prompt())
dialog.add_llm_response(res.completion)
print(res.completion)
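If you format the prompt directly instead of using the dialog helper, you must reproduce the model's chat template exactly. As a rough sketch of what such a template looks like, the snippet below hand-builds a Llama-2-style chat prompt; the template strings are an assumption for illustration, and in practice the dialog helper builds the correct template for you.

```python
# Sketch of manually formatting a Llama-2-style chat prompt.
# The template strings here are assumptions for illustration;
# the dialog helper returned by `pllm.get_dialog()` handles this.

def format_llama2_chat(user_message, system_prompt=None):
    # Llama-2 chat wraps an optional system prompt in <<SYS>> tags
    # and the user turn in [INST] ... [/INST].
    if system_prompt is not None:
        body = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message}"
    else:
        body = user_message
    return f"[INST] {body} [/INST]"

prompt = format_llama2_chat("What is the capital of France?")
print(prompt)
```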

To interrupt completion generation before it has finished:

pllm.interrupt()

Finally, when done, be sure to release the resources explicitly:

pllm.release()

Vision models

To run a VLM such as qwen3-vl-2b-it:

res = pllm.generate_with_image(
    prompt='${PROMPT}',
    image_width=${IMAGE_NUM_PIXELS_WIDTH},
    image_height=${IMAGE_NUM_PIXELS_HEIGHT},
    image=${IMAGE_DATA})
print(res.completion)

Replace ${PROMPT} with a text prompt. For the image, supply its width and height in pixels along with its raw pixel values in 8-bit RGB format.
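As a sketch of what the image arguments look like, the raw buffer is width * height * 3 bytes of interleaved 8-bit R, G, B values; row-major pixel order is an assumption here. With a real image you would decode the file first, e.g. with Pillow's `Image.open(...).convert('RGB').tobytes()`. The snippet below builds a tiny synthetic buffer just to show the shape:

```python
# Build a tiny synthetic 2x2 RGB image to illustrate the expected
# buffer layout: one byte each for R, G, B per pixel, pixels packed
# row by row (row-major order is an assumption).
width, height = 2, 2
pixels = [
    (255, 0, 0), (0, 255, 0),      # first row: red, green
    (0, 0, 255), (255, 255, 255),  # second row: blue, white
]
image_data = bytes(channel for pixel in pixels for channel in pixel)

# The buffer must hold exactly 3 bytes per pixel.
assert len(image_data) == width * height * 3

# These values would then be passed as:
# pllm.generate_with_image(prompt='...', image_width=width,
#                          image_height=height, image=image_data)
```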

OCR models

To run an OCR model such as deepseek-ocr-2:

res = pllm.generate_ocr(
    image_width=${IMAGE_NUM_PIXELS_WIDTH},
    image_height=${IMAGE_NUM_PIXELS_HEIGHT},
    image=${IMAGE_DATA})
print(res.completion)

For the image, supply its width and height in pixels along with its raw pixel values in 8-bit RGB format, as described above for vision models.

Embedding models

To run an embedding model such as embeddinggemma-300m:

res = pllm.generate_embeddings(prompt='${PROMPT}')
for embedding in res:
    print(embedding)

Replace ${PROMPT} with a text prompt that you want to generate embeddings for.
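A common use of embeddings is measuring semantic similarity between texts. As a minimal sketch (assuming each returned embedding is a flat sequence of floats), cosine similarity between two embedding vectors can be computed with the standard library alone:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# With picoLLM these vectors would come from two calls to
# pllm.generate_embeddings(...); toy vectors are used here.
emb_a = [0.1, 0.3, 0.5]
emb_b = [0.1, 0.3, 0.5]
print(cosine_similarity(emb_a, emb_b))  # ~1.0 for identical vectors
```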

Demos

picollmdemo provides command-line utilities for LLM completion and chat using picoLLM.
