picoLLM Inference Engine demos

picoLLM Inference Engine Python Demos

Made in Vancouver, Canada by Picovoice

picoLLM Inference Engine

picoLLM Inference Engine is a highly accurate and cross-platform SDK optimized for running compressed large language models. picoLLM Inference Engine is:

  • Accurate; picoLLM Compression improves on GPTQ by significant margins
  • Private; LLM inference runs 100% locally
  • Cross-Platform
  • Runs on CPU and GPU
  • Free for open-weight models

Compatibility

  • Python 3.8+
  • Runs on Linux (x86_64), macOS (arm64, x86_64), Windows (x86_64), and Raspberry Pi (5 and 4).
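
The compatibility list above can be turned into a quick pre-install check. The following is a minimal sketch, not part of the picoLLM demos: the function name `is_supported` is hypothetical, and the machine strings (including `aarch64` for Raspberry Pi and `AMD64` on Windows) are assumptions about what `platform.machine()` reports.

```python
import platform
import sys

# System/architecture pairs taken from the compatibility list.
# Assumption: Raspberry Pi 5 and 4 report "aarch64" under a 64-bit Linux OS.
SUPPORTED = {
    "Linux": {"x86_64", "aarch64"},
    "Darwin": {"arm64", "x86_64"},
    "Windows": {"x86_64"},
}

MIN_PYTHON = (3, 8)


def is_supported(system, machine, version):
    """Return True if the given platform and Python version match the list."""
    if tuple(version) < MIN_PYTHON:
        return False
    arch = machine.lower()
    # Assumption: Windows commonly reports "AMD64" for x86_64 machines.
    if arch == "amd64":
        arch = "x86_64"
    return arch in SUPPORTED.get(system, set())


if __name__ == "__main__":
    ok = is_supported(platform.system(), platform.machine(), sys.version_info[:2])
    print("supported" if ok else "not in the compatibility list")
```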

Installation

pip3 install picollmdemo
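
After installing, it can be useful to confirm that the two demo commands used later in this document are reachable from the terminal. This is a small sketch using only the Python standard library; the helper name `find_demo_commands` is hypothetical.

```python
import shutil

# Command names as they appear in the Usage section of this document.
DEMO_COMMANDS = ("picollm_demo_completion", "picollm_demo_chat")


def find_demo_commands(which=shutil.which):
    """Map each demo command to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in DEMO_COMMANDS}


if __name__ == "__main__":
    for name, path in find_demo_commands().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A `None` value usually means the package was installed into an environment whose scripts directory is not on `PATH`.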

Models

picoLLM Inference Engine supports the following open-weight models. Model files are available for download from Picovoice Console.

  • Gemma
    • gemma-2b
    • gemma-2b-it
    • gemma-7b
    • gemma-7b-it
  • Llama-2
    • llama-2-7b
    • llama-2-7b-chat
    • llama-2-13b
    • llama-2-13b-chat
    • llama-2-70b
    • llama-2-70b-chat
  • Llama-3
    • llama-3-8b
    • llama-3-8b-instruct
    • llama-3-70b
    • llama-3-70b-instruct
  • Mistral
    • mistral-7b-v0.1
    • mistral-7b-instruct-v0.1
    • mistral-7b-instruct-v0.2
  • Mixtral
    • mixtral-8x7b-v0.1
    • mixtral-8x7b-instruct-v0.1
  • Phi-2
    • phi2
  • Phi-3
    • phi3

AccessKey

AccessKey is your authentication and authorization token for deploying Picovoice SDKs, including picoLLM. Anyone using Picovoice needs a valid AccessKey, and you must keep yours secret. You need internet connectivity to validate your AccessKey with Picovoice license servers, even though LLM inference runs 100% offline and is completely free for open-weight models. Everyone who signs up for Picovoice Console receives a unique AccessKey.
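
Because the AccessKey must stay secret, one common approach is to read it from an environment variable instead of writing it into scripts or shell history. The sketch below assumes a variable named `PICOVOICE_ACCESS_KEY`; that name and both helper functions are hypothetical and not defined by Picovoice.

```python
import os

# Hypothetical variable name chosen for this example.
ENV_VAR = "PICOVOICE_ACCESS_KEY"


def load_access_key(env=os.environ):
    """Return the AccessKey from the environment, or raise if it is missing."""
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"Set {ENV_VAR} to your AccessKey from Picovoice Console"
        )
    return key


def mask(key, visible=4):
    """Return a log-safe form of the key showing only its last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


if __name__ == "__main__":
    try:
        print("AccessKey loaded:", mask(load_access_key()))
    except RuntimeError as error:
        print(error)
```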

Usage

There are two demos available: completion and chat. The completion demo accepts a prompt and a set of optional parameters and generates a single completion. It can run all models, whether instruction-tuned or not. The chat demo runs instruction-tuned (chat) models such as llama-3-8b-instruct and phi2, and enables a back-and-forth conversation with the LLM, similar to ChatGPT.
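
The split between the two demos can be sketched as a lookup on the model name. This is only a naming heuristic based on the model list in this document (the `-it`, `-chat`, and `-instruct` markers, plus phi2, which the paragraph above names explicitly). The function `demos_for` is hypothetical, and the heuristic is deliberately conservative: a name without a marker, such as phi3, is reported as completion-only because this document does not say whether it works with the chat demo.

```python
# Markers that appear in the instruction-tuned names of the model list.
CHAT_MARKERS = ("-it", "-chat", "-instruct")

# Models the text names as chat-capable despite having no marker.
KNOWN_CHAT = {"phi2"}


def demos_for(model_name):
    """Return the demos a model is expected to work with, by name only."""
    name = model_name.lower()
    # The completion demo can run every model.
    demos = ["completion"]
    has_marker = any(
        name.endswith(marker) or f"{marker}-" in name for marker in CHAT_MARKERS
    )
    if has_marker or name in KNOWN_CHAT:
        demos.append("chat")
    return demos


if __name__ == "__main__":
    for model in ("llama-2-7b", "llama-3-8b-instruct", "mistral-7b-instruct-v0.2"):
        print(model, demos_for(model))
```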

Completion Demo

Run the demo by entering the following in the terminal:

picollm_demo_completion --access_key ${ACCESS_KEY} --model_path ${MODEL_PATH} --prompt ${PROMPT}

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console, ${MODEL_PATH} with the path to a model file downloaded from Picovoice Console, and ${PROMPT} with a prompt string (quote it if it contains spaces).

To get information about all the available options in the demo, run the following:

picollm_demo_completion --help
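
When the completion demo is launched from a script, passing the arguments as a list avoids shell quoting problems with prompts that contain spaces. The following is a minimal sketch that uses only the three flags shown above; the helper names and the placeholder values are hypothetical.

```python
import shlex
import subprocess


def build_completion_command(access_key, model_path, prompt):
    """Build the argument list for the completion demo."""
    return [
        "picollm_demo_completion",
        "--access_key", access_key,
        "--model_path", model_path,
        "--prompt", prompt,
    ]


def run_completion(access_key, model_path, prompt):
    """Run the demo and raise CalledProcessError if it exits with an error."""
    command = build_completion_command(access_key, model_path, prompt)
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder values; substitute your own AccessKey and model file path.
    command = build_completion_command(
        "YOUR_ACCESS_KEY", "/path/to/model-file", "Tell me a joke"
    )
    # shlex.join shows how the same command would be quoted in a shell.
    print(shlex.join(command))
```

Each list element reaches the demo as one argument, so the prompt needs no extra quoting here.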

Chat Demo

To run an instruction-tuned model for chat, run the following in the terminal:

picollm_demo_chat --access_key ${ACCESS_KEY} --model_path ${MODEL_PATH}

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console and ${MODEL_PATH} with the path to a model file downloaded from Picovoice Console.

To get information about all the available options in the demo, run the following:

picollm_demo_chat --help
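
Since the chat demo is interactive, a wrapper script should leave the terminal attached rather than capture output. The sketch below shows one way to do that with the two flags shown above. The helper names are hypothetical, and treating Ctrl-C as a normal way to end the session is an assumption of this example, not documented behavior of the demo.

```python
import subprocess


def chat_command(access_key, model_path):
    """Build the argument list for the chat demo."""
    return [
        "picollm_demo_chat",
        "--access_key", access_key,
        "--model_path", model_path,
    ]


def run_chat(access_key, model_path, runner=subprocess.run):
    """Start the chat demo on the current terminal and return its exit code."""
    try:
        # No stdin/stdout redirection, so the user types directly into the demo.
        return runner(chat_command(access_key, model_path)).returncode
    except KeyboardInterrupt:
        # Assumption: Ctrl-C is treated as a normal end of the conversation.
        return 0
```

The `runner` parameter exists only so the function can be exercised without launching the real demo.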
