A simple and efficient Python library for fast inference of GGUF Large Language Models.

ALLM

ALLM is a Python library for fast inference of Large Language Models (LLMs) stored in GGUF, the model file format used by llama.cpp, on both CPU and GPU. It provides a convenient interface for loading pre-trained GGUF models and running inference with them. The library suits applications where quick response times matter, such as chatbots and text generation.

Features

  • Efficient Inference: ALLM uses GGUF models to provide fast and accurate inference.
  • CPU and GPU Support: The library is optimized for both CPU and GPU, allowing you to choose the best hardware for your application.
  • Simple Interface: With straightforward command-line support, you can load a model and start inference with a single command.
  • Flexible Configuration: Customize inference settings such as temperature and model path to suit your needs.

Installation

You can install ALLM using pip:

pip install allm

Usage

You can start inference with the 'allm-run' command. The command takes a model name or path as its required argument, plus optional arguments for temperature, max new tokens, and additional model kwargs.

allm-run --name model_name_or_path
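
If you want to launch the command from a Python script, the call above can be wrapped with the standard library. This is only a sketch: it uses the '--name' flag shown above and nothing else, because the exact flag names for temperature and max new tokens are not given here. The helper names build_run_command and run_model are hypothetical and not part of ALLM.

```python
import shutil
import subprocess


def build_run_command(name_or_path: str) -> list[str]:
    """Return the argv list for `allm-run --name <name_or_path>`."""
    if not name_or_path:
        raise ValueError("a model name or path is required")
    return ["allm-run", "--name", name_or_path]


def run_model(name_or_path: str) -> int:
    """Run allm-run in the foreground and return its exit code."""
    # Fail early with a clear message if the CLI is not installed.
    if shutil.which("allm-run") is None:
        raise FileNotFoundError("allm-run not found on PATH; is ALLM installed?")
    return subprocess.run(build_run_command(name_or_path)).returncode


if __name__ == "__main__":
    # Only prints the command; call run_model(...) to actually start inference.
    print(build_run_command("Mistral"))
```

Passing the arguments as a list avoids shell quoting problems when the model path contains spaces.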

API

After initialising or downloading the model, you can start the inference API with the 'allm-serve' command. The command takes host and port as optional arguments; if they are not provided, the API runs on the default address 127.0.0.1:5000.

allm-serve
allm-serve --host 192.168.0.1 --port 8000
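
When a script starts the server, it is often useful to wait until the port accepts connections before sending requests. The sketch below builds the 'allm-serve' command from the optional host and port and polls the TCP port with the standard library. It assumes nothing about the HTTP routes the API exposes, which are not described here, and the helper names build_serve_command and wait_for_port are hypothetical.

```python
import socket
import time


def build_serve_command(host=None, port=None):
    """Return the argv list for allm-serve, adding --host/--port only if given."""
    cmd = ["allm-serve"]
    if host is not None:
        cmd += ["--host", host]
    if port is not None:
        cmd += ["--port", str(port)]
    return cmd


def wait_for_port(host="127.0.0.1", port=5000, timeout=10.0):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # not listening yet; retry until the deadline
    return False
```

A typical use would be to start the server with subprocess.Popen(build_serve_command()) and then call wait_for_port() before issuing the first request. A successful connection shows only that the port is open, not that the model has finished loading.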

Supported Model names

Llama2, llama, llama2_chat, Llama_chat, Mistral, Mistral_instruct

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

ALLMDEV-1.2-py3-none-any.whl (5.6 kB), uploaded for Python 3

File details

Details for the file ALLMDEV-1.2-py3-none-any.whl.

File metadata

  • Download URL: ALLMDEV-1.2-py3-none-any.whl
  • Upload date:
  • Size: 5.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for ALLMDEV-1.2-py3-none-any.whl

Algorithm      Hash digest
SHA256         01f06807af25ae3265bdcf01aca5d2edacb7334eaf6d795202ca79893555931b
MD5            de8de665ae02fd4370a4dbc359da337c
BLAKE2b-256    b16487acfc71108711d575f04281a118ae7860d75c2c2ca4a7b24d10ca3c67c4
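
To check a downloaded wheel against the SHA256 digest listed above, a short script along these lines can be used. It is a minimal sketch that assumes the wheel has been saved locally under the file name shown above. The helper names sha256_of and verify are hypothetical.

```python
import hashlib
import hmac

# SHA256 digest published for ALLMDEV-1.2-py3-none-any.whl (copied from the table above).
EXPECTED_SHA256 = "01f06807af25ae3265bdcf01aca5d2edacb7334eaf6d795202ca79893555931b"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals `expected`."""
    return hmac.compare_digest(sha256_of(path), expected.lower())


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the wheel was saved.
    print(verify("ALLMDEV-1.2-py3-none-any.whl"))
```

pip can perform the same check during installation when a requirements file pins the digest and pip is run with --require-hashes.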
