
Easy deployment of quantized Llama models on CPU

Project description

glai

GGUF LLAMA AI - Package for simplified text generation with Llama models quantized to GGUF format

Provides high-level APIs for loading models and generating text completions. The full API documentation is available at: https://laelhalawani.github.io/glai/

High Level API Classes:

AutoAI:

  • Automatically searches for and loads a model based on name/quantization/keyword.
  • Handles downloading model data, loading it to memory, and configuring message formatting.
  • Use generate() method to get completions by providing a user message.

EasyAI:

  • Allows manually configuring model data source - from file, URL, or ModelDB search.
  • Handles downloading model data, loading it to memory, and configuring message formatting.
  • Use generate() method to get completions by providing a user message.

ModelDB (used by AutoAI and EasyAI):

  • Manages a database of model data files via ModelData class objects.
  • Useful for searching for models and retrieving model metadata.
  • Can import models from Hugging Face repo URLs, or import and download models from direct .gguf file URLs on Hugging Face.

ModelData (used by ModelDB):

  • Represents metadata and info about a specific model.
  • Used by ModelDB to track and load models.
  • Can be initialized from URL, file, or ModelDB search.
  • Used by ModelDB to download the model's .gguf file.
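The exact fields of ModelData are not shown here, but conceptually each entry tracks something like the following. This is an illustrative mock only (the class name, field names, and URL below are assumptions, not glai's actual API):

```python
from dataclasses import dataclass, field

# Illustrative mock only -- glai's real ModelData class may differ.
@dataclass
class ModelDataSketch:
    name: str                       # model name, e.g. "zephyr"
    quantization: str               # GGUF quantization label, e.g. "q2_k"
    url: str                        # direct .gguf download URL
    keywords: list = field(default_factory=list)  # search keywords

entry = ModelDataSketch(
    name="zephyr",
    quantization="q2_k",
    url="https://example.invalid/zephyr.q2_k.gguf",  # placeholder URL
    keywords=["zephyr", "7b"],
)
print(entry.name, entry.quantization)
```

A ModelDB can then search these entries by name, quantization, or keyword, and use the URL to fetch the file on demand.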

Installation

To install the package, use pip:

pip install glai

Usage

The examples below demonstrate the most common workflows.

Import package

from glai import AutoAI, EasyAI, ModelDB, ModelData

AutoAI - automatic model loading

ai = AutoAI(name_search="Mistral")
ai.generate("Hello") 

EasyAI - manual model configuration

easy = EasyAI()
easy.load_model_db()
easy.find_model_data(name_search="Mistral")
easy.load_ai()
easy.generate("Hello")

ModelDB - search models and show db info

from glai import ModelDB
db = ModelDB()
model = db.find_model(name_search="Mistral")
print(model.name)
db.show_db_info()

Import Models from Repo

Import models from a Hugging Face repo into the model database:

from glai.back_end.model_db.db import ModelDB

mdb = ModelDB('./gguf_db', False)
mdb.import_models_from_repo(
    hf_repo_url="https://huggingface.co/TheBloke/SOLAR-10.7B-Instruct-v1.0-GGUF",
    user_tags=["[INST]", "[/INST]"],
    ai_tags=["", ""],
    description="We introduce SOLAR-10.7B, an advanced large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. It's compact, yet remarkably powerful, and demonstrates unparalleled state-of-the-art performance in models with parameters under 30B.",
    keywords=["10.7B", "upstage", "instruct", "solar"],
    replace_existing=False,
)
mdb.show_db_info()
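The user_tags and ai_tags presumably tell glai how to wrap each message into the model's chat template when building the prompt. A minimal sketch of that kind of formatting (an assumption for illustration, not glai's actual prompt builder; format_prompt is a hypothetical helper):

```python
def format_prompt(user_message, user_tags, ai_tags, ai_prefix=""):
    """Wrap a user message in the model's chat-template tags.

    Illustrative only -- glai's real prompt construction may differ.
    """
    user_open, user_close = user_tags
    ai_open, _ai_close = ai_tags
    # The AI's opening tag (plus any forced prefix) starts the completion.
    return f"{user_open}{user_message}{user_close}{ai_open}{ai_prefix}"

prompt = format_prompt("Hello", ("[INST]", "[/INST]"), ("", ""))
print(prompt)  # [INST]Hello[/INST]
```

With the SOLAR tags above, a user message is framed as `[INST]...[/INST]` and the model continues from there; models with non-empty ai_tags would get their opening tag appended as well.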

AutoAI Quick Example

Quickly generate using AutoAI:

from glai.ai import AutoAI

auto_ai = AutoAI("zephyr", "q2_k", new_tokens=50, max_input_tokens=100)
auto_ai.generate(
    user_message_text="Output just 'hi' in single quotes with no other prose. Do not include any additional information nor comments.",
    ai_message_to_be_continued="'",
    stop_at="'",
    include_stop_str=True
)
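Here ai_message_to_be_continued forces the completion to start with `'`, while stop_at and include_stop_str presumably control where the generated text is cut off. A hedged sketch of that truncation logic (truncate_at is a hypothetical helper, not glai's actual code):

```python
def truncate_at(text, stop_at=None, include_stop_str=True):
    """Cut generated text at the first occurrence of a stop string.

    Illustrative only -- glai's real stopping logic may differ.
    """
    if stop_at is None:
        return text  # no stop string: return everything generated
    idx = text.find(stop_at)
    if idx == -1:
        return text  # stop string never appeared
    end = idx + len(stop_at) if include_stop_str else idx
    return text[:end]

print(truncate_at("hi' and some extra text", stop_at="'", include_stop_str=True))  # hi'
```

Combined with the forced `'` prefix, the example above should yield a final answer like `'hi'`: the prefix plus the completion truncated at (and including) the closing quote.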

EasyAI Step By Step Example

Step by step generation with EasyAI:

from glai.ai import EasyAI

easy_ai = EasyAI()
easy_ai.load_model_db('./gguf_db')
easy_ai.find_model_data("zephyr", "q2_k")
easy_ai.load_ai()
easy_ai.generate(
    "Output a list of 3 strings. The first string should be `hi`, the second string should be `there`, and the third string should be `!`.",
    "['",
    "']"
)

EasyAI All In One Example

All in one generation with EasyAI:

from glai.ai import EasyAI

easy_ai = EasyAI()
easy_ai.configure(
    model_db_dir="./gguf_db",
    name_search="zephyr",
    quantization_search="q2_k",
    new_tokens=50,
    max_input_tokens=100
)
easy_ai.generate(
    "Output a python list of 3 unique cat names.", 
    "['", 
    "']"
)

AutoAI from Dict Example

Generate from AutoAI using a config dict:

from glai.ai import AutoAI

conf = {
  "model_db_dir": "./gguf_db",
  "name_search": "zephyr",
  "quantization_search": "q2_k",
  "keyword_search": None,
  "new_tokens": 50,
  "max_input_tokens": 100  
}

AutoAI(**conf).generate(
  "Please output only the provided message as python list.\nMessage:`This string`.",
  "['", 
  "]", 
  True
)

EasyAI from Dict Example

Generate from EasyAI using a config dict:

from glai.ai import EasyAI

conf = {
  "model_db_dir": "./gguf_db",
  "name_search": "zephyr",
  "quantization_search": "q2_k",
  "keyword_search": None,
  "new_tokens": 50,
  "max_input_tokens": 100
}

EasyAI(**conf).generate(
  "Please output only the provided message as python list.\nMessage:`This string`.",
  "['",
  "']",
  True  
)

EasyAI from URL Example

Get a model from a URL and generate:

from glai.back_end.model_db.db import ModelDB
from glai.ai import EasyAI

mdb = ModelDB('./gguf_db', False)
mdb.show_db_info()

eai = EasyAI()
eai.load_model_db('./gguf_db')
eai.model_data_from_url(
    url="https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/blob/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",
    user_tags=("[INST]", "[/INST]"),
    ai_tags=("", ""),
    description="The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. The Mixtral-8x7B outperforms Llama 2 70B on most benchmarks we tested.",
    keywords=["mixtral", "8x7b", "instruct", "v0.1", "MoE"],
    save=True,
)
eai.load_ai()
eai.generate(
    user_message="Write a short joke that's actually hilarious.",
    ai_response_content_tbc="",
    stop_at=None,
    include_stop_str=True,
)
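When importing from a direct .gguf URL like this, the model name and quantization can presumably be recovered from the filename itself. A sketch of that kind of parsing (parse_gguf_url is a hypothetical helper; glai may index models differently):

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath

def parse_gguf_url(url):
    """Split a .gguf filename into (model_name, quantization).

    Illustrative only -- assumes the common convention that the
    quantization label is the last dot-separated part of the stem.
    """
    filename = PurePosixPath(urlparse(url).path).name
    stem = filename[:-len(".gguf")]
    name, _, quant = stem.rpartition(".")
    return name, quant.lower()

url = ("https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/"
       "blob/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf")
print(parse_gguf_url(url))  # ('mixtral-8x7b-instruct-v0.1', 'q4_k_m')
```

This matches how the examples above search the database by name ("mixtral") and quantization ("q4_k_m") after an import.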

Detailed API documentation can be found here: https://laelhalawani.github.io/glai/

Download files

Download the file for your platform.

Source Distribution

glai-0.0.15.tar.gz (24.8 kB)

Uploaded Source

Built Distribution

glai-0.0.15-py3-none-any.whl (33.2 kB)

Uploaded Python 3

File details

Details for the file glai-0.0.15.tar.gz.

File metadata

  • Download URL: glai-0.0.15.tar.gz
  • Upload date:
  • Size: 24.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for glai-0.0.15.tar.gz

  • SHA256: 1e1906694738f51af46a9d02f8b9cb164ffa62366953eebf3f4bcbd466f9535a
  • MD5: 2c37319bf74a2d1affc48ebcc87b1315
  • BLAKE2b-256: b186020d668bf2264a0b17022ff58a7dfb6a1079b7fd99f7a10df41f98bd52c2


File details

Details for the file glai-0.0.15-py3-none-any.whl.

File metadata

  • Download URL: glai-0.0.15-py3-none-any.whl
  • Upload date:
  • Size: 33.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for glai-0.0.15-py3-none-any.whl

  • SHA256: 35ad5304e765c6b5fd8bbb85b1c93cd195f0c9b9ca3711f4bfb0616d24ce03aa
  • MD5: e596d37107a1f5f59d790ec21c4c0f33
  • BLAKE2b-256: b0c6e67882c41ff2e93fa29955db8e868ea6f796f2ff5dfb75d47c1b5aa8897b

