Easy deployment of quantized Llama models on CPU

Project description

glai (GGUF LLAMA AI) is a package for simplified text generation with Llama models quantized to the GGUF format.

It provides high-level APIs for loading models and generating text completions.

High Level API Classes:

AutoAI:

  • Automatically searches for and loads a model based on name/quantization/keyword.
  • Handles downloading model data, loading it to memory, and configuring message formatting.
  • Use the generate() method to get completions by providing a user message.

EasyAI:

  • Lets you configure the model data source manually: from a file, a URL, or a ModelDB search.
  • Handles downloading model data, loading it to memory, and configuring message formatting.
  • Use the generate() method to get completions by providing a user message.

ModelDB (used by AutoAI and EasyAI):

  • Manages a database of model data files via ModelData class objects.
  • Useful for searching for models and retrieving model metadata.
  • Can import models from HuggingFace repo URLs, or import and download models from .gguf URLs on HuggingFace.

ModelData (used by ModelDB):

  • Represents metadata and info about a specific model.
  • Used by ModelDB to track and load models.
  • Can be initialized from URL, file, or ModelDB search.
  • Used by ModelDB to download the model's .gguf file.
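
The name/quantization/keyword search described above can be pictured as a case-insensitive filter over metadata records. The sketch below is only an illustration of that idea: ModelRecord and find_model are hypothetical names written for this example, not glai's ModelData or ModelDB, and glai's real matching rules may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelRecord:
    """Hypothetical stand-in for a model metadata entry (not glai's ModelData)."""
    name: str
    quantization: str
    keywords: list = field(default_factory=list)


def find_model(records, name_search=None, quantization_search=None, keyword_search=None):
    """Return the first record matching every filter that was supplied, else None."""

    def contains(haystack: str, needle: Optional[str]) -> bool:
        # A missing filter matches everything; otherwise compare case-insensitively.
        return needle is None or needle.lower() in haystack.lower()

    for record in records:
        if (
            contains(record.name, name_search)
            and contains(record.quantization, quantization_search)
            and contains(" ".join(record.keywords), keyword_search)
        ):
            return record
    return None


records = [
    ModelRecord("zephyr-7b-beta", "Q2_K", ["zephyr", "7b"]),
    ModelRecord("mistral-7b-instruct", "Q4_K_M", ["mistral", "instruct"]),
]
match = find_model(records, name_search="Mistral", quantization_search="q4")
print(match.name if match else "no match")  # mistral-7b-instruct
```

Treating an omitted filter as "match anything" is what lets a single name, such as "Mistral", be enough to select a model.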

Usage:

Import package

from glai import AutoAI, EasyAI, ModelDB, ModelData

AutoAI - automatic model loading

ai = AutoAI(name_search="Mistral")
ai.generate("Hello") 

EasyAI - manual model configuration

easy = EasyAI()
easy.load_model_db()
easy.find_model_data(name_search="Mistral")
easy.load_ai()
easy.generate("Hello")

ModelDB - search models and show db info

from glai import ModelDB
db = ModelDB()
model = db.find_model(name_search="Mistral")
print(model.name)
db.show_db_info()

GGUF Examples Tutorial

This tutorial provides examples for using the glai package.

Import Models from Repo

Import models from a HuggingFace repo into the model database:

from glai.back_end.model_db.db import ModelDB

mdb = ModelDB('./gguf_db', False)
mdb.import_models_from_repo(
    hf_repo_url="https://huggingface.co/TheBloke/SOLAR-10.7B-Instruct-v1.0-GGUF",
    user_tags=["[INST]", "[/INST]"],
    ai_tags=["", ""],
    description="We introduce SOLAR-10.7B, an advanced large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. It's compact, yet remarkably powerful, and demonstrates unparalleled state-of-the-art performance in models with parameters under 30B.",
    keywords=["10.7B", "upstage", "instruct", "solar"],
    replace_existing=False,
)
mdb.show_db_info()
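
The user_tags and ai_tags arguments describe the markers placed around each side of the conversation. A rough sketch of how such tag pairs could be combined into a prompt is shown below. format_prompt is a hypothetical helper written for this example, and the spacing it uses is an assumption; the exact string glai sends to the model is not documented here.

```python
def format_prompt(user_message, user_tags, ai_tags, ai_message_to_be_continued=""):
    """Wrap the user message in its tags, then open the AI turn.

    Hypothetical helper: the whitespace and tag placement are assumptions.
    """
    user_open, user_close = user_tags
    ai_open, _ai_close = ai_tags  # the closing AI tag is left for the end of the reply
    user_part = f"{user_open} {user_message} {user_close}"
    # Any text passed as the start of the AI message is appended so the
    # model continues from it instead of starting a fresh reply.
    return f"{user_part}{ai_open}{ai_message_to_be_continued}"


prompt = format_prompt(
    "Hello",
    user_tags=("[INST]", "[/INST]"),
    ai_tags=("", ""),
)
print(prompt)  # [INST] Hello [/INST]
```

With empty ai_tags, as in the SOLAR import above, the AI turn simply begins right after the closing user tag.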

AutoAI Quick Example

Quickly generate using AutoAI:

from glai.ai import AutoAI

auto_ai = AutoAI("zephyr", "q2_k", new_tokens=50, max_input_tokens=100)
auto_ai.generate(
    user_message_text="Output just 'hi' in single quotes with no other prose. Do not include any additional information nor comments.",
    ai_message_to_be_continued= "'",
    stop_at="'",
    include_stop_str=True
)
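
The stop_at and include_stop_str arguments control where the completion is cut off. The sketch below shows one plausible reading of those two parameters on a plain string. cut_at_stop is a hypothetical helper, not part of glai, and it assumes the stop string is searched for only in the newly generated text.

```python
def cut_at_stop(text, stop_at, include_stop_str=True):
    """Truncate text at the first occurrence of stop_at.

    Hypothetical helper illustrating stop-string handling.
    """
    if not stop_at:
        return text  # no stop string configured: keep everything
    index = text.find(stop_at)
    if index == -1:
        return text  # stop string never appeared
    end = index + len(stop_at) if include_stop_str else index
    return text[:end]


raw = "hi' and some trailing text"
print(cut_at_stop(raw, "'", include_stop_str=True))   # hi'
print(cut_at_stop(raw, "'", include_stop_str=False))  # hi
```

Starting the AI message with an opening quote and stopping at the closing quote is what keeps the reply down to the quoted word.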

EasyAI Step By Step Example

Step by step generation with EasyAI:

from glai.ai import EasyAI

easy_ai = EasyAI()
easy_ai.load_model_db('./gguf_db')
easy_ai.find_model_data("zephyr", "q2_k")
easy_ai.load_ai()
easy_ai.generate(
    "Output a list of 3 strings. The first string should be `hi`, the second string should be `there`, and the third string should be `!`.",
    "['",
    "']"
)

EasyAI All In One Example

All in one generation with EasyAI:

from glai.ai import EasyAI

easy_ai = EasyAI()
easy_ai.configure(
    model_db_dir="./gguf_db",
    name_search="zephyr",
    quantization_search="q2_k",
    new_tokens=50,
    max_input_tokens=100
)
easy_ai.generate(
    "Output a python list of 3 unique cat names.", 
    "['", 
    "']"
)
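
Starting the AI message with "['" and stopping at "']" nudges the model toward text that can be read back as a Python list. A minimal sketch of that read-back step follows. parse_list_completion is a hypothetical helper, and it assumes the returned completion excludes the "['" prefix but includes the stop string; check what generate() actually returns before relying on this.

```python
import ast


def parse_list_completion(prefix, completion):
    """Rebuild the full list literal and parse it safely.

    Returns None when the text is not a valid Python list literal.
    """
    text = prefix + completion
    try:
        # literal_eval only accepts literals, so no code is executed.
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None
    return value if isinstance(value, list) else None


print(parse_list_completion("['", "Milo', 'Luna', 'Cleo']"))  # ['Milo', 'Luna', 'Cleo']
print(parse_list_completion("['", "Milo', 'Luna"))            # None
```

Returning None for malformed output makes it easy to retry the generation when a small quantized model drifts from the requested format.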

AutoAI from Dict Example

Generate from AutoAI using a config dict:

from glai.ai import AutoAI

conf = {
  "model_db_dir": "./gguf_db",
  "name_search": "zephyr",
  "quantization_search": "q2_k",
  "keyword_search": None,
  "new_tokens": 50,
  "max_input_tokens": 100  
}

AutoAI(**conf).generate(
  "Please output only the provided message as python list.\nMessage:`This string`.",
  "['", 
  "]", 
  True
)

EasyAI from Dict Example

Generate from EasyAI using a config dict:

from glai.ai import EasyAI

conf = {
  "model_db_dir": "./gguf_db",
  "name_search": "zephyr",
  "quantization_search": "q2_k",
  "keyword_search": None,
  "new_tokens": 50,
  "max_input_tokens": 100
}

EasyAI(**conf).generate(
  "Please output only the provided message as python list.\nMessage:`This string`.",
  "['",
  "']",
  True  
)
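
Because the configuration is a plain dict, it can also be kept in a JSON file and unpacked with ** in the same way. The sketch below is one way to do that under stated assumptions: load_config and the DEFAULTS values are hypothetical (the defaults are copied from the example above, not from glai), and rejecting unknown keys is a design choice of this example.

```python
import json

# Hypothetical defaults mirroring the example config; not glai's own defaults.
DEFAULTS = {
    "model_db_dir": "./gguf_db",
    "name_search": None,
    "quantization_search": None,
    "keyword_search": None,
    "new_tokens": 50,
    "max_input_tokens": 100,
}


def load_config(json_text):
    """Merge a JSON config over the defaults, rejecting unknown keys."""
    loaded = json.loads(json_text)  # JSON null becomes Python None
    unknown = set(loaded) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    conf = dict(DEFAULTS)
    conf.update(loaded)
    return conf


conf = load_config(
    '{"name_search": "zephyr", "quantization_search": "q2_k", "keyword_search": null}'
)
print(conf)
```

The resulting dict has the same keys as the example config, so it could be passed as EasyAI(**conf) or AutoAI(**conf) in the same manner.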

EasyAI from URL Example

Get a model from a URL and generate:

from glai.back_end.model_db.db import ModelDB
from glai.ai import EasyAI

mdb = ModelDB('./gguf_db', False)
mdb.show_db_info()

eai = EasyAI()
eai.load_model_db('./gguf_db')
eai.model_data_from_url(
    url="https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/blob/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",
    user_tags=("[INST]", "[/INST]"),
    ai_tags=("", ""),
    description="The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. The Mixtral-8x7B outperforms Llama 2 70B on most benchmarks we tested.",
    keywords=["mixtral", "8x7b", "instruct", "v0.1", "MoE"],
    save=True,
)
eai.load_ai()
eai.generate(
    user_message="Write a short joke that's actually super funny hilarious best joke.",
    ai_response_content_tbc="",
    stop_at=None,
    include_stop_str=True,
)
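
The URL above points at the HuggingFace file viewer (/blob/), while direct downloads on HuggingFace use the /resolve/ path. The sketch below shows how the repo, filename, and quantization could be read from such a URL. parse_gguf_url is a hypothetical helper written for this example, and whether glai performs this conversion internally is not stated here.

```python
import re
from urllib.parse import urlparse


def parse_gguf_url(url):
    """Split a HuggingFace .gguf URL into repo id, filename, and quantization."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected path layout: <owner>/<repo>/<blob|resolve>/<revision>/<filename>
    if len(parts) < 5 or parts[2] not in ("blob", "resolve"):
        raise ValueError(f"not a HuggingFace file URL: {url}")
    owner, repo, _, revision = parts[:4]
    filename = "/".join(parts[4:])
    if not filename.lower().endswith(".gguf"):
        raise ValueError(f"not a .gguf file: {filename}")
    # Quantization labels such as Q4_K_M or Q2_K usually sit just before ".gguf".
    quant = re.search(r"\.(I?Q\d[A-Z0-9_]*|F16|F32|BF16)\.gguf$", filename, re.IGNORECASE)
    return {
        "repo_id": f"{owner}/{repo}",
        "filename": filename,
        "quantization": quant.group(1) if quant else None,
        "download_url": f"https://huggingface.co/{owner}/{repo}/resolve/{revision}/{filename}",
    }


info = parse_gguf_url(
    "https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/blob/main/"
    "mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf"
)
print(info["repo_id"])       # TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF
print(info["quantization"])  # Q4_K_M
```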

Source Distribution

glai-0.0.11.tar.gz (24.6 kB)

Built Distribution

glai-0.0.11-py3-none-any.whl (33.3 kB)
