Easy deployment of quantized Llama models on CPU

Project description

glai

GGUF LLAMA AI - Package for simplified text generation with Llama models quantized to GGUF format

Provides high level APIs for loading models and generating text completions. Visit API documentation at: https://laelhalawani.github.io/glai/

High Level API Classes:

AutoAI:

  • Automatically searches for and loads a model based on name/quantization/keyword.
  • Handles downloading model data, loading it to memory, and configuring message formatting.
  • Use generate() method to get completions by providing a user message.

EasyAI:

  • Allows manually configuring model data source - from file, URL, or ModelDB search.
  • Handles downloading model data, loading it to memory, and configuring message formatting.
  • Use generate() method to get completions by providing a user message.

ModelDB (used by AutoAI and EasyAI):

  • Manages a database of model data files via ModelData class objects.
  • Useful for searching for models and retrieving model metadata.
  • Can import models from HuggingFace repo URLs, or import and download models from .gguf URLs on HuggingFace.

ModelData (used by ModelDB):

  • Represents metadata and info about a specific model.
  • Used by ModelDB to track and load models.
  • Can be initialized from URL, file, or ModelDB search.
  • Used by ModelDB to download the model's GGUF file.

Installation

To install the package, use pip:

pip install glai
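
To confirm that the install succeeded, the standard library can report the installed version. This is a minimal sketch; `installed_version` is a hypothetical helper written for illustration and is not part of glai.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Hypothetical check; prints the version string if glai is present.
found = installed_version("glai")
print(f"glai {found}" if found else "glai is not installed")
```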

Usage

Import package

from glai import AutoAI, EasyAI, ModelDB, ModelData

AutoAI - automatic model loading

ai = AutoAI(name_search="Mistral")
ai.generate("Hello") 

EasyAI - manual model configuration

easy = EasyAI()
easy.load_model_db()
easy.find_model_data(name_search="Mistral")
easy.load_ai()
easy.generate("Hello")

ModelDB - search models and show db info

from glai import ModelDB
db = ModelDB()
model = db.find_model(name_search="Mistral")
print(model.name)
db.show_db_info()
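
To build intuition for what a name/quantization search does, here is a minimal sketch of case-insensitive substring matching over GGUF file names. It is an assumption for illustration only, not glai's actual search logic; `search_files` and the zephyr file names are hypothetical.

```python
from typing import List, Optional


def search_files(
    filenames: List[str],
    name_search: Optional[str] = None,
    quantization_search: Optional[str] = None,
) -> List[str]:
    """Return the file names that contain every given search term, ignoring case."""
    terms = [t.lower() for t in (name_search, quantization_search) if t]
    return [f for f in filenames if all(t in f.lower() for t in terms)]


# Hypothetical file list used only to demonstrate the matching.
files = [
    "zephyr-7b-beta.Q2_K.gguf",
    "zephyr-7b-beta.Q4_K_M.gguf",
    "mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",
]

print(search_files(files, "zephyr", "q2_k"))  # ['zephyr-7b-beta.Q2_K.gguf']
```

Matching on lowercase text is why a search term such as "q2_k" can find a file whose name spells the quantization as "Q2_K".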

Import Models from Repo

Import models from a HuggingFace repo into the model database:

from glai.back_end.model_db.db import ModelDB

mdb = ModelDB('./gguf_db', False)
mdb.import_models_from_repo(
    hf_repo_url="https://huggingface.co/TheBloke/SOLAR-10.7B-Instruct-v1.0-GGUF",
    user_tags=["[INST]", "[/INST]"],
    ai_tags=["", ""],
    description="We introduce SOLAR-10.7B, an advanced large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. It's compact, yet remarkably powerful, and demonstrates unparalleled state-of-the-art performance in models with parameters under 30B.",
    keywords=["10.7B", "upstage", "instruct", "solar"],
    replace_existing=False,
)
mdb.show_db_info()
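
The `user_tags` and `ai_tags` pairs describe how messages are wrapped before they reach the model. The sketch below shows one plausible way such tag pairs could be combined into a prompt. The function `format_prompt`, the spacing around the user message, and the decision to leave the AI closing tag off are all assumptions made for illustration; glai's own formatting may differ.

```python
from typing import Sequence


def format_prompt(
    user_message: str,
    user_tags: Sequence[str],
    ai_tags: Sequence[str],
    ai_message_to_be_continued: str = "",
) -> str:
    """Wrap the user message in its tags and open the AI reply for continuation."""
    user_open, user_close = user_tags
    ai_open, _ai_close = ai_tags  # closing tag unused: the reply is left open
    user_part = f"{user_open} {user_message} {user_close}"
    return f"{user_part}{ai_open}{ai_message_to_be_continued}"


print(format_prompt("Hello", ["[INST]", "[/INST]"], ["", ""]))
# [INST] Hello [/INST]
```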

AutoAI Quick Example

Quickly generate using AutoAI:

from glai.ai import AutoAI

auto_ai = AutoAI("zephyr", "q2_k", new_tokens=50, max_input_tokens=100)
auto_ai.generate(
    user_message_text="Output just 'hi' in single quotes with no other prose. Do not include any additional information nor comments.",
    ai_message_to_be_continued= "'",
    stop_at="'",
    include_stop_str=True
)
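
In the call above, the reply is seeded with an opening quote and generation stops at the next quote. A minimal sketch of how a stop string with an include flag could be applied to generated text follows. `apply_stop` is a hypothetical helper, and the exact semantics inside glai are an assumption here.

```python
from typing import Optional


def apply_stop(text: str, stop_at: Optional[str], include_stop_str: bool = True) -> str:
    """Cut `text` at the first occurrence of `stop_at`, optionally keeping it."""
    if not stop_at:
        return text  # no stop string configured
    index = text.find(stop_at)
    if index == -1:
        return text  # stop string never appeared; keep everything
    end = index + len(stop_at) if include_stop_str else index
    return text[:end]


raw = "hi' and some trailing text"
print(apply_stop(raw, "'", include_stop_str=True))   # hi'
print(apply_stop(raw, "'", include_stop_str=False))  # hi
```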

EasyAI Step By Step Example

Step by step generation with EasyAI:

from glai.ai import EasyAI

easy_ai = EasyAI()
easy_ai.load_model_db('./gguf_db')
easy_ai.find_model_data("zephyr", "q2_k")
easy_ai.load_ai()
easy_ai.generate(
    "Output a list of 3 strings. The first string should be `hi`, the second string should be `there`, and the third string should be `!`.",
    "['",
    "']"
)
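
Prompts like the one above ask the model for a Python list literal, so the completion can be parsed safely with `ast.literal_eval` rather than `eval`. This is a sketch under the assumption that the returned text contains the full literal, including the seeded `['` prefix and the `']` stop string; `parse_string_list` is a hypothetical helper.

```python
import ast
from typing import List, Optional


def parse_string_list(completion: str) -> Optional[List[str]]:
    """Parse a completion as a list of strings; return None if it is not one."""
    try:
        value = ast.literal_eval(completion.strip())
    except (ValueError, SyntaxError):
        return None  # malformed or truncated literal
    if isinstance(value, list) and all(isinstance(item, str) for item in value):
        return value
    return None


print(parse_string_list("['hi', 'there', '!']"))  # ['hi', 'there', '!']
print(parse_string_list("['hi', 'there'"))        # None
```

Small quantized models do not always follow formatting instructions, so returning `None` on failure lets the caller retry instead of crashing.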

EasyAI All In One Example

All in one generation with EasyAI:

from glai.ai import EasyAI

easy_ai = EasyAI()
easy_ai.configure(
    model_db_dir="./gguf_db",
    name_search="zephyr",
    quantization_search="q2_k",
    new_tokens=50,
    max_input_tokens=100
)
easy_ai.generate(
    "Output a python list of 3 unique cat names.", 
    "['", 
    "']"
)

AutoAI from Dict Example

Generate from AutoAI using a config dict:

from glai.ai import AutoAI

conf = {
  "model_db_dir": "./gguf_db",
  "name_search": "zephyr",
  "quantization_search": "q2_k",
  "keyword_search": None,
  "new_tokens": 50,
  "max_input_tokens": 100  
}

AutoAI(**conf).generate(
  "Please output only the provided message as python list.\nMessage:`This string`.",
  "['", 
  "]", 
  True
)
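
Because the configuration is a plain dict, it can also be kept outside the code. The sketch below loads the same keys from JSON text using only the standard library; storing the config as JSON is an assumption for illustration, not something glai requires.

```python
import json

# Same keys as the dict above; JSON null becomes Python None.
config_text = """
{
  "model_db_dir": "./gguf_db",
  "name_search": "zephyr",
  "quantization_search": "q2_k",
  "keyword_search": null,
  "new_tokens": 50,
  "max_input_tokens": 100
}
"""

conf = json.loads(config_text)
print(conf["keyword_search"] is None)  # True
```

The resulting `conf` can then be unpacked with `**conf` exactly as in the example above.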

EasyAI from Dict Example

Generate from EasyAI using a config dict:

from glai.ai import EasyAI

conf = {
  "model_db_dir": "./gguf_db",
  "name_search": "zephyr",
  "quantization_search": "q2_k",
  "keyword_search": None,
  "new_tokens": 50,
  "max_input_tokens": 100
}

EasyAI(**conf).generate(
  "Please output only the provided message as python list.\nMessage:`This string`.",
  "['",
  "']",
  True  
)

EasyAI from URL Example

Get a model from a URL and generate:

from glai.back_end.model_db.db import ModelDB
from glai.ai import EasyAI

mdb = ModelDB('./gguf_db', False)
mdb.show_db_info()

eai = EasyAI()
eai.load_model_db('./gguf_db')
eai.model_data_from_url(
    url="https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/blob/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",
    user_tags=("[INST]", "[/INST]"),
    ai_tags=("", ""),
    description="The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. The Mixtral-8x7B outperforms Llama 2 70B on most benchmarks we tested.",
    keywords=["mixtral", "8x7b", "instruct", "v0.1", "MoE"],
    save=True,
)
eai.load_ai()
eai.generate(
    user_message="Write a short joke that's actually super funny hilarious best joke.",
    ai_response_content_tbc="",
    stop_at=None,
    include_stop_str=True,
)
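
The URL in the example above is a HuggingFace `/blob/` link, which points at the file's web page; the raw file is served from the matching `/resolve/` path. The sketch below converts one form to the other. `to_download_url` is a hypothetical helper, and whether glai performs this conversion internally is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def to_download_url(url: str) -> str:
    """Turn a HuggingFace /blob/ page URL into the matching /resolve/ file URL."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # Expected layout: ["", owner, repo, "blob", revision, *file_path]
    if len(segments) > 3 and segments[3] == "blob":
        segments[3] = "resolve"
    path = "/".join(segments)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


page_url = (
    "https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF"
    "/blob/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf"
)
print(to_download_url(page_url))
```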

Detailed API documentation can be found here: https://laelhalawani.github.io/glai/

Download files

Source Distribution

glai-0.0.17.tar.gz (18.8 kB)

Built Distribution

glai-0.0.17-py3-none-any.whl (26.1 kB, Python 3)
