Easy deployment of quantized Llama models on CPU
Project description
glai (GGUF LLAMA AI) is a package for simplified text generation with Llama models quantized to GGUF format.
It provides high-level APIs for loading models and generating text completions.
High Level API Classes:
AutoAI:
- Automatically searches for and loads a model based on name/quantization/keyword.
- Handles downloading model data, loading it to memory, and configuring message formatting.
- Use generate() method to get completions by providing a user message.
EasyAI:
- Allows manually configuring model data source - from file, URL, or ModelDB search.
- Handles downloading model data, loading it to memory, and configuring message formatting.
- Use generate() method to get completions by providing a user message.
ModelDB (used by AutoAI and EasyAI):
- Manages a database of model data files via ModelData objects.
- Useful for searching for models and retrieving model metadata.
- Can import models from HuggingFace repo URLs, or import and download models from .gguf URLs on HuggingFace.
ModelData (used by ModelDB):
- Represents metadata and info about a specific model.
- Used by ModelDB to track and load models.
- Can be initialized from URL, file, or ModelDB search.
- Used by ModelDB to download the model's .gguf file.
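To make the idea of importing a model from a .gguf URL more concrete, here is a minimal sketch of how such a URL could be split into a repo name, a file name, and a quantization label. The function `parse_gguf_url` and its return keys are hypothetical and are not part of glai; the regular expression assumes the common `<name>.<QUANT>.gguf` file naming convention.

```python
import re
from urllib.parse import urlparse


def parse_gguf_url(url: str) -> dict:
    """Split a HuggingFace .gguf file URL into repo, file name, and quantization.

    Hypothetical helper for illustration only; glai may parse URLs differently.
    """
    # Expected path shape: /<owner>/<repo>/<blob|resolve>/<revision>/<file>.gguf
    path_parts = urlparse(url).path.strip("/").split("/")
    if len(path_parts) < 5 or not path_parts[-1].lower().endswith(".gguf"):
        raise ValueError(f"Not a HuggingFace .gguf file URL: {url}")

    owner, repo = path_parts[0], path_parts[1]
    file_name = path_parts[-1]

    # Assumption: the quantization label (e.g. Q4_K_M, Q2_K) sits right before ".gguf".
    match = re.search(r"\.(Q\d+_[A-Z0-9_]+|F16|F32)\.gguf$", file_name, re.IGNORECASE)
    return {
        "repo": f"{owner}/{repo}",
        "file_name": file_name,
        "quantization": match.group(1).lower() if match else None,
    }


info = parse_gguf_url(
    "https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF"
    "/blob/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf"
)
print(info["repo"], info["quantization"])
```

A file name that does not follow this convention yields `quantization=None` rather than an error, so the caller can decide how to handle it.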
Usage:
Import package
from glai import AutoAI, EasyAI, ModelDB, ModelData
AutoAI - automatic model loading
ai = AutoAI(name_search="Mistral")
ai.generate("Hello")
EasyAI - manual model configuration
easy = EasyAI()
easy.load_model_db()
easy.find_model_data(name_search="Mistral")
easy.load_ai()
easy.generate("Hello")
ModelDB - search models and show db info
from glai import ModelDB
db = ModelDB()
model = db.find_model(name_search="Mistral")
print(model.name)
db.show_db_info()
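The search by name, quantization, and keyword can be pictured as a filter over stored records. The sketch below is only an illustration of that idea: `ModelRecord` and this standalone `find_model` function are hypothetical stand-ins, not glai classes, and the matching rules (case-insensitive substring match on the name, exact match on quantization and keywords) are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelRecord:
    """Hypothetical stand-in for one entry in a model database."""
    name: str
    quantization: str
    keywords: List[str] = field(default_factory=list)


def find_model(
    records: List[ModelRecord],
    name_search: Optional[str] = None,
    quantization_search: Optional[str] = None,
    keyword_search: Optional[str] = None,
) -> Optional[ModelRecord]:
    """Return the first record that satisfies every filter that was provided."""
    for record in records:
        if name_search and name_search.lower() not in record.name.lower():
            continue
        if quantization_search and quantization_search.lower() != record.quantization.lower():
            continue
        if keyword_search and keyword_search.lower() not in [k.lower() for k in record.keywords]:
            continue
        return record
    return None


records = [
    ModelRecord("zephyr-7b-beta", "q2_k", ["zephyr", "7b"]),
    ModelRecord("mistral-7b-instruct", "q4_k_m", ["mistral", "instruct"]),
]
found = find_model(records, name_search="Mistral")
print(found.name if found else "no match")
```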
GGUF Examples Tutorial
This tutorial provides examples for using the glai package.
Import Models from Repo
Import models from a HuggingFace repo into the model database:
from glai.back_end.model_db.db import ModelDB
mdb = ModelDB('./gguf_db', False)
mdb.import_models_from_repo(
    hf_repo_url="https://huggingface.co/TheBloke/SOLAR-10.7B-Instruct-v1.0-GGUF",
    user_tags=["[INST]", "[/INST]"],
    ai_tags=["", ""],
    description="We introduce SOLAR-10.7B, an advanced large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. It's compact, yet remarkably powerful, and demonstrates unparalleled state-of-the-art performance in models with parameters under 30B.",
    keywords=["10.7B", "upstage", "instruct", "solar"],
    replace_existing=False,
)
mdb.show_db_info()
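The `user_tags` and `ai_tags` pairs stored with a model describe how messages are wrapped before they reach the model. As a rough sketch of that idea, the hypothetical `format_prompt` function below wraps a user message in its tags and leaves the AI turn open for the model to continue. It is not glai's formatter, and the exact whitespace between tags and text is an assumption.

```python
from typing import Sequence


def format_prompt(
    user_message: str,
    user_tags: Sequence[str],
    ai_tags: Sequence[str],
    ai_start: str = "",
) -> str:
    """Wrap a user message in its tags and open the AI turn.

    Hypothetical illustration; the spacing used here is an assumption.
    """
    user_open, user_close = user_tags
    ai_open, _ai_close = ai_tags  # the closing AI tag is not written: the turn stays open
    return f"{user_open} {user_message} {user_close}{ai_open}{ai_start}"


prompt = format_prompt("Hello", ["[INST]", "[/INST]"], ["", ""])
print(prompt)
```

Passing a non-empty `ai_start` places text at the beginning of the AI turn, which is one way a response can be steered toward a fixed opening.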
AutoAI Quick Example
Quickly generate using AutoAI:
from glai.ai import AutoAI
auto_ai = AutoAI("zephyr", "q2_k", new_tokens=50, max_input_tokens=100)
auto_ai.generate(
    user_message_text="Output just 'hi' in single quotes with no other prose. Do not include any additional information nor comments.",
    ai_message_to_be_continued="'",
    stop_at="'",
    include_stop_str=True
)
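The `stop_at` and `include_stop_str` arguments suggest that generated text is cut at the first occurrence of a stop string. The sketch below shows one plausible reading of those semantics. `truncate_at_stop` is a hypothetical helper, not a glai function, and the sample `raw` text is made up for illustration.

```python
from typing import Optional


def truncate_at_stop(generated: str, stop_at: Optional[str], include_stop_str: bool = True) -> str:
    """Cut generated text at the first occurrence of stop_at.

    Hypothetical illustration of stop-string handling; glai's behavior may differ.
    """
    if not stop_at:
        return generated  # no stop string configured, keep everything
    index = generated.find(stop_at)
    if index == -1:
        return generated  # stop string never appeared
    end = index + len(stop_at) if include_stop_str else index
    return generated[:end]


# Made-up continuation produced after the AI message was started with "'".
raw = "hi' and some trailing text"
print(truncate_at_stop(raw, "'", include_stop_str=True))
print(truncate_at_stop(raw, "'", include_stop_str=False))
```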
EasyAI Step By Step Example
Step by step generation with EasyAI:
from glai.ai import EasyAI
easy_ai = EasyAI()
easy_ai.load_model_db('./gguf_db')
easy_ai.find_model_data("zephyr", "q2_k")
easy_ai.load_ai()
easy_ai.generate(
    "Output a list of 3 strings. The first string should be `hi`, the second string should be `there`, and the third string should be `!`.",
    "['",  # start of the AI response, which the model continues
    "']"  # stop string
)
EasyAI All In One Example
All in one generation with EasyAI:
from glai.ai import EasyAI
easy_ai = EasyAI()
easy_ai.configure(
    model_db_dir="./gguf_db",
    name_search="zephyr",
    quantization_search="q2_k",
    new_tokens=50,
    max_input_tokens=100
)
easy_ai.generate(
    "Output a python list of 3 unique cat names.",
    "['",
    "']"
)
AutoAI from Dict Example
Generate from AutoAI using a config dict:
from glai.ai import AutoAI
conf = {
    "model_db_dir": "./gguf_db",
    "name_search": "zephyr",
    "quantization_search": "q2_k",
    "keyword_search": None,
    "new_tokens": 50,
    "max_input_tokens": 100
}
AutoAI(**conf).generate(
    "Please output only the provided message as python list.\nMessage:`This string`.",
    "['",
    "]",
    True
)
EasyAI from Dict Example
Generate from EasyAI using a config dict:
from glai.ai import EasyAI
conf = {
    "model_db_dir": "./gguf_db",
    "name_search": "zephyr",
    "quantization_search": "q2_k",
    "keyword_search": None,
    "new_tokens": 50,
    "max_input_tokens": 100
}
EasyAI(**conf).generate(
    "Please output only the provided message as python list.\nMessage:`This string`.",
    "['",
    "']",
    True
)
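Because a config dict is unpacked with `**conf`, a misspelled key only surfaces when the constructor is called. A small check before unpacking can catch that earlier. The sketch below is an optional convenience, not part of glai: `check_config` is a hypothetical helper, and `ASSUMED_KEYS` contains only the keys that appear in the examples above, so the real constructors may accept more.

```python
# Assumption: only the keys used in the examples above are listed here.
ASSUMED_KEYS = {
    "model_db_dir",
    "name_search",
    "quantization_search",
    "keyword_search",
    "new_tokens",
    "max_input_tokens",
}


def check_config(conf: dict) -> dict:
    """Raise early on unknown keys or non-positive token limits, then return conf."""
    unknown = sorted(set(conf) - ASSUMED_KEYS)
    if unknown:
        raise KeyError(f"Unexpected config keys: {unknown}")
    for key in ("new_tokens", "max_input_tokens"):
        if key in conf:
            value = conf[key]
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or value <= 0:
                raise ValueError(f"{key} must be a positive integer, got {value!r}")
    return conf


conf = {
    "model_db_dir": "./gguf_db",
    "name_search": "zephyr",
    "quantization_search": "q2_k",
    "keyword_search": None,
    "new_tokens": 50,
    "max_input_tokens": 100,
}
checked = check_config(conf)  # could then be passed on as EasyAI(**checked)
```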
EasyAI from URL Example
Get a model from a URL and generate:
from glai.back_end.model_db.db import ModelDB
from glai.ai import EasyAI
mdb = ModelDB('./gguf_db', False)
mdb.show_db_info()
eai = EasyAI()
eai.load_model_db('./gguf_db')
eai.model_data_from_url(
    url="https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/blob/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",
    user_tags=("[INST]", "[/INST]"),
    ai_tags=("", ""),
    description="The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. The Mixtral-8x7B outperforms Llama 2 70B on most benchmarks we tested.",
    keywords=["mixtral", "8x7b", "instruct", "v0.1", "MoE"],
    save=True,
)
eai.load_ai()
eai.generate(
    user_message="Write a short joke that's actually super funny hilarious best joke.",
    ai_response_content_tbc="",
    stop_at=None,
    include_stop_str=True,
)
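The URL in this example uses the `/blob/` path, which on HuggingFace points at the file viewer page, while the raw file is served from the matching `/resolve/` path. Whether glai performs this conversion internally is not stated here. The hypothetical helper below only sketches how the direct download URL can be derived if you want to fetch the file yourself.

```python
from urllib.parse import urlparse, urlunparse


def to_download_url(url: str) -> str:
    """Turn a HuggingFace /blob/ file URL into its /resolve/ download URL.

    Hypothetical helper for illustration; not part of glai.
    """
    parsed = urlparse(url)
    # Path shape: /<owner>/<repo>/blob/<revision>/<file>
    parts = parsed.path.split("/")
    if len(parts) > 3 and parts[3] == "blob":
        parts[3] = "resolve"
    return urlunparse(parsed._replace(path="/".join(parts)))


page_url = (
    "https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF"
    "/blob/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf"
)
download_url = to_download_url(page_url)
print(download_url)
```

Only the path segment right after the repo name is rewritten, so a repo or file whose name happens to contain "blob" is left untouched.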
File details
Details for the file glai-0.0.11.tar.gz.
File metadata
- Download URL: glai-0.0.11.tar.gz
- Upload date:
- Size: 24.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.18
File hashes
Algorithm | Hash digest
---|---
SHA256 | b9c714f5e9ecc03502394436a4c8780c36759ebff20106df8c0c73879625a130
MD5 | 3e6368806c74671fac910e19e4129bca
BLAKE2b-256 | 8175bd8caf2c4fb4542aaeed64438fff048addaf049147b0b260fee480cbea49
File details
Details for the file glai-0.0.11-py3-none-any.whl.
File metadata
- Download URL: glai-0.0.11-py3-none-any.whl
- Upload date:
- Size: 33.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.18
File hashes
Algorithm | Hash digest
---|---
SHA256 | bc9936c5daa8bc37a068b2be2cf8bac9ff42f30dbd01deef8ffb47c927ea88f4
MD5 | 83cce7daebe1492ce4613dba2148204f
BLAKE2b-256 | f863957c02e2f59e18be939113fb8200eec6f191054d63fcb45533cd48e95a46
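The SHA256 digests listed for each file can be used to check a manually downloaded archive. A minimal sketch using only the Python standard library follows. The helper names `sha256_of_file` and `verify_sha256` are illustrative, and the commented-out path is a placeholder for wherever the file was saved.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call; the local path is a placeholder:
# verify_sha256(
#     "glai-0.0.11.tar.gz",
#     "b9c714f5e9ecc03502394436a4c8780c36759ebff20106df8c0c73879625a130",
# )
```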