Wrapper for simplified use of Llama2 GGUF quantized models.
gguf_llama
Provides a LlamaAI class with a simple Python interface for generating text using Llama models.
Features
- Load Llama models and tokenizers automatically from a GGUF file
- Generate text completions for prompts
- Automatically adjust model size to fit longer prompts, up to a specified limit
- Convenient methods for tokenizing and untokenizing text
- Fix text formatting issues before generating
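To illustrate the last two features, here is a minimal sketch of what prompt clamping and pre-generation text cleanup might look like. This is not the library's actual implementation; the function names and cleanup rules are assumptions for demonstration only:

```python
import re

def clean_text(text: str) -> str:
    """Illustrative cleanup: collapse repeated whitespace and remove
    stray spaces before punctuation. The library's real rules may differ."""
    text = re.sub(r"\s+", " ", text).strip()
    text = re.sub(r"\s+([.,!?;:])", r"\1", text)
    return text

def clamp_tokens(tokens: list, max_input_tokens: int) -> list:
    """Hypothetical policy: keep only the most recent max_input_tokens
    tokens when a prompt exceeds the configured input limit."""
    if len(tokens) > max_input_tokens:
        return tokens[-max_input_tokens:]
    return tokens
```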
Usage
Create a LlamaAI instance, pointing it at a GGUF model file:
from llama_ai import LlamaAI
ai = LlamaAI("my_model.gguf", max_tokens=500, max_input_tokens=100)
Generate text by calling infer():
text = ai.infer("Once upon a time")
print(text)
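The feature list also mentions convenience methods for tokenizing and untokenizing text. The exact method names are not documented on this page, so the snippet below sketches the assumed round-trip pattern using a stub class in place of a loaded GGUF model:

```python
class StubLlamaAI:
    """Stand-in for LlamaAI so the round-trip pattern can be shown
    without loading a GGUF file. The method names tokenize/untokenize
    are assumptions based on the feature list, not a confirmed API."""

    def tokenize(self, text: str) -> list:
        # A real tokenizer returns model token IDs; we split on spaces.
        return text.split()

    def untokenize(self, tokens: list) -> str:
        # A real detokenizer reconstructs text from token IDs.
        return " ".join(tokens)

ai = StubLlamaAI()
tokens = ai.tokenize("Once upon a time")
text = ai.untokenize(tokens)
```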
Installation
pip install gguf_llama
Documentation
See the API documentation for full details on classes and methods.
Contributing
Contributions are welcome! Open an issue or PR to improve gguf_llama.