Python inference engine and graph definition language for GGUF-based models.
ggraph: Python Inference for GGML Models
ggraph provides a Python interface and a Graph Definition Language for running inference on GGML-based models. It lets you run and experiment with GGML models (such as Llama and Qwen) directly from Python, and it makes models easy to distribute via a KV entry in the model's GGUF file.
Features
- Custom Graph Definition Language for defining GGML Models
- Load and run GGUF models from Python
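Since distribution relies on KV entries stored in the GGUF file, it can help to see where the KV count lives in the file. The sketch below parses only the fixed-size GGUF header (magic, version, tensor count, metadata KV count) as laid out in GGUF versions 2 and 3. It uses only the standard library and is not part of ggraph; `read_gguf_header` and `read_gguf_header_from_file` are hypothetical helper names.

```python
import struct

GGUF_MAGIC = b"GGUF"
# Little-endian: 4-byte magic, uint32 version, uint64 tensor count, uint64 KV count.
HEADER_FORMAT = "<4sIQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header from the first bytes of a file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("buffer too small for a GGUF header")
    magic, version, tensor_count, kv_count = struct.unpack_from(HEADER_FORMAT, data, 0)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    if version < 2:
        # Version 1 used 32-bit counts, which this sketch does not handle.
        raise ValueError(f"unsupported GGUF version: {version}")
    return {"version": version, "tensor_count": tensor_count, "kv_count": kv_count}


def read_gguf_header_from_file(path: str) -> dict:
    """Read just enough bytes from `path` to parse the header."""
    with open(path, "rb") as f:
        return read_gguf_header(f.read(HEADER_SIZE))
```

Calling `read_gguf_header_from_file("/path/to/model.gguf")` returns the number of metadata KV entries without loading any tensor data. Decoding the entries themselves requires walking the typed key/value records that follow the header, which this sketch leaves out.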
Installation
Install the package from PyPI:
pip install ggraph
You may also need to install system dependencies for the GGML backend (see llama.cpp for details).
Usage Example
Below is a minimal example of running inference with a GGML model and a Hugging Face tokenizer:
from ggraph.inference_engine import GGMLInferenceEngine

gguf_path = "/path/to/model.gguf"  # Path to your GGUF model file
n_ctx = 256  # Context window size, in tokens

inference_engine = GGMLInferenceEngine(gguf_path, n_ctx=n_ctx, n_threads=12)
result = inference_engine.generate(input_conversation=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of England?"},
])
print(result.generated_text)
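The `input_conversation` argument above is a list of role/content dictionaries. As a minimal sketch, a small helper can assemble that list and catch empty messages before they reach the engine. `build_conversation` is a hypothetical name and not part of ggraph, and it only produces the two roles shown in the example (`system` and `user`).

```python
from typing import Dict, List, Optional


def build_conversation(user: str, system: Optional[str] = None) -> List[Dict[str, str]]:
    """Build a conversation list in the role/content shape used in the example."""
    if not user.strip():
        raise ValueError("user message must not be empty")
    messages: List[Dict[str, str]] = []
    if system is not None:
        if not system.strip():
            raise ValueError("system message must not be empty when provided")
        # The system prompt, when present, goes first.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": user})
    return messages
```

The result can be passed as `input_conversation=build_conversation("What is the capital of England?", system="You are a helpful assistant.")`.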
CLI
You can also run supported GGUF models directly from the command line with the ggraph-cli command. The usage text and an example invocation are shown below:
usage: ggraph-cli [-h] --model MODEL [--debug] [--backend {cpu,cuda,rocm}] [--n_ctx N_CTX] [--n_threads N_THREADS]
[--n_predict N_PREDICT] [--stream] [--temperature TEMPERATURE] [--top_k TOP_K] [--top_p TOP_P]
[--conversation-system CONVERSATION_SYSTEM] [--conversation-user CONVERSATION_USER] [--interactive]
ggraph-cli -m "./models/Qwen2.5-0.5B-Instruct-Q6_K.gguf" --n_ctx 1024 --n_predict 128 --interactive --stream
NOTE: Not all CLI flags are functional yet. The ones that are not are reserved for future use.
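When driving ggraph-cli from a script, it can be convenient to assemble the argument list programmatically. The sketch below builds an argv list using only flags that appear in the usage text above. `build_cli_command` is a hypothetical helper, not part of ggraph, and it does not run anything itself. Because some flags may not be functional yet, treat it as illustrative.

```python
import shlex
from typing import List, Optional


def build_cli_command(
    model: str,
    n_ctx: Optional[int] = None,
    n_predict: Optional[int] = None,
    interactive: bool = False,
    stream: bool = False,
) -> List[str]:
    """Assemble an argv list for ggraph-cli from a few documented flags."""
    cmd = ["ggraph-cli", "--model", model]
    if n_ctx is not None:
        cmd += ["--n_ctx", str(n_ctx)]
    if n_predict is not None:
        cmd += ["--n_predict", str(n_predict)]
    if interactive:
        cmd.append("--interactive")
    if stream:
        cmd.append("--stream")
    return cmd


if __name__ == "__main__":
    cmd = build_cli_command(
        "./models/Qwen2.5-0.5B-Instruct-Q6_K.gguf",
        n_ctx=1024,
        n_predict=128,
        interactive=True,
        stream=True,
    )
    # Print a shell-safe rendering of the command for inspection.
    print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell quoting problems with model paths that contain spaces.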
GGML Python Bindings
ggraph uses custom bindings generated directly from the GGML/llama.cpp source code. They are produced by a modified fork of ctypeslib, which uses clang to generate the bindings and has been further modified to generate the wrapper for this project. The modified clang2py can be found here: https://github.com/acon96/ctypeslib-ggml
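For readers unfamiliar with ctypes-based bindings, the sketch below shows the general shape of such code: a `ctypes.Structure` that mirrors a C struct, and a function-pointer prototype. Everything here is a made-up illustration. `ExampleTensorInfo`, `ELEMENT_COUNT_FN`, and `count_elements` are hypothetical names; they are not the real GGML struct layout and not the actual output of the modified clang2py.

```python
import ctypes


class ExampleTensorInfo(ctypes.Structure):
    """Hypothetical struct mirror; NOT the real ggml_tensor layout."""
    _fields_ = [
        ("n_dims", ctypes.c_int32),   # number of dimensions in use
        ("ne", ctypes.c_int64 * 4),   # elements per dimension
    ]


# Hypothetical C prototype: int64_t (*fn)(struct ExampleTensorInfo *)
ELEMENT_COUNT_FN = ctypes.CFUNCTYPE(ctypes.c_int64, ctypes.POINTER(ExampleTensorInfo))


def count_elements(info: ExampleTensorInfo) -> int:
    """Multiply the sizes of the dimensions that are in use."""
    total = 1
    for i in range(info.n_dims):
        total *= info.ne[i]
    return total


if __name__ == "__main__":
    info = ExampleTensorInfo(2, (ctypes.c_int64 * 4)(3, 5, 1, 1))
    # Wrap a Python callable in the prototype, as a stand-in for a C function.
    fn = ELEMENT_COUNT_FN(lambda ptr: count_elements(ptr.contents))
    print(fn(ctypes.pointer(info)))
```

In generated bindings, prototypes like this are typically attached to symbols loaded from a shared library rather than to Python callables. The Python callable here only keeps the sketch runnable without a compiled GGML library.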
Project Structure
- ggraph/ - Core Python package
  - inference_engine.py - Main inference logic
  - sharded_inference_engine.py - Sharded inference support
  - models/ - Model graph definitions and utilities
  - wrapper/ - Low-level bindings to GGML C/C++ libraries
- scripts/ - Helper scripts for configuration and binding generation
License
MIT License. See LICENSE for details.
Source Distribution
File details
Details for the file ggraph-0.0.1.tar.gz.
File metadata
- Download URL: ggraph-0.0.1.tar.gz
- Upload date:
- Size: 70.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a8f361c0a91bbeac226034ab26af77fa7ee0d0b374717a40ec7697010dcdf2bd |
| MD5 | 535f47dd55f3a16ffe0549e00a4877bc |
| BLAKE2b-256 | 2a138c0c4726ea5c626fe2a93db50b8918698daede6afe946ad8c29ed42f719a |
Provenance
The following attestation bundles were made for ggraph-0.0.1.tar.gz:
Publisher: publish-release.yml on acon96/ggraph.py
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ggraph-0.0.1.tar.gz
- Subject digest: a8f361c0a91bbeac226034ab26af77fa7ee0d0b374717a40ec7697010dcdf2bd
- Sigstore transparency entry: 232355564
- Sigstore integration time:
- Permalink: acon96/ggraph.py@71aef51e87f980775b0561c8972385fdc442d1ae
- Branch / Tag: refs/tags/v0.0.1
- Owner: https://github.com/acon96
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-release.yml@71aef51e87f980775b0561c8972385fdc442d1ae
- Trigger Event: push
-
Statement type: