AI Handler: An engine which wraps certain huggingface models
AI Handler
This is a simple framework for running AI models. It builds on the Hugging Face libraries and gives you a queue, threading, a simple API, and the ability to run Stable Diffusion and LLMs on your local hardware.
This is not intended to be used as a standalone application.
It can easily be extended and used to power interfaces or it can be run from the command line.
AI Handler is a work in progress. It powers two projects at the moment, but may not be ready for general use.
Installation
This is a work in progress.
Pre-requisites
System requirements
- Windows 10+
- Python 3.10.8
- pip 23.0.1
- CUDA toolkit 11.7
- cuDNN 8.6.0.163
- CUDA-capable GPU
- 16 GB+ RAM
Install
pip install torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 --index-url https://download.pytorch.org/whl/cu117
pip install https://github.com/w4ffl35/diffusers/archive/refs/tags/v0.14.0.ckpt_fix.tar.gz
pip install https://github.com/w4ffl35/transformers/archive/refs/tags/tensor_fix-v1.0.2.tar.gz
pip install https://github.com/acpopescu/bitsandbytes/releases/download/v0.37.2-win.0/bitsandbytes-0.37.2-py3-none-any.whl
pip install aihandlerwindows
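After running the commands above, it can help to confirm that the pinned PyTorch packages ended up in the active environment. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `check_pins` are hypothetical and not part of AI Handler. It only checks the three PyTorch pins, because the forked diffusers and transformers archives may report version strings that differ from their tag names.

```python
from importlib import metadata

# Pins copied from the first install command above.
PINNED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "torchaudio": "0.13.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Map each package to 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for package, wanted in pins.items():
        found = lookup(package)
        if found is None:
            report[package] = "missing"
        elif found.split("+")[0] == wanted:
            # Wheels from the cu117 index carry a local suffix such as "+cu117",
            # so only the part before "+" is compared.
            report[package] = "ok"
        else:
            report[package] = f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    for package, status in check_pins(PINNED).items():
        print(f"{package}: {status}")
```

A "mismatch" here is a hint to re-run the install step, not proof that something is broken.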
Optional
These are optional instructions for installing TensorRT and DeepSpeed on Windows.
Install TensorRT:
- Download TensorRT-8.4.3.1.Windows10.x86_64.cuda-11.6.cudnn8.4
- Git clone TensorRT 8.4.3.1
- Follow their instructions to build TensorRT-8.4.3.1 python wheel
- Install TensorRT
pip install tensorrt-*.whl
Install DeepSpeed:
- Git clone DeepSpeed 0.8.1
- Follow their instructions to build the DeepSpeed python wheel
- Install DeepSpeed
pip install deepspeed-*.whl
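Because both packages are optional, calling code may want to detect whether they are present before trying to use them. This sketch assumes the wheels install under the import names `tensorrt` and `deepspeed`; the function `optional_backends` is a hypothetical helper, not something AI Handler provides.

```python
from importlib.util import find_spec


def optional_backends(names=("tensorrt", "deepspeed")):
    """Report which optional modules are importable, without importing them."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in optional_backends().items():
        print(f"{name}: {'available' if available else 'not installed'}")
```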
Environment variables
- AIRUNNER_ENVIRONMENT: dev or prod. Defaults to dev. This controls the LOG_LEVEL.
- LOG_LEVEL: FATAL for production, DEBUG for development. Override this to force a log level.
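The relationship between the two variables can be sketched as follows. This is an illustration of the behaviour described above, not the code AI Handler uses: the function name `resolve_log_level` is hypothetical, and falling back to DEBUG for an unrecognised environment or level name is an assumption.

```python
import logging
import os

# Defaults taken from the description: DEBUG for dev, FATAL for prod.
DEFAULT_LEVELS = {"dev": "DEBUG", "prod": "FATAL"}

# Names accepted for LOG_LEVEL, mapped to the standard logging constants.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "FATAL": logging.FATAL,
}


def resolve_log_level(env=None):
    """Pick a logging level from AIRUNNER_ENVIRONMENT, unless LOG_LEVEL overrides it."""
    env = os.environ if env is None else env
    environment = env.get("AIRUNNER_ENVIRONMENT", "dev")
    # An explicit LOG_LEVEL wins over the environment default.
    name = env.get("LOG_LEVEL") or DEFAULT_LEVELS.get(environment, "DEBUG")
    # Assumption: unknown level names fall back to DEBUG.
    return LEVELS.get(name.upper(), logging.DEBUG)


if __name__ == "__main__":
    logging.basicConfig(level=resolve_log_level())
```

For example, setting AIRUNNER_ENVIRONMENT to prod and LOG_LEVEL to INFO would yield INFO in this sketch, since the override takes precedence.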
Huggingface variables
Offline mode
These environment variables keep you offline until you need to download a model. This prevents unwanted online access and speeds up usage of huggingface libraries.
- DISABLE_TELEMETRY: Keep this set to 1 at all times. Huggingface collects minimal telemetry when downloading a model from their repository, but this will keep it disabled. See more info in this GitHub thread.
- HF_HUB_OFFLINE: When loading a diffusers model, huggingface libraries will attempt to download an updated cache before running the model. This prevents that check from happening (along with a boolean passed to load_pretrained; see the runner.py file for examples).
- TRANSFORMERS_OFFLINE: Similar to HF_HUB_OFFLINE, but for transformers models.
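One way to apply these settings from Python is to set them before the huggingface libraries are imported, since such variables are typically read when a library is first loaded. The sketch below is an assumption about usage, not code from runner.py; `apply_offline_defaults` is a hypothetical helper, and it leaves any value you have already set untouched so that you can still go online when a model needs downloading.

```python
import os

# Values follow the descriptions above: 1 enables each setting.
OFFLINE_DEFAULTS = {
    "DISABLE_TELEMETRY": "1",
    "HF_HUB_OFFLINE": "1",
    "TRANSFORMERS_OFFLINE": "1",
}


def apply_offline_defaults(env=None):
    """Set the offline variables unless a value has already been chosen."""
    env = os.environ if env is None else env
    for name, value in OFFLINE_DEFAULTS.items():
        env.setdefault(name, value)
    return env


# Run this before importing diffusers or transformers.
apply_offline_defaults()
```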