
WAFL-llm

A hybrid chatbot - LLM side.

WAFL is built to run as a two-part system; both parts can be installed on the same machine. This package is the LLM side of the WAFL project.

The two parts of WAFL

LLM side (needs a GPU)

This is a model server for the speech-to-text model, the LLM, the embedding system, and the text-to-speech model.

Installation

To quickly run the LLM side, use the following commands:

pip install wafl-llm
wafl-llm start

This will download the default models and start the server on port 8080.
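Once the server is started, you can verify that it is accepting connections on port 8080. The following is a minimal sketch that only checks the TCP socket; it assumes the default port and a local installation, and makes no assumptions about the HTTP endpoints wafl-llm exposes:

```python
import socket


def is_server_listening(host: str = "localhost", port: int = 8080,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("wafl-llm server reachable:", is_server_listening())
```

This is useful in startup scripts that need to wait for the model server before launching the other half of the system.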

The installation requires MPI and Java to be present on the system. On Debian/Ubuntu, you can install both with the following commands:

sudo apt install libopenmpi-dev
sudo apt install default-jdk

Configuration

A use-case-specific configuration can be set by creating a config.json file in the directory where wafl-llm start is executed. The file should look like this (the default configuration):

{
  "llm_model": "fractalego/wafl-phi3-mini-4k",
  "speaker_model": "facebook/fastspeech2-en-ljspeech",
  "whisper_model": "fractalego/personal-whisper-distilled-model",
  "sentence_embedder_models": "TaylorAI/gte-tiny",
  "device": "cuda",
  "quantization": false
}

The models are downloaded from the HuggingFace repository; any other compatible model should work. The device can be set to cuda or cpu. Quantization can be set to true or false and uses the vLLM fp8 option, which only works if your GPU supports it. Each model can be set to null to deactivate it and save memory. This is useful if, for example, you only want to use the LLM.
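As an illustration, the sketch below writes a CPU-only config.json that deactivates the speech models and keeps only the LLM and the embedder. The field names are taken from the default configuration above; which models to disable is just an example choice:

```python
import json

# Start from the default configuration shown above and override fields.
config = {
    "llm_model": "fractalego/wafl-phi3-mini-4k",
    "speaker_model": None,      # null: deactivate text-to-speech to save memory
    "whisper_model": None,      # null: deactivate speech-to-text
    "sentence_embedder_models": "TaylorAI/gte-tiny",
    "device": "cpu",            # run without a GPU
    "quantization": False,      # fp8 quantization needs GPU support anyway
}

# Write config.json in the directory where `wafl-llm start` will be run;
# json.dump serializes Python None as JSON null.
with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```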

Docker

Alternatively, build and run the server as a container:

docker/build.sh
docker run -p8080:8080 --env NVIDIA_DISABLE_REQUIRE=1 --gpus all wafl-llm

