llama-api-server

This project is under active development. Breaking changes may be made at any time.

Llama as a Service! This project tries to build a RESTful API server compatible with the OpenAI API, using open-source backends like llama.

Get started

Prepare model

llama.cpp

If you don't have a quantized llama model yet, follow the llama.cpp instructions to prepare one.

Install

pip install llama-api-server
cat > config.yml << EOF
models:
  completions:
    text-davinci-003:
      type: llama_cpp
      params:
        path: /absolute/path/to/your/7B/ggml-model-q4_0.bin
  embeddings:
    text-embedding-ada-002:
      type: llama_cpp
      params:
        path: /absolute/path/to/your/7B/ggml-model-q4_0.bin
EOF
python -m llama_api_server
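
If you prefer to generate config.yml from a script instead of a shell heredoc, the following is a minimal sketch. It only reproduces the layout shown above; the helper names render_config and write_config are hypothetical and not part of llama-api-server, and the check for an absolute path is an assumption based on the placeholder path in the example.

```python
from pathlib import Path

# Same layout as the config.yml shown above; only the model path varies.
CONFIG_TEMPLATE = """\
models:
  completions:
    text-davinci-003:
      type: llama_cpp
      params:
        path: {model_path}
  embeddings:
    text-embedding-ada-002:
      type: llama_cpp
      params:
        path: {model_path}
"""


def render_config(model_path: str) -> str:
    """Return the config text for one model file (hypothetical helper)."""
    path = Path(model_path)
    # Assumption: the example uses an absolute path, so reject relative ones.
    if not path.is_absolute():
        raise ValueError(f"model path must be absolute: {model_path}")
    return CONFIG_TEMPLATE.format(model_path=path)


def write_config(model_path: str, config_file: str = "config.yml") -> Path:
    """Write the rendered config to config_file and return its path."""
    target = Path(config_file)
    target.write_text(render_config(model_path), encoding="utf-8")
    return target
```

Calling write_config("/absolute/path/to/your/7B/ggml-model-q4_0.bin") in the directory where you start the server should produce the same file as the heredoc.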

Call with openai-python

export OPENAI_API_BASE=http://127.0.0.1:5000/v1
openai api completions.create -e text-davinci-003 -p "hello?"
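
The same request can be sketched with only the Python standard library, which may help when debugging without the openai package. This is an illustration under assumptions, not the project's documented interface: it assumes the server accepts the OpenAI-style POST route /completions under the base URL with the model name in the JSON body. The function names build_completion_request and create_completion are hypothetical.

```python
import json
import os
import urllib.request

DEFAULT_BASE = "http://127.0.0.1:5000/v1"  # base URL from the example above


def build_completion_request(prompt, model="text-davinci-003", base_url=None, **params):
    """Build (but do not send) a completion request.

    Assumption: the server exposes an OpenAI-style `/completions` route
    that takes `model` and `prompt` in a JSON body.
    """
    base = (base_url or os.environ.get("OPENAI_API_BASE", DEFAULT_BASE)).rstrip("/")
    body = {"model": model, "prompt": prompt, **params}
    return urllib.request.Request(
        base + "/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_completion(prompt, **kwargs):
    """Send the request and return the decoded JSON response."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, create_completion("hello?", max_tokens=16) would send the request; extra keyword arguments such as max_tokens are passed through in the body unchanged.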

Roadmap

Tested with

  • openai-python
    • OPENAI_API_TYPE=default
    • OPENAI_API_TYPE=azure

Supported APIs

  • Completions
    • set temperature, top_p, and top_k
    • set max_tokens
    • set stop
    • set stream
    • set n
    • set presence_penalty and frequency_penalty
    • set logit_bias
  • Embeddings
    • batch process
  • Chat

Supported backends

Others

  • Documents
  • Token auth
  • Integration tests
  • Performance parameters like n_batch and n_thread
  • A tool to download and prepare pretrained models
