Skip to main content

Lemonade SDK: Your LLM Aide for Validation and Deployment

Project description

Lemonade tests OS - Windows | Linux Made with Python

🍋 Lemonade SDK: Quickly serve, benchmark and deploy LLMs

The Lemonade SDK makes it easy to run Large Language Models (LLMs) on your PC. Our focus is using the best tools, such as neural processing units (NPUs) and Vulkan GPU acceleration, to maximize LLM speed and responsiveness.

Lemonade Demo

Features

The Lemonade SDK is comprised of the following:

  • 🌐 Lemonade Server: A local LLM server for running ONNX and GGUF models using the OpenAI API standard. Install and enable your applications with NPU and GPU acceleration in minutes.
  • 🐍 Lemonade API: High-level Python API to directly integrate Lemonade LLMs into Python applications.
  • 🖥️ Lemonade CLI: The lemonade CLI lets you mix-and-match LLMs (ONNX, GGUF, SafeTensors) with measurement tools to characterize your models on your hardware. The available tools are:
    • Prompting with templates.
    • Measuring accuracy with a variety of tests.
    • Benchmarking to get the time-to-first-token and tokens per second.
    • Profiling the memory utilization.

Click here to get started with Lemonade.

Supported Configurations

Maximum LLM performance requires the right hardware accelerator with the right inference engine for your scenario. Lemonade supports the following configurations, while also making it easy to switch between them at runtime.

Hardware 🛠️ Engine Support 🖥️ OS (x86/x64)
OGA llamacpp HF Windows Linux
🧠 CPU All platforms All platforms All platforms
🎮 GPU Vulkan: All platforms
Focus:
Ryzen™ AI 7000/8000/300
Radeon™ 7000/9000
🤖 NPU AMD Ryzen™ AI 300 series

Inference Engines Overview

Engine Description
OnnxRuntime GenAI (OGA) Microsoft engine that runs .onnx models and enables hardware vendors to provide their own execution providers (EPs) to support specialized hardware, such as neural processing units (NPUs).
llamacpp Community-driven engine with strong GPU acceleration, support for thousands of .gguf models, and advanced features such as vision-language models (VLMs) and mixture-of-experts (MoEs).
Hugging Face (HF) Hugging Face's transformers library can run the original .safetensors trained weights for models on Meta's PyTorch engine, which provides a source of truth for accuracy measurement.

Integrate Lemonade Server with Your Application

Lemonade Server enables languages including Python, C++, Java, C#, Node.js, Go, Ruby, Rust, and PHP. For the full list and integration details, see docs/server/README.md.

Contributing

We are actively seeking collaborators from across the industry. If you would like to contribute to this project, please check out our contribution guide.

Maintainers

This project is sponsored by AMD. It is maintained by @danielholanda @jeremyfowers @ramkrishna @vgodsoe in equal measure. You can reach us by filing an issue or email lemonade@amd.com.

License

This project is licensed under the Apache 2.0 License. Portions of the project are licensed as described in NOTICE.md.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lemonade_sdk-8.0.1.tar.gz (210.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

lemonade_sdk-8.0.1-py3-none-any.whl (234.4 kB view details)

Uploaded Python 3

File details

Details for the file lemonade_sdk-8.0.1.tar.gz.

File metadata

  • Download URL: lemonade_sdk-8.0.1.tar.gz
  • Upload date:
  • Size: 210.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for lemonade_sdk-8.0.1.tar.gz
Algorithm Hash digest
SHA256 50e6dda47d84e1ff41564f99be9bd3560b52c05d6b44b179f334be50fce99282
MD5 6b362857d9bb54d16966e9f00a45d759
BLAKE2b-256 8f416b222df826a6cb2fec1ecd96e7e2a9eb3d7d6cfcdb85a1d6a50e5ee0d160

See more details on using hashes here.

File details

Details for the file lemonade_sdk-8.0.1-py3-none-any.whl.

File metadata

  • Download URL: lemonade_sdk-8.0.1-py3-none-any.whl
  • Upload date:
  • Size: 234.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for lemonade_sdk-8.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 19d6cb39047208d34a53eeeb4d69c2a509f1ebbb66dd7a407e4c8f35c8e7bc27
MD5 c8b0258e5a218b38b55138023da75f0e
BLAKE2b-256 a5e69dd276fe4a3b553f09f9e3428b1b5b9a2ef980118583e321b83c4c1103eb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page