A clean interface for interacting with the Lemonade LLM server

🍋 Lemonade Python SDK

A robust, production-grade Python wrapper for the Lemonade C++ Backend.

This SDK provides a clean, Pythonic interface for interacting with local LLMs running on Lemonade. It was built to power Sorana (a visual workspace for AI), and its core integration logic has been extracted into a standalone, open-source library for the developer community.

🚀 Key Features

Auto-Discovery: Automatically scans multiple ports and hosts to find active Lemonade instances.
Low-Overhead Architecture: Designed as a thin, efficient wrapper to leverage Lemonade's C++ performance with minimal Python latency.
Health Checks & Recovery: Built-in utilities to verify server status and handle connection drops.
Type-Safe Client: Full Python type hinting for better developer experience (IDE autocompletion).
Model Management: Simple API to load, unload, and list models dynamically.
Embeddings API: Generate text embeddings for semantic search, RAG, and clustering (FLM & llamacpp backends).
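
As an illustration of how the health-check feature might be combined with recovery logic, here is a minimal sketch of a retry loop with exponential backoff. `wait_until_healthy` is a hypothetical helper and not part of the SDK; it only assumes a zero-argument callable that returns a truthy value when the server is reachable, such as the `client.health_check` method shown in the Quick Start.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` until it succeeds, doubling the delay after each failure.

    Hypothetical helper, not part of the SDK. `check` is any zero-argument
    callable, for example `client.health_check`.
    """
    delay = base_delay
    for attempt in range(attempts):
        try:
            if check():
                return True
        except OSError:
            # Treat connection-level errors (OSError and its subclasses)
            # as "not healthy yet" and keep retrying.
            pass
        if attempt < attempts - 1:
            sleep(delay)
            delay *= 2
    return False
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real waiting. Whether the SDK's own health check raises or returns `False` on failure is not stated here, so the sketch tolerates both.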

📦 Installation

From the root of a local clone of the repository:

pip install .

Alternatively, you can install it directly from GitHub:

pip install git+https://github.com/Tetramatrix/lemonade-python-sdk.git

⚡ Quick Start

1. Connecting to Lemonade

The SDK automatically handles port discovery, so you don't need to hardcode localhost:8000.

from lemonade_integration.client import LemonadeClient
from lemonade_integration.port_scanner import find_available_lemonade_port

# Auto-discover running instance
port = find_available_lemonade_port()
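if port:
    client = LemonadeClient(base_url=f"http://localhost:{port}")
    if client.health_check():
        print(f"Connected to Lemonade on port {port}")
    else:
        # A port was found, but the server did not pass the health check
        print(f"Lemonade on port {port} is not healthy.")
else:
    print("No Lemonade instance found.")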
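
To give a feel for what port discovery involves, the sketch below probes a list of candidate ports with plain TCP connections from the standard library. This is not the SDK's `find_available_lemonade_port` implementation, which is not shown here; `find_open_port` and its parameters are hypothetical names chosen for illustration.

```python
import socket
from typing import Iterable, Optional


def find_open_port(
    ports: Iterable[int],
    host: str = "localhost",
    timeout: float = 0.2,
) -> Optional[int]:
    """Return the first port on `host` that accepts a TCP connection, else None.

    Illustrative only: this is not the SDK's `find_available_lemonade_port`,
    and an open port does not prove the listener is a Lemonade server.
    """
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return port
        except OSError:
            # Refused, timed out, or unreachable: try the next candidate.
            continue
    return None
```

Calling `find_open_port(range(8000, 9001))` would cover the 8000-9000 range. Because an open port only shows that something is listening, a follow-up `client.health_check()` is still needed to confirm that the listener is Lemonade.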

2. Chat Completion

response = client.chat_completion(
    model="Llama-3-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Hello World in C++"}
    ],
    temperature=0.7
)

print(response['choices'][0]['message']['content'])
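
The direct indexing above raises `KeyError` or `IndexError` if the response carries no choices. A small defensive accessor, sketched below under the assumption that the response is a plain dict with the `choices` / `message` / `content` layout used in the example, returns `None` instead. `first_reply` is a hypothetical helper, not an SDK function.

```python
from typing import Any, Mapping, Optional


def first_reply(response: Mapping[str, Any]) -> Optional[str]:
    """Return the first choice's message content, or None if it is absent.

    Assumes the dict layout used in the example above
    (`choices` -> list -> `message` -> `content`).
    """
    choices = response.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")
```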

3. Model Management

# List all available models
models = client.list_models()
for m in models:
    print(f"Found model: {m['id']}")

# Load a specific model into memory
client.load_model("Mistral-7B-v0.1")
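
The two calls above can be combined so that a model is only loaded when the server actually lists it. The following is a minimal sketch: `load_if_available` is a hypothetical helper, and it relies only on `list_models()` yielding dicts with an `id` key and on `load_model(model_id)`, as in the example.

```python
from typing import Any


def load_if_available(client: Any, model_id: str) -> bool:
    """Request a load of `model_id` only if the server lists it.

    Hypothetical helper. Returns True if `load_model` was called and
    False if the model ID was not in the listing.
    """
    available = {m["id"] for m in client.list_models()}
    if model_id not in available:
        return False
    client.load_model(model_id)
    return True
```

For example, `load_if_available(client, "Mistral-7B-v0.1")` skips the load request and returns `False` when that ID is not in the listing.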

4. Embeddings (NEW)

Generate text embeddings for semantic search, RAG pipelines, and clustering.

# List available embedding models (filtered by 'embeddings' label)
embedding_models = client.list_embedding_models()
for model in embedding_models:
    print(f"Embedding model: {model['id']}")

# Generate embeddings for single text
response = client.embeddings(
    input="Hello, world!",
    model="nomic-embed-text-v1-GGUF"
)

embedding_vector = response["data"][0]["embedding"]
print(f"Vector length: {len(embedding_vector)}")

# Generate embeddings for multiple texts
texts = ["Text 1", "Text 2", "Text 3"]
response = client.embeddings(
    input=texts,
    model="nomic-embed-text-v1-GGUF"
)

for item in response["data"]:
    print(f"Text {item['index']}: {len(item['embedding'])} dimensions")
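
For semantic search, the returned vectors are typically compared with cosine similarity. The sketch below uses only the standard library and is independent of the SDK; `cosine_similarity` and `rank_by_similarity` are illustrative names, not SDK functions.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 if either vector has zero length (all components zero).
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (candidate index, similarity) pairs, most similar first."""
    scores = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

With the batch response above, the candidate vectors could be collected as `[item["embedding"] for item in response["data"]]` and ranked against the embedding of a query string.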

Supported Backends:

✅ FLM (FastFlowLM) - NPU-accelerated on Windows
✅ llamacpp (.GGUF models) - CPU/GPU
❌ ONNX/OGA - Not supported

🖼️ Production Showcase: Sorana

This SDK was extracted from the core engine of Sorana, a professional visual workspace for AI. It demonstrates the SDK's capability to handle complex, real-world requirements on AMD Ryzen AI hardware:

Low Latency: Powers sub-second response times for multi-model chat interfaces.
Dynamic Workflows: Manages the loading and unloading of 20+ different LLMs based on user activity to optimize local NPU/GPU memory.
Zero-Config UX: Uses the built-in port scanner to automatically connect the Sorana frontend to the Lemonade backend without user intervention.

🛠️ Project Structure

client.py: Main entry point for API interactions (chat, embeddings, model management).
port_scanner.py: Utilities for detecting Lemonade instances across ports (8000-9000).
model_discovery.py: Logic for fetching and parsing model metadata.
request_builder.py: Helper functions to construct compliant payloads (chat, embeddings).
utils.py: Additional utility functions.
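
To make the role of a request builder concrete, here is a minimal sketch of what payload construction could look like. The function names are hypothetical and are not taken from `request_builder.py`; the field names simply mirror the keyword arguments used in the Quick Start (`model`, `messages`, `temperature`, `input`), and the exact wire format expected by the server is an assumption.

```python
from typing import Any, Dict, List, Optional, Union


def build_chat_payload(
    model: str,
    messages: List[Dict[str, str]],
    temperature: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble a chat request body; optional fields are omitted when unset."""
    for msg in messages:
        if "role" not in msg or "content" not in msg:
            raise ValueError("each message needs 'role' and 'content'")
    payload: Dict[str, Any] = {"model": model, "messages": messages}
    if temperature is not None:
        payload["temperature"] = temperature
    return payload


def build_embeddings_payload(
    model: str,
    text_input: Union[str, List[str]],
) -> Dict[str, Any]:
    """Assemble an embeddings request body for one string or a list of strings."""
    return {"model": model, "input": text_input}
```

Validating messages before sending keeps malformed requests from reaching the server, and omitting unset optional fields lets the server apply its own defaults.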

📚 Documentation

Embeddings API - Complete guide for using embeddings
Lemonade Server Docs - Official Lemonade documentation

🤝 Contributing

Contributions are welcome! This project is intended to help the AMD Ryzen AI and Lemonade community build downstream applications faster.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

This version

1.0.0

Mar 21, 2026

Download files

Source Distribution

lemonade_integration-1.0.0.tar.gz (12.8 kB view details)

Uploaded Mar 21, 2026 Source

Built Distribution

lemonade_integration-1.0.0-py3-none-any.whl (12.2 kB view details)

Uploaded Mar 21, 2026 Python 3

File details

Details for the file lemonade_integration-1.0.0.tar.gz.

File metadata

Download URL: lemonade_integration-1.0.0.tar.gz
Upload date: Mar 21, 2026
Size: 12.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for lemonade_integration-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`817b9ad67e1b06c9fa116745ad3a5caeb9dbf862db74a1df8b28d2c51137a023`
MD5	`b098226d0293a4709f3f7bd696b2b94b`
BLAKE2b-256	`40056a69cc0cadd12940d97f99094dcc1ff117079dfd76133cc20c4f441098b2`

File details

Details for the file lemonade_integration-1.0.0-py3-none-any.whl.

File metadata

Download URL: lemonade_integration-1.0.0-py3-none-any.whl
Upload date: Mar 21, 2026
Size: 12.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for lemonade_integration-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fdeab6b4443d7666ed2b73c6c1d0f03ed776249877e494262512b63f4a44e252`
MD5	`28743bb5864d7bef3793196ad726a755`
BLAKE2b-256	`7dedb3b74fdb4501227b8ac3f0d195442fdc8df68cfd598812835f7dbd88a69e`
