Tessera Hypernetwork
Generate per-session LoRA adapters for inference tasks. This is the Python hypernetwork service component of Tessera, which works alongside the Rust core to provide LoRA adapter generation via hypernetwork synthesis.
Features
- Doc-to-LoRA with SHINE: Generate adapters from document content using SHINE (ICML 2026) for long-context internalization
- Text-to-LoRA: Generate adapters from natural language descriptions
- Metadata-to-LoRA: Generate adapters from structured user metadata
- LoRAX-style Adapter Management: Import, list, and unload adapters via CLI and API
- OpenAI-compatible Completions: /v1/completions endpoint for lm_eval integration
- OpenAI-compatible API: Easy integration with existing tooling
- FastAPI: Modern async Python web framework
Installation
pip install tessera-hypernetwork
Usage
LoRAX Adapter Management
The hypernetwork service provides LoRAX-style adapter management for loading and serving LoRA adapters:
# Import an adapter into the service
tessera lorax import --path ./adapter.safetensors --name my-adapter --base-model meta-llama/Llama-3-8B --server-url http://localhost:8000
# List loaded adapters
tessera lorax list --server-url http://localhost:8000
# Unload an adapter
tessera lorax unload --name my-adapter --server-url http://localhost:8000
API Endpoints:
- POST /v1/adapters - Import adapter safetensors
- GET /v1/adapters - List loaded adapters
- DELETE /v1/adapters/{name} - Unload adapter
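The list and unload endpoints can also be called from Python. The following is a minimal sketch using only the standard library, not the package's own client. The helper names are hypothetical, and it assumes the server replies with JSON. The request body for POST /v1/adapters is not documented here, so the import call is left out.

```python
import json
import urllib.parse
import urllib.request


def list_adapters_request(server_url):
    """Build a GET request for /v1/adapters (lists loaded adapters)."""
    return urllib.request.Request(
        server_url.rstrip("/") + "/v1/adapters", method="GET"
    )


def unload_adapter_request(server_url, name):
    """Build a DELETE request for /v1/adapters/{name} (unloads one adapter)."""
    # Percent-encode the name so spaces or slashes cannot change the path.
    quoted = urllib.parse.quote(name, safe="")
    return urllib.request.Request(
        f"{server_url.rstrip('/')}/v1/adapters/{quoted}", method="DELETE"
    )


def send(req, timeout=10.0):
    """Send a request and decode the body.

    Assumption: the service answers with JSON (or an empty body).
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None
```

With a server running, `send(list_adapters_request("http://localhost:8000"))` would return whatever the service reports for its loaded adapters.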
OpenAI-Compatible Completions
The /v1/completions endpoint provides OpenAI-compatible completions for lm_eval integration:
curl -X POST http://localhost:8000/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "my-adapter",
"prompt": "Hello, world!",
"max_tokens": 10
}'
The endpoint looks up the adapter by name from the loaded adapters registry and forwards the request to vLLM.
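The same request can be sent from Python. This is a sketch with the standard library that mirrors the curl call above. The function names are illustrative, and reading `choices[0].text` assumes the response follows the usual OpenAI completions shape, which this page does not spell out.

```python
import json
import urllib.request


def build_completion_request(base_url, adapter_name, prompt, max_tokens=10):
    """Build the POST /v1/completions request; "model" carries the adapter name."""
    payload = {"model": adapter_name, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, adapter_name, prompt, max_tokens=10, timeout=30.0):
    """Send the request and return the generated text.

    Assumption: the response has the OpenAI completions layout,
    i.e. {"choices": [{"text": ...}, ...]}.
    """
    req = build_completion_request(base_url, adapter_name, prompt, max_tokens)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

Calling `complete("http://localhost:8000", "my-adapter", "Hello, world!")` requires the adapter to be loaded under that name first.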
CLI Commands
The tessera CLI provides commands for generating LoRA adapters and running the hypernetwork server:
# Generate LoRA adapter from metadata
tessera generate --from-metadata '{"task": "classification", "domain": "medical"}' \
--base-model meta-llama/Llama-3-8B \
--rank 16 \
--save ./adapter.safetensors
# Generate LoRA adapter from text description
tessera generate --from-text "Senior litigation associate specializing in IP law" \
--base-model meta-llama/Llama-3-8B \
--rank 16 \
--save ./adapter.safetensors
# Generate LoRA adapter from document
tessera generate --from-doc ./document.txt \
--base-model meta-llama/Llama-3-8B \
--rank 16 \
--save ./adapter.safetensors
# Start the hypernetwork server
tessera serve --port 8080 --host 0.0.0.0
# Start server with Qdrant vector database
tessera serve --port 8080 --qdrant-url http://localhost:6333
# Check server health (point --url at the host and port the server is listening on)
tessera health --url http://localhost:8000
# List available base models
tessera list
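When adapters are generated from a script rather than a shell, the command line can be assembled in Python. The sketch below only uses the flags shown above. The helper names are hypothetical, and nothing is assumed about what the CLI prints.

```python
import json
import subprocess

# The three input modes shown in the commands above.
SOURCE_FLAGS = {"--from-metadata", "--from-text", "--from-doc"}


def build_generate_command(source_flag, source_value, base_model, rank, save_path):
    """Return the argv list for one `tessera generate` invocation."""
    if source_flag not in SOURCE_FLAGS:
        raise ValueError(f"unsupported source flag: {source_flag}")
    # Metadata may be given as a dict; the CLI example passes it as a JSON string.
    if isinstance(source_value, dict):
        source_value = json.dumps(source_value)
    return [
        "tessera", "generate",
        source_flag, str(source_value),
        "--base-model", base_model,
        "--rank", str(rank),
        "--save", str(save_path),
    ]


def run_generate(argv):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(argv, check=True, capture_output=True, text=True)
```

Passing the arguments as a list avoids shell quoting problems with the JSON metadata string.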
Server Mode
You can also run the server directly:
python -m tessera_hypernetwork.server
API
The hypernetwork service provides a FastAPI server with the following endpoints:
- POST /v1/generate - Generate a LoRA adapter for a given prompt
- GET /health - Health check endpoint
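A script that starts the server may want to wait until GET /health responds before sending work. The following is a small polling sketch with the standard library. It assumes any 2xx status means healthy and makes no assumption about the response body. The function names are illustrative.

```python
import time
import urllib.request


def is_healthy(base_url, timeout=2.0):
    """Return True if GET {base_url}/health answers with a 2xx status."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def wait_for_server(base_url, attempts=10, delay=1.0):
    """Poll the health endpoint up to `attempts` times, sleeping between tries."""
    for attempt in range(attempts):
        if is_healthy(base_url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

The request body for POST /v1/generate is not specified here, so it is left out of the sketch.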
Development
Install development dependencies:
pip install tessera-hypernetwork[dev]
Run tests:
pytest
Integration with Tessera
This hypernetwork service is designed to work with the Tessera Rust core. The Rust core handles semantic caching, vector similarity search, and adapter composition, while this Python service handles the actual LoRA adapter generation via hypernetwork synthesis.
Full Tessera CLI Lifecycle
The commands below cover the full adapter lifecycle with the tessera CLI: generation, serving, health checks, and LoRAX adapter management:
# Generate LoRA adapter from metadata (JSON string or file)
tessera generate --from-metadata '{"task": "classification", "domain": "medical"}' \
--base-model meta-llama/Llama-3-8B \
--rank 16 \
--save ./adapter.safetensors
# Generate LoRA adapter from natural language description
tessera generate --from-text "Senior litigation associate specializing in IP law" \
--base-model meta-llama/Llama-3-8B \
--rank 16 \
--save ./adapter.safetensors
# Generate LoRA adapter from document content
tessera generate --from-doc ./document.txt \
--base-model meta-llama/Llama-3-8B \
--rank 16 \
--save ./adapter.safetensors
# Start the hypernetwork server
tessera serve --port 8080 --host 0.0.0.0
# Start server with Qdrant vector database integration
tessera serve --port 8080 --qdrant-url http://localhost:6333
# Check server health status (point --url at the host and port the server is listening on)
tessera health --url http://localhost:8000
# List available base models and their dimensions
tessera list
# LoRAX adapter management
tessera lorax import --path ./adapter.safetensors --name my-adapter --base-model meta-llama/Llama-3-8B --server-url http://localhost:8000
tessera lorax list --server-url http://localhost:8000
tessera lorax unload --name my-adapter --server-url http://localhost:8000
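Before importing a generated file, it can help to check which tensors it contains. The sketch below reads only the header of a safetensors file: an 8-byte little-endian length followed by a JSON header. It uses the standard library rather than the safetensors package. The tensor names a Tessera adapter actually stores are not documented here, so the name in the usage note is hypothetical.

```python
import json
import struct


def parse_safetensors_header(raw):
    """Parse the JSON header from the leading bytes of a safetensors file."""
    # First 8 bytes: unsigned little-endian 64-bit header length.
    (header_len,) = struct.unpack("<Q", raw[:8])
    return json.loads(raw[8:8 + header_len].decode("utf-8"))


def read_safetensors_header(path):
    """Read only the header from disk, without loading tensor data."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        (header_len,) = struct.unpack("<Q", prefix)
        return parse_safetensors_header(prefix + f.read(header_len))


def summarize_tensors(header):
    """Map tensor name -> (dtype, shape), skipping the optional __metadata__ entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }
```

For example, `summarize_tensors(read_safetensors_header("./adapter.safetensors"))` might return entries such as `{"some.layer.lora_A": ("F32", (16, 4096))}`, where the first dimension of an A matrix would match the `--rank` value.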
For the complete Tessera system, see: https://github.com/theoddden/Tessera
License
Apache-2.0