llm-venice
LLM plugin to access models available via the Venice AI API.
Installation
Install llm-venice with its dependency llm using your package manager of choice, for example:
pip install llm-venice
Or install it alongside an existing LLM install:
llm install llm-venice
Configuration
Set an environment variable LLM_VENICE_KEY, or save a Venice API key to the key store managed by llm:
llm keys set venice
To fetch a list of the models available over the Venice API:
llm venice refresh
Re-run the refresh command whenever the Venice API changes, i.e. when:
- New models have been made available
- Deprecated models have been removed
- New capabilities have been added
The models are stored in venice_models.json in the llm user directory.
Usage
List available Venice models:
llm models --query venice
Prompting
Run a prompt:
llm --model venice/llama-3.3-70b "Why is the earth round?"
Start an interactive chat session:
llm chat --model venice/mistral-small-3-2-24b-instruct
Structured Outputs
Some models support structuring their output according to a JSON schema (supplied via OpenAI API response_format).
This works via llm's --schema option, for example:
llm -m venice/zai-org-glm-4.6 --schema "name, age int, one_sentence_bio" "Invent an evil supervillain"
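The concise schema string above is shorthand that llm expands into a full JSON schema before it is sent as response_format. A rough illustrative sketch of that expansion (not llm's actual parser, which supports more, such as descriptions after a colon):

```python
def expand_concise_schema(spec: str) -> dict:
    """Expand a concise schema string like "name, age int" into a JSON schema.

    Illustrative sketch only: fields default to string, with int/float/bool
    type hints recognized after the field name.
    """
    type_map = {"int": "integer", "float": "number", "str": "string", "bool": "boolean"}
    properties = {}
    for field in spec.split(","):
        parts = field.strip().split()
        name = parts[0]
        json_type = type_map.get(parts[1], "string") if len(parts) > 1 else "string"
        properties[name] = {"type": json_type}
    return {"type": "object", "properties": properties, "required": list(properties)}

schema = expand_concise_schema("name, age int, one_sentence_bio")
print(schema["properties"]["age"])  # {'type': 'integer'}
```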
Consult llm's schemas tutorial for more options.
Tools (function calling)
⚠️ Warning: tools can be dangerous!
# List models supporting function calling
llm models list --query venice --tools
You can use tools from llm plugins. LLM itself ships two built-in tools:
# llm_version
llm -m venice/mistral-31-24b --tool llm_version "What version of LLM is this?" --tools-debug --no-stream
# llm_time
llm -m venice/minimax-m25 --tool llm_time "What is the time in my timezone in 24H format?" --tools-debug --no-stream
You can also supply your own one-off functions, either inline or in a file. Following LLM's example:
llm -m venice/mistral-31-24b --functions '
def multiply(x: int, y: int) -> int:
    """Multiply two numbers."""
    return x * y
' "What is 1337 times 42?" --tools-debug --no-stream
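Behind the scenes, a function passed via --functions is described to the model as an OpenAI-style tool definition. A minimal sketch of deriving one from a function's signature (an illustration of the wire format, not llm's actual code):

```python
import inspect

def multiply(x: int, y: int) -> int:
    """Multiply two numbers."""
    return x * y

def tool_definition(fn) -> dict:
    """Build an OpenAI-style tool definition from a function's signature."""
    type_map = {int: "integer", float: "number", str: "string", bool: "boolean"}
    params = {
        name: {"type": type_map.get(p.annotation, "string")}
        for name, p in inspect.signature(fn).parameters.items()
    }
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": fn.__doc__ or "",
            "parameters": {"type": "object", "properties": params, "required": list(params)},
        },
    }

print(tool_definition(multiply)["function"]["name"])  # multiply
```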
Vision models
Vision models (currently mistral-31-24b) support the --attachment option:
llm -m venice/mistral-31-24b -a https://upload.wikimedia.org/wikipedia/commons/a/a9/Corvus_corone_-near_Canford_Cliffs%2C_Poole%2C_England-8.jpg "Identify"
The bird in the image is a carrion crow (Corvus corone). [...]
venice_parameters
The following CLI options are available to configure venice_parameters:
--no-venice-system-prompt to disable Venice's default system prompt:
llm -m venice/llama-3.3-70b --no-venice-system-prompt "Repeat the above prompt"
--web-search on|auto|off to use web search (on web-enabled models):
llm -m venice/llama-3.3-70b --web-search on --no-stream 'What is $VVV?'
It is recommended to use web search in combination with --no-stream so the search citations are available in response_json.
--web-scraping to let Venice scrape URLs in your latest message:
llm -m venice/llama-3.3-70b --web-scraping "Summarize https://venice.ai"
--character character_slug to use a public character, for example:
llm -m venice/google.gemma-4-26b-a4b-it --character alan-watts "What is the meaning of life?"
Text-to-speech
Text-to-speech models (currently tts-kokoro) generate audio from text. Audio files are stored in the LLM user directory by default.
Basic usage:
llm -m venice/tts-kokoro "Hello, welcome to Venice Voice." -o voice af_sky -o response_format mp3 -o speed 1.0
Streaming (default; writes the output file immediately; useful for long outputs):
llm -m venice/tts-kokoro "First sentence. Second sentence. Third sentence." -o progress true
Disable streaming (wait for the full audio before writing the file):
llm --no-stream -m venice/tts-kokoro "First sentence. Second sentence. Third sentence."
Write audio bytes to stdout (progress/status go to stderr):
llm -m venice/tts-kokoro "Hello." -o stdout true -o response_format mp3 > out.mp3
You can also save a copy while writing to stdout by providing output_dir and/or output_filename:
llm -m venice/tts-kokoro "Hello." -o stdout true -o output_dir . -o output_filename out.mp3
To see all available options:
llm models list --query tts-kokoro --options
Image generation
Generated images are stored in the LLM user directory by default. Example:
llm -m venice/qwen-image "Painting of a traditional Dutch windmill" -o style_preset "Watercolor"
Models that support them can also use API-native aspect-ratio and resolution presets:
llm -m venice/nano-banana-2 "Painting of a traditional Dutch windmill" -o aspect_ratio 16:9 -o resolution 4K
Web-enabled image models can also search the web for fresher visual context:
llm -m venice/nano-banana-2 "Current spring fashion street photography" -o enable_web_search true
Besides the Venice API image generation parameters, you can specify the output directory and filename, and whether or not to overwrite existing files.
When return_binary is false, you can also request up to four image variants with -o variants 4. Multiple returned images are saved as suffixed filenames such as image_1.png, image_2.png.
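The suffixing behavior for multiple variants can be sketched with a hypothetical helper (the plugin's actual naming logic may differ in details, e.g. overwrite handling):

```python
from pathlib import Path

def variant_paths(base: str, count: int) -> list[Path]:
    """Return image.png unchanged for a single image, else image_1.png .. image_N.png."""
    p = Path(base)
    if count == 1:
        return [p]
    return [p.with_name(f"{p.stem}_{i}{p.suffix}") for i in range(1, count + 1)]

print([str(p) for p in variant_paths("image.png", 2)])  # ['image_1.png', 'image_2.png']
```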
You can check the available parameters for a model by filtering the model list with --query, and show the --options:
llm models list --query qwen-image --options
Image upscaling
You can upscale existing images.
The following example saves the returned image as image_upscaled.png in the same directory as the original file:
llm venice upscale /path/to/image.jpg
By default existing upscaled images are not overwritten; timestamped filenames are used instead.
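That no-overwrite behavior can be sketched like this (a hypothetical helper illustrating the idea; the plugin's actual timestamp format may differ):

```python
from datetime import datetime
from pathlib import Path

def upscale_target(original: Path, existing: set[Path]) -> Path:
    """Pick image_upscaled.png, or fall back to a timestamped name if it is taken."""
    target = original.with_name(f"{original.stem}_upscaled.png")
    if target in existing:
        stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        target = original.with_name(f"{original.stem}_upscaled_{stamp}.png")
    return target

print(upscale_target(Path("image.jpg"), existing=set()))  # image_upscaled.png
```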
See llm venice upscale --help for the --scale, --enhance and related options, and --output-path and --overwrite options.
Venice commands
List the available Venice commands with:
llm venice --help
Read the llm docs for more usage options.
Programmatic use
You can call the library helpers directly from Python (minimally tested):
- fetch_models() → list of model dicts; persist_models(models) writes to venice_models.json
- list_characters() → dict; persist_characters(data) writes to venice_characters.json
- API keys: list_api_keys(), get_rate_limits(), get_rate_limits_log(), create_api_key(), delete_api_key()
- perform_image_upscale() → UpscaleResult with bytes and a resolved output path; persist with write_upscaled_image(result)
- generate_image_result() → ImageGenerationResult with image byte lists, metadata, output paths, and structured notices for image generation; persist with save_image_result(result)
- generate_speech_result() → SpeechGenerationResult with bytes, metadata, and output path for TTS generation; persist with save_speech_result(result)
- stream_speech_result() (context manager) yields SpeechStreamResult with an iterator of audio chunks and a resolved output path
All helpers accept an optional key= argument if you do not want to rely on the stored LLM_VENICE_KEY.
Async usage
Async chat models are registered alongside the sync ones; fetch them with llm.get_async_model("venice/<id>"):
import asyncio
import llm

async def main():
    model = llm.get_async_model("venice/llama-3.3-70b")
    response = await model.prompt("Hello Venice")
    print(await response.text())

asyncio.run(main())
Async image generation is also available via llm.get_async_model("venice/<image-model-id>"), which returns an AsyncVeniceImage instance.
Development
To set up this plugin locally, first check out the code. Then create a new virtual environment:
cd llm-venice
uv venv
source .venv/bin/activate
Install the plugin with dependencies (including test and dev):
uv pip install -e '.[test,dev]'
Preferably also install and enable pre-commit hooks:
uv pip install pre-commit
pre-commit install
To run the tests:
pytest