LlamaIndex embeddings for Forge — Voxell's text-embedding API (turbo/pro/ultra; ultra = Qwen3-Embedding-8B, ~75+ avg MTEB, #4 English).
llama-index-embeddings-forge
LlamaIndex embeddings for Forge — Voxell's hosted text-embedding API.
Why Forge
One API, three tiers — pick your point on the quality/cost curve:
| Model | Dim | Notes |
|---|---|---|
| `turbo` | 1024 | fast, low cost |
| `pro` | 2560 | |
| `ultra` | 4096 | Qwen3-Embedding-8B; ~75+ avg task score on MTEB, currently #4 on MTEB (English), which makes it the top usable model (the three models ranked above it are research-only) |
Matryoshka (MRL) dimensions are real: truncated vectors are re-normalized, so a shorter vector is
the leading components of the full vector rescaled to unit norm. The result is a smaller index with
minimal quality loss. Forge logs request metadata only (model, tokens, latency), never your text or vectors.
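To make the truncate-then-re-normalize step concrete, here is a minimal client-side sketch of the same arithmetic. The function name `truncate_and_renormalize` and the toy vector are illustrative assumptions, and how Forge implements this on the server is not documented here.

```python
import math


def truncate_and_renormalize(vector, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    prefix = vector[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero prefix")
    return [x / norm for x in prefix]


# Toy unit-norm vector standing in for a real embedding (assumption).
full = [0.5, 0.5, 0.5, 0.5]
short = truncate_and_renormalize(full, 2)
```

Because the shortened vector is unit-norm again, dot products between shortened vectors can still be read as cosine similarities.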
Install
pip install llama-index-embeddings-forge
Usage
from llama_index.embeddings.forge import ForgeEmbedding
# FORGE_API_KEY is read from the environment; or pass api_key=...
embed_model = ForgeEmbedding(model="turbo")
vector = embed_model.get_text_embedding("the quick brown fox")
query = embed_model.get_query_embedding("fast animal")
batch = embed_model.get_text_embedding_batch(["doc one", "doc two"])
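The returned embeddings are plain lists of floats, so a document vector and a query vector can be compared directly. The sketch below is one way to score them with cosine similarity. The helper `cosine_similarity` and the two-dimensional toy vectors are assumptions for illustration, not part of this package.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for `vector` and `query` from the snippet above (assumption).
doc_vec = [1.0, 0.0]
query_vec = [0.6, 0.8]
score = cosine_similarity(doc_vec, query_vec)
```

With real embeddings, pass the outputs of `get_text_embedding` and `get_query_embedding` in place of the toy vectors.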
As the global embed model
from llama_index.core import Settings, VectorStoreIndex, Document
from llama_index.embeddings.forge import ForgeEmbedding
Settings.embed_model = ForgeEmbedding(model="pro")
index = VectorStoreIndex.from_documents([Document(text="hello world")])
Async
# run inside an async function (or a notebook that supports top-level await)
vec = await embed_model.aget_text_embedding("doc")
q = await embed_model.aget_query_embedding("a search query")
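The async methods pay off when several texts are embedded concurrently. The following sketch shows one possible pattern using `asyncio.gather` with a concurrency cap. `FakeEmbedding` is a hypothetical stand-in that only mimics the `aget_text_embedding` method name so the example runs without network access, and `embed_all` and `max_concurrency` are illustrative names as well.

```python
import asyncio


class FakeEmbedding:
    """Hypothetical stand-in exposing the async method name used above."""

    async def aget_text_embedding(self, text):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [float(len(text))]


async def embed_all(embed_model, texts, max_concurrency=4):
    """Embed texts concurrently, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def embed_one(text):
        async with semaphore:
            return await embed_model.aget_text_embedding(text)

    # gather returns results in the same order as the input texts
    return await asyncio.gather(*(embed_one(t) for t in texts))


vectors = asyncio.run(embed_all(FakeEmbedding(), ["a", "bb", "ccc"]))
```

In real use, a `ForgeEmbedding` instance would take the place of `FakeEmbedding()`. A suitable concurrency cap depends on rate limits, which are not stated here.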
Matryoshka (shorter vectors)
embed_model = ForgeEmbedding(model="turbo", dimensions=256) # re-normalized 256-d vectors
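As a rough illustration of the index-size saving, the sketch below estimates raw vector storage at two dimensions. It assumes vectors are stored as 32-bit floats and ignores any index overhead. The helper name and the one-million-vector corpus are made up for the example.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def index_size_bytes(num_vectors, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw storage for the vectors alone, excluding index overhead."""
    return num_vectors * dim * bytes_per_value


full_size = index_size_bytes(1_000_000, 1024)  # turbo at its full dimension
short_size = index_size_bytes(1_000_000, 256)  # turbo with dimensions=256
ratio = full_size / short_size
```

Under these assumptions, one million `turbo` vectors take about 4.1 GB at 1024 dimensions and about 1.0 GB at 256, a fourfold reduction.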
Configuration
| Arg | Default | Notes |
|---|---|---|
| `model` | `"turbo"` | `turbo`, `pro`, or `ultra` (stored as `model_name`) |
| `api_key` | `FORGE_API_KEY` env | get one at dash.voxell.ai |
| `base_url` | `https://api.voxell.ai` | |
| `dimensions` | `None` | Matryoshka truncation, e.g. 256 |
| `timeout` | `30.0` | seconds |
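The arguments in the table can be collected and validated before the embedding object is constructed. The helper below is a hypothetical sketch rather than part of this package: `forge_kwargs` and its `env` parameter are invented for illustration, and only the argument names, defaults, and the `FORGE_API_KEY` variable come from the table.

```python
import os

VALID_MODELS = ("turbo", "pro", "ultra")


def forge_kwargs(model="turbo", dimensions=None, timeout=30.0, env=None):
    """Build constructor arguments using the defaults from the table above."""
    env = os.environ if env is None else env
    if model not in VALID_MODELS:
        raise ValueError(f"model must be one of {VALID_MODELS}")
    if dimensions is not None and dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    api_key = env.get("FORGE_API_KEY")
    if not api_key:
        raise RuntimeError("FORGE_API_KEY is not set")
    return {
        "model": model,
        "api_key": api_key,
        "base_url": "https://api.voxell.ai",
        "dimensions": dimensions,
        "timeout": timeout,
    }


# A placeholder key is passed through `env` so the example runs anywhere.
kwargs = forge_kwargs(model="pro", dimensions=256, env={"FORGE_API_KEY": "example-key"})
# ForgeEmbedding(**kwargs) would then receive the arguments listed in the table.
```

Failing early on a missing key or an unknown tier keeps configuration mistakes out of the first embedding request.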
License
MIT © Voxell, Inc.