
LlamaIndex embeddings for Forge — Voxell's text-embedding API (turbo/pro/ultra; ultra = Qwen3-Embedding-8B, ~75+ avg MTEB, #4 English).

Project description

llama-index-embeddings-forge

LlamaIndex embeddings for Forge — Voxell's hosted text-embedding API.

Why Forge

One API, three tiers — pick your point on the quality/cost curve:

| Model | Dim | Notes |
|---|---|---|
| turbo | 1024 | fast, low cost |
| pro | 2560 | |
| ultra | 4096 | Qwen3-Embedding-8B; ~75+ avg task score on MTEB, currently #4 on MTEB (English), which makes it the top usable model there (the three models ranked above it are research-only) |

Matryoshka (MRL) dimensions are supported: truncated vectors are re-normalized, so a shorter vector is the leading prefix of the full vector, rescaled to unit norm. The result is a smaller index with minimal quality loss.

Forge logs request metadata only (model, tokens, latency), never your text or vectors.
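To make the "unit-norm prefix" idea concrete, here is a minimal sketch of truncating a vector and re-normalizing it. It only illustrates the arithmetic: the function name `truncate_and_renormalize` is hypothetical and not part of this package, and the toy vector stands in for a real embedding.

```python
import math


def truncate_and_renormalize(vector, dim):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    prefix = vector[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero prefix")
    return [x / norm for x in prefix]


# Toy unit-norm "full" embedding (illustrative values, not real model output).
full = [0.5, 0.5, 0.5, 0.5]
short = truncate_and_renormalize(full, 2)

# The shorter vector has unit norm again, so cosine similarity
# still reduces to a plain dot product.
print(len(short), sum(x * x for x in short))
```

Because the shortened vector is unit-norm, it can be indexed and compared exactly like a full-length one.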

Install

pip install llama-index-embeddings-forge

Usage

from llama_index.embeddings.forge import ForgeEmbedding

# FORGE_API_KEY is read from the environment; or pass api_key=...
embed_model = ForgeEmbedding(model="turbo")

vector = embed_model.get_text_embedding("the quick brown fox")
query = embed_model.get_query_embedding("fast animal")
batch = embed_model.get_text_embedding_batch(["doc one", "doc two"])

As the global embed model

from llama_index.core import Settings, VectorStoreIndex, Document
from llama_index.embeddings.forge import ForgeEmbedding

Settings.embed_model = ForgeEmbedding(model="pro")
index = VectorStoreIndex.from_documents([Document(text="hello world")])

Async

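# Run inside a coroutine (or a notebook with top-level await); embed_model is the instance from Usage.
vec = await embed_model.aget_text_embedding("doc")
q = await embed_model.aget_query_embedding("a search query")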
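The async methods are most useful when several texts are embedded at once. The sketch below fans out calls to `aget_text_embedding` with a cap on in-flight requests. The helper `embed_concurrently`, the `max_in_flight` parameter, and the `FakeEmbedding` stand-in are all hypothetical names used for illustration; the stand-in only exists so the example runs without an API key, and in real use you would pass a `ForgeEmbedding` instance instead.

```python
import asyncio


async def embed_concurrently(embed_model, texts, max_in_flight=8):
    """Embed texts concurrently, capping the number of in-flight requests."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def embed_one(text):
        async with semaphore:
            return await embed_model.aget_text_embedding(text)

    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(embed_one(t) for t in texts))


class FakeEmbedding:
    """Offline stand-in (hypothetical) exposing the same async method name."""

    async def aget_text_embedding(self, text):
        await asyncio.sleep(0)  # yield control, as a network call would
        return [float(len(text))]


vectors = asyncio.run(embed_concurrently(FakeEmbedding(), ["a", "bb", "ccc"]))
print(vectors)
```

A suitable value for `max_in_flight` depends on your rate limits, which this page does not specify.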

Matryoshka (shorter vectors)

embed_model = ForgeEmbedding(model="turbo", dimensions=256)  # re-normalized 256-d vectors
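As a rough sense of what shorter vectors save, the sketch below estimates raw vector storage. It assumes vectors are stored as 32-bit floats and ignores any index overhead; both are assumptions, and the helper name `raw_vector_bytes` is hypothetical.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for the vectors alone (assumes float32, no index overhead)."""
    return num_vectors * dim * bytes_per_value


full_size = raw_vector_bytes(1_000_000, 1024)   # turbo at its full dimension
short_size = raw_vector_bytes(1_000_000, 256)   # turbo with dimensions=256

print(f"full:  {full_size / 1e9:.3f} GB")
print(f"short: {short_size / 1e9:.3f} GB")
print(f"ratio: {full_size // short_size}x smaller")
```

For one million vectors this is about 4.1 GB versus about 1.0 GB, a 4x reduction before any quantization or index structures.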

Configuration

| Arg | Default | Notes |
|---|---|---|
| model | "turbo" | turbo, pro, or ultra (stored as model_name) |
| api_key | FORGE_API_KEY env | get one at dash.voxell.ai |
| base_url | https://api.voxell.ai | |
| dimensions | None | Matryoshka truncation, e.g. 256 |
| timeout | 30.0 | seconds |

License

MIT © Voxell, Inc.

Download files


Source Distribution

llama_index_embeddings_forge-0.1.0.tar.gz (5.0 kB)

Uploaded Source

Built Distribution


llama_index_embeddings_forge-0.1.0-py3-none-any.whl (5.3 kB)

Uploaded Python 3

File details

Details for the file llama_index_embeddings_forge-0.1.0.tar.gz.


File hashes

Hashes for llama_index_embeddings_forge-0.1.0.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7e2c2439e4001349a3ce6d938989df7dc80114ec6c14c959d1ef9359f231d540 |
| MD5 | 845ef6cb8390d429c3f87734a6391fd9 |
| BLAKE2b-256 | 3df92a8d05f49482e455255a051d2e1b87c7f3eb739b3486e80c12b9966bf4b5 |


File details

Details for the file llama_index_embeddings_forge-0.1.0-py3-none-any.whl.


File hashes

Hashes for llama_index_embeddings_forge-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 616d25b7210bcb91063fd92f08e5bb7e2d56937b9d7beb8ed9a1ae80ce17145b |
| MD5 | abd18c8b2fd7d93d79edd57888fbf1a5 |
| BLAKE2b-256 | 4d0124ead5ae34073fa789c5372163936d7e27f83be8eaf413f25067f4f26aad |
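If you download a file manually, you can compare its digest with the SHA256 value listed for it. The following is a minimal sketch using only the standard library; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the path you pass is wherever you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """True when the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("llama_index_embeddings_forge-0.1.0.tar.gz", "7e2c2439e4001349a3ce6d938989df7dc80114ec6c14c959d1ef9359f231d540")` should return `True` for an intact copy of the source distribution.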

