Cloud embeddings via AINative API — drop-in fastembed replacement, no model downloads
fastembed-cloud
Cloud text embeddings via AINative API. Drop-in replacement for fastembed — same interface, no model downloads, no ONNX runtime.
Why?
fastembed is great, but requires downloading 100MB+ ONNX models locally. fastembed-cloud gives you the same API backed by a free cloud service — zero setup, zero downloads.
| | fastembed | fastembed-cloud |
|---|---|---|
| First-run latency | 30-120s (model download) | 0s |
| Disk usage | 100MB-1GB per model | 0 |
| ONNX runtime | Required | Not needed |
| Offline support | Yes | No (cloud API) |
| Cost | Free (local compute) | Free (AINative API) |
Install
pip install fastembed-cloud
Quick Start
from fastembed_cloud import CloudTextEmbedding
# Auto-provisions a free API key on first run
model = CloudTextEmbedding()
# Single query
embedding = model.query_embed("What is semantic search?")
print(f"Dimensions: {len(embedding)}") # 384
# Batch embedding
docs = ["First document", "Second document", "Third document"]
embeddings = model.embed(docs)
print(f"Embedded {len(embeddings)} documents")
Models
| Model | Dimensions | ID |
|---|---|---|
| BGE-small-en-v1.5 | 384 | BAAI/bge-small-en-v1.5 (default) |
| BGE-base-en-v1.5 | 768 | BAAI/bge-base-en-v1.5 |
| BGE-large-en-v1.5 | 1024 | BAAI/bge-large-en-v1.5 |
| BGE-M3 (multilingual) | 1024 | bge-m3 |
model = CloudTextEmbedding(model_name="bge-m3")
embedding = model.query_embed("Multilingual embedding")
print(len(embedding)) # 1024
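When sizing a vector-database collection, it can help to know a model's dimension before making any API call. The sketch below copies the values from the table above into a lookup. `MODEL_DIMS`, `expected_dim`, and `check_embedding` are hypothetical helper names, not part of fastembed-cloud.

```python
# Dimensions copied from the model table above. This dict is illustrative
# and is not shipped by the fastembed-cloud package.
MODEL_DIMS = {
    "BAAI/bge-small-en-v1.5": 384,
    "BAAI/bge-base-en-v1.5": 768,
    "BAAI/bge-large-en-v1.5": 1024,
    "bge-m3": 1024,
}

DEFAULT_MODEL = "BAAI/bge-small-en-v1.5"


def expected_dim(model_name: str = DEFAULT_MODEL) -> int:
    """Return the embedding size listed for model_name, or raise ValueError."""
    try:
        return MODEL_DIMS[model_name]
    except KeyError:
        raise ValueError(f"Unknown model: {model_name!r}") from None


def check_embedding(vector: list[float], model_name: str = DEFAULT_MODEL) -> None:
    """Raise ValueError if the vector length does not match the table."""
    want = expected_dim(model_name)
    if len(vector) != want:
        raise ValueError(f"{model_name}: expected {want} values, got {len(vector)}")
```

A check like this is cheap to run on the first returned vector and catches a mismatch between the model name and a collection created for a different size.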
Smart Hybrid: Local + Cloud
TextEmbedding automatically uses local fastembed if installed, cloud otherwise:
from fastembed_cloud import TextEmbedding
model = TextEmbedding()
print(model.is_cloud) # True if fastembed not installed
embeddings = model.embed(["works either way"])
Install fastembed alongside for local-first with cloud fallback:
pip install "fastembed-cloud[local]"
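The local-or-cloud choice described above can be pictured as a simple availability check. This is only a sketch of one way such a selection might work, and the package's actual logic may differ. `pick_backend` is a hypothetical name.

```python
import importlib.util


def pick_backend(module_name: str = "fastembed") -> str:
    """Return "local" if module_name is importable, else "cloud"."""
    # find_spec looks the module up without importing it, so no model
    # or ONNX runtime is loaded just to make the decision.
    found = importlib.util.find_spec(module_name) is not None
    return "local" if found else "cloud"


backend = pick_backend()
print(f"Would use the {backend} backend")
```

Checking availability without importing keeps startup fast when the local package is present but never used.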
Authentication
Credentials are resolved in this order:
1. api_key parameter: CloudTextEmbedding(api_key="your-key")
2. AINATIVE_API_KEY environment variable
3. ZERODB_API_KEY environment variable (shared with ZeroDB ecosystem)
4. ~/.zerodb/credentials.json (auto-saved from any ZeroDB tool)
5. Auto-provisioning (free 72-hour account, claim to keep permanently)
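The resolution order above can be sketched as a short fallback chain. This is an illustration, not the package's implementation: `resolve_api_key` is a hypothetical name, and the "api_key" field inside credentials.json is an assumption, since the file layout is not documented here.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CREDENTIALS = Path.home() / ".zerodb" / "credentials.json"


def resolve_api_key(
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Path = DEFAULT_CREDENTIALS,
) -> Optional[str]:
    """Return the first key found in the documented order, or None.

    None means the caller would fall through to auto-provisioning.
    """
    env = os.environ if env is None else env

    # 1. An explicit parameter wins.
    if api_key:
        return api_key

    # 2-3. Environment variables, in the documented order.
    for name in ("AINATIVE_API_KEY", "ZERODB_API_KEY"):
        if env.get(name):
            return env[name]

    # 4. Saved credentials file. The "api_key" field name is an assumption.
    try:
        saved = json.loads(Path(credentials_path).read_text())
    except (OSError, ValueError):
        return None
    if isinstance(saved, dict):
        return saved.get("api_key") or None
    return None
```

Passing `env` and `credentials_path` as arguments keeps the function easy to test without touching the real environment or home directory.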
Auto-Provisioning
On first use with no credentials, fastembed-cloud automatically provisions a free account:
$ python -c "from fastembed_cloud import CloudTextEmbedding; CloudTextEmbedding().query_embed('test')"
No API key found — provisioning a free AINative account for embeddings...
Auto-provisioned! Free embeddings API ready.
API Key: zdb_abc12345...
Expires: 72 hours
Saved to: ~/.zerodb/credentials.json
To keep access permanently, claim your account:
https://ainative.studio/signup
Batch Embedding
Handles large datasets efficiently with automatic batching:
model = CloudTextEmbedding(batch_size=100)
# Automatically batches into chunks of 100
large_dataset = ["document " + str(i) for i in range(10000)]
embeddings = model.embed(large_dataset)
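For intuition, the chunking step can be sketched in a few lines of plain Python. `iter_batches` is a hypothetical helper, and how fastembed-cloud splits and sends requests internally is not shown in this README.

```python
from typing import Iterator, Sequence


def iter_batches(texts: Sequence[str], batch_size: int = 64) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of texts, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


large_dataset = ["document " + str(i) for i in range(10000)]
batches = list(iter_batches(large_dataset, batch_size=100))
print(len(batches))      # 100 batches
print(len(batches[-1]))  # 100 texts in the last batch
```

With 10,000 texts and a batch size of 100, this corresponds to 100 API calls. When the input length is not a multiple of the batch size, the final batch is simply shorter.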
Use with Vector Databases
Qdrant
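from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from fastembed_cloud import CloudTextEmbedding
client = QdrantClient(":memory:")
model = CloudTextEmbedding()
docs = ["AI is transforming healthcare", "Machine learning for finance"]
embeddings = model.embed(docs)
# client.add() embeds documents with local fastembed, so store the precomputed vectors with upsert()
client.create_collection(
    collection_name="my_docs",
    vectors_config=VectorParams(size=model.dim, distance=Distance.COSINE),
)
client.upsert(
    collection_name="my_docs",
    points=[
        PointStruct(id=i, vector=vec, payload={"document": doc})
        for i, (doc, vec) in enumerate(zip(docs, embeddings))
    ],
)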
ChromaDB
import chromadb
from fastembed_cloud import CloudTextEmbedding
client = chromadb.Client()
collection = client.create_collection("my_docs")
model = CloudTextEmbedding()
docs = ["First doc", "Second doc"]
embeddings = model.embed(docs)
collection.add(
    documents=docs,
    embeddings=embeddings,
    ids=["id1", "id2"],
)
API Reference
CloudTextEmbedding
Always uses the cloud API.
CloudTextEmbedding(
    model_name="BAAI/bge-small-en-v1.5",  # Model to use
    api_key=None,                         # API key (auto-resolved)
    base_url=None,                        # API URL (default: api.ainative.studio)
    batch_size=64,                        # Max texts per API call
    normalize=True,                       # Normalize to unit vectors
)
Methods:
- embed(documents, batch_size=None): Embed a list of texts. Returns list[list[float]].
- query_embed(query): Embed a single query. Returns list[float].
- passage_embed(texts): Alias for embed() (fastembed compatibility).
Properties:
- dim: Embedding dimensions (e.g., 384 for bge-small).
- model_name: Resolved model name.
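The normalize=True option in the constructor above is described as scaling embeddings to unit vectors. A minimal sketch of what that means, assuming standard L2 normalization (the service's exact procedure is not documented here). `l2_normalize` and `dot` are hypothetical helpers.

```python
import math


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))


a = l2_normalize([3.0, 4.0])  # roughly [0.6, 0.8]
b = l2_normalize([4.0, 3.0])  # roughly [0.8, 0.6]
print(dot(a, b))              # cosine similarity, approximately 0.96
```

With unit vectors, cosine similarity reduces to a dot product, so a vector database configured for either metric ranks results the same way.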
TextEmbedding
Smart hybrid: local fastembed if available, cloud otherwise.
TextEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    api_key=None,
    **kwargs,
)
Properties:
- is_cloud: True if using cloud API, False if using local fastembed.
License
MIT