
A rate-controlled embedding tool for LangChain with batching, automatic slowdown, and vector caching


ratelimited-embedder

A rate-controlled embedding tool for LangChain, with batching, automatic slowdown, vector caching, and hardware-based parameter suggestions.

Installation

pip install ratelimited-embedder

Development mode:

git clone <repo>
cd ratelimited-embedder
pip install -e ".[dev]"

Quick start

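from langchain_core.documents import Document
from langchain_ollama import OllamaEmbeddings
from ratelimited_embedder import RateControlledEmbedder

embeddings = OllamaEmbeddings(model="qwen3-embedding:0.6b")

# Configure the cache directly on RateControlledEmbedder
embedder = RateControlledEmbedder(
    embeddings=embeddings,
    batch_size=16,        # chunks per batch
    delay=0.5,            # seconds to wait between batches
    slow_threshold=2.0,   # a batch slower than this triggers automatic slowdown
    cache_dir="./cache",  # the cache file is created here as vector_cache.db
)

chunks = [Document(page_content=f"Document chunk {i}") for i in range(100)]

# Embed the chunks in rate-limited batches and build a FAISS vector store
vectorstore = embedder.build_vectorstore(chunks, save_path="faiss_index")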

Alternatively, use the cache wrapper on its own:

from langchain_ollama import OllamaEmbeddings
from ratelimited_embedder import wrap_embeddings

embeddings = OllamaEmbeddings(model="qwen3-embedding:0.6b")
# Option 1: specify a cache directory (the file name is generated automatically as vector_cache.db)
wrapped = wrap_embeddings(embeddings, cache_dir="./cache")
# Option 2: specify the full file path
wrapped = wrap_embeddings(embeddings, cache_path="vector_cache.db")

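Features

  • Rate control: embeds in batches with configurable batch_size and delay
  • Automatic slowdown: when a single batch takes longer than the threshold, batch_size is halved and delay is increased
  • Vector caching: SQLite + MD5, so the same text is not embedded twice
  • Hardware suggestions: recommends parameters based on available memory and CPU
  • Progress bar: tqdm shows progress in real time
  • Streaming preview: prints statistics after each batch completes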
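The rate control and automatic slowdown described above can be sketched in plain Python. This is a minimal illustration, not the package's implementation: the function name embed_in_batches, the injectable sleep and clock parameters, and the choice to double the delay are all assumptions (the feature list only says that delay is increased).

```python
import time
from typing import Callable, List, Sequence


def embed_in_batches(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    batch_size: int = 16,
    delay: float = 0.5,
    slow_threshold: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Embed texts batch by batch, slowing down after any slow batch."""
    vectors: List[List[float]] = []
    degrade_count = 0
    position = 0
    while position < len(texts):
        batch = list(texts[position : position + batch_size])
        start = clock()
        vectors.extend(embed_fn(batch))
        elapsed = clock() - start
        position += len(batch)
        if elapsed > slow_threshold:
            # Assumed policy: halve the batch size (never below 1)
            # and double the delay between batches.
            batch_size = max(1, batch_size // 2)
            delay *= 2
            degrade_count += 1
        if position < len(texts):
            # Wait between batches, but not after the last one.
            sleep(delay)
    return {
        "vectors": vectors,
        "batch_size": batch_size,
        "delay": delay,
        "degrade_count": degrade_count,
    }
```

Passing sleep and clock as parameters keeps the sketch testable without real waiting; in normal use the defaults apply.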

API

RateControlledEmbedder

RateControlledEmbedder(
    embeddings,
    batch_size=16,
    delay=0.5,
    slow_threshold=2.0,
    cache=None,
    cache_path=None,
    cache_dir=None,
)
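  • embeddings: a LangChain Embeddings instance
  • batch_size: number of chunks per batch
  • delay: seconds to wait between batches
  • slow_threshold: a batch that takes longer than this value triggers automatic slowdown
  • cache: an existing VectorCache instance (highest priority)
  • cache_path: full path to the SQLite file
  • cache_dir: cache directory; the file name is generated automatically as vector_cache.db

Methods:

  • build_vectorstore(chunks, save_path, progress_callback) → FAISS vector store
  • get_stats() → dict (statistics such as total_chunks, degrade_count, avg_batch_time)
  • set_rate(batch_size, delay): adjusts the rate at runtime
  • get_rate_suggestion() → dict (static method; returns hardware-based suggestions)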

wrap_embeddings

wrap_embeddings(embeddings, cache_path=None, cache_dir=None, cache_class=None)
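  • embeddings: any LangChain Embeddings instance
  • cache_path: full path to the SQLite file (takes priority over cache_dir)
  • cache_dir: cache directory; the file name is generated automatically as vector_cache.db
  • cache_class: a custom cache class

Wraps a LangChain Embeddings instance and transparently enables the SQLite vector cache; the external interface is unchanged.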
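To illustrate what "the external interface is unchanged" can mean in practice, here is a minimal sketch of a caching wrapper. It is not the code behind wrap_embeddings: the class name CachedEmbeddings and the in-memory dict store are assumptions made to keep the example short. Only texts missing from the cache are sent to the wrapped object.

```python
import hashlib
from typing import Dict, List, Optional


class CachedEmbeddings:
    """Hypothetical wrapper exposing embed_documents/embed_query with a cache."""

    def __init__(self, inner, store: Optional[Dict[str, List[float]]] = None) -> None:
        self.inner = inner
        self.store: Dict[str, List[float]] = {} if store is None else store

    @staticmethod
    def _key(text: str) -> str:
        # MD5 of the UTF-8 text serves as the cache key.
        return hashlib.md5(text.encode("utf-8")).hexdigest()

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        keys = [self._key(t) for t in texts]
        # Collect cache misses once each, preserving first-seen order.
        missing: Dict[str, str] = {}
        for key, text in zip(keys, texts):
            if key not in self.store and key not in missing:
                missing[key] = text
        if missing:
            new_vectors = self.inner.embed_documents(list(missing.values()))
            for key, vector in zip(missing, new_vectors):
                self.store[key] = vector
        return [self.store[key] for key in keys]

    def embed_query(self, text: str) -> List[float]:
        # Queries are delegated uncached in this sketch.
        return self.inner.embed_query(text)
```

Because the wrapper has the same two methods as the object it wraps, it can be passed wherever the original embeddings object was expected.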

VectorCache

VectorCache(db_path="vector_cache.db", cache_dir=None)
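  • db_path: full path to the SQLite file (takes priority over cache_dir)
  • cache_dir: cache directory; the file name is generated automatically as vector_cache.db

Methods:

  • get(text) → list[float] | None: looks up a single cached vector
  • put(text, vector, metadata): writes a single entry
  • get_batch(texts) → (results, miss_indices): batch lookup
  • put_batch(texts, vectors, metadata): batch write
  • stats() → dict: cache statistics
  • clear(): empties the cache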
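The SQLite + MD5 approach behind these methods can be sketched with the standard library alone. This is an illustration under assumptions, not VectorCache itself: the class name TinyVectorCache, the table layout, JSON serialization of vectors, and the shape of the get_batch result (None for each miss) are all hypothetical, and metadata is omitted.

```python
import hashlib
import json
import sqlite3
from typing import List, Optional, Sequence, Tuple


class TinyVectorCache:
    """Hypothetical cache: MD5 of the text is the key, the vector is stored as JSON."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS vectors "
            "(text_md5 TEXT PRIMARY KEY, vector TEXT NOT NULL)"
        )

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.md5(text.encode("utf-8")).hexdigest()

    def get(self, text: str) -> Optional[List[float]]:
        row = self.conn.execute(
            "SELECT vector FROM vectors WHERE text_md5 = ?", (self._key(text),)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def put(self, text: str, vector: Sequence[float]) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO vectors (text_md5, vector) VALUES (?, ?)",
                (self._key(text), json.dumps(list(vector))),
            )

    def get_batch(
        self, texts: Sequence[str]
    ) -> Tuple[List[Optional[List[float]]], List[int]]:
        results = [self.get(t) for t in texts]
        miss_indices = [i for i, r in enumerate(results) if r is None]
        return results, miss_indices

    def stats(self) -> dict:
        (count,) = self.conn.execute("SELECT COUNT(*) FROM vectors").fetchone()
        return {"entries": count}

    def clear(self) -> None:
        with self.conn:
            self.conn.execute("DELETE FROM vectors")
```

Returning miss_indices alongside the results lets a caller embed only the missing texts and write them back afterwards.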

get_hardware_suggestion

from ratelimited_embedder import get_hardware_suggestion
info = get_hardware_suggestion()
# {'batch_size': 32, 'delay': 0.3, 'mem_gb': 16.0, 'mem_percent': 45, 'cpu_count': 8}

Recommends batch_size and delay values based on the machine's memory and number of CPU cores.
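A rule of this kind might look like the sketch below. The function name suggest_rate and every threshold are assumptions chosen for illustration; only the top tier is modeled on the sample output shown above (16 GB and 8 cores giving batch_size 32 and delay 0.3), and the package's actual rules may differ. Memory is passed in as an argument because the standard library has no portable way to read it.

```python
import os
from typing import Optional


def suggest_rate(mem_gb: float, cpu_count: Optional[int] = None) -> dict:
    """Map memory and CPU cores to batch parameters (hypothetical thresholds)."""
    cores = cpu_count or os.cpu_count() or 1
    if mem_gb >= 16 and cores >= 8:
        batch_size, delay = 32, 0.3
    elif mem_gb >= 8 and cores >= 4:
        batch_size, delay = 16, 0.5
    else:
        # Conservative fallback for small machines.
        batch_size, delay = 8, 1.0
    return {
        "batch_size": batch_size,
        "delay": delay,
        "mem_gb": mem_gb,
        "cpu_count": cores,
    }
```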

Type definitions

from ratelimited_embedder import EmbeddingsProtocol, ProgressCallback
  • EmbeddingsProtocol: protocol describing the LangChain Embeddings interface, for use in type annotations
  • ProgressCallback: progress callback type, Callable[[str], None]
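The two types can be approximated with the standard typing module, as in the sketch below. The names EmbeddingsLike and report_batches and the message format are hypothetical; the package's own EmbeddingsProtocol may declare different members.

```python
from typing import Callable, List, Protocol, runtime_checkable

# Same shape as the ProgressCallback type described above.
ProgressCallback = Callable[[str], None]


@runtime_checkable
class EmbeddingsLike(Protocol):
    """Structural type: any object with these two methods qualifies."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...

    def embed_query(self, text: str) -> List[float]: ...


def report_batches(total: int, batch_size: int, on_progress: ProgressCallback) -> None:
    """Call on_progress once per batch with a human-readable message."""
    done = 0
    while done < total:
        done = min(total, done + batch_size)
        on_progress(f"{done}/{total} chunks embedded")
```

Any callable that accepts a string works as the callback, for example print or the append method of a list.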

License

Copyright (c) 2026 oi-star

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

A copy of the license is also available in the LICENSE file.
