
an elegant bert4vector


bert4vector

Vector computation, storage, retrieval, and similarity calculation


Documentation | Bert4torch | Examples | Source code

1. Installation

  • Install the stable release
pip install bert4vector
  • Install the latest development version
pip install git+https://github.com/Tongjilibo/bert4vector
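
After installing, it can be useful to confirm which version ended up in the environment. The sketch below uses only the Python standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical and not part of bert4vector.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the installed bert4vector version, or None if it is not installed.
print(installed_version("bert4vector"))
```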

2. Quick start

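from bert4vector import BertVector

# Load a sentence-embedding checkpoint from a local path (adjust to your environment)
model = BertVector('/data/pretrain_ckpt/simbert/sushen@simbert_chinese_tiny')
# Encode the corpus sentences and store their vectors for retrieval
model.add_corpus(['你好', '我选你', '天气不错', '人很好看'], gpu_index=True)
# Retrieve the two corpus entries most similar to the query
print(model.search('你好', topk=2))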
# {'你好': [{'corpus_id': 0, 'score': 0.9999, 'text': '你好'},
#           {'corpus_id': 3, 'score': 0.5694, 'text': '人很好看'}]} 
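
To make the shape of the search result above more concrete, here is a minimal pure-Python sketch of top-k retrieval by cosine similarity. It is not bert4vector's implementation: the functions `cosine` and `search_topk` are hypothetical, and the 3-dimensional vectors are made-up stand-ins for real sentence embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search_topk(query_vec, corpus_vecs, corpus_texts, topk=2):
    """Score every corpus vector against the query and keep the best topk."""
    scored = [
        {"corpus_id": i, "score": cosine(query_vec, vec), "text": text}
        for i, (vec, text) in enumerate(zip(corpus_vecs, corpus_texts))
    ]
    scored.sort(key=lambda hit: hit["score"], reverse=True)
    return scored[:topk]


# Toy vectors (assumed for illustration only, not real model output).
corpus_texts = ["你好", "我选你", "天气不错", "人很好看"]
corpus_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.6, 0.8, 0.0]]

hits = search_topk([1.0, 0.0, 0.0], corpus_vecs, corpus_texts, topk=2)
print(hits)
```

A brute-force scan like this is linear in corpus size; the version history notes that the package also supports a faiss mode, which is the usual choice for larger corpora.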


3. Supported sentence-embedding weights

| Model category | Model name | Weight source | Weight links | Notes (if any) |
| --- | --- | --- | --- | --- |
| simbert | simbert | Zhuiyi Technology (追一科技) | Tongjilibo/simbert-chinese-base, Tongjilibo/simbert-chinese-small, Tongjilibo/simbert-chinese-tiny | |
| | simbert_v2/roformer-sim | Zhuiyi Technology (追一科技) | junnyu/roformer_chinese_sim_char_base, junnyu/roformer_chinese_sim_char_ft_base, junnyu/roformer_chinese_sim_char_small, junnyu/roformer_chinese_sim_char_ft_small | roformer_chinese_sim_char_base, roformer_chinese_sim_char_ft_base, roformer_chinese_sim_char_small, roformer_chinese_sim_char_ft_small |
| embedding | text2vec-base-chinese | shibing624 | shibing624/text2vec-base-chinese | text2vec-base-chinese |
| | m3e | moka-ai | moka-ai/m3e-base | m3e-base |
| | bge | BAAI | BAAI/bge-large-en-v1.5, BAAI/bge-large-zh-v1.5, BAAI/bge-base-en-v1.5, BAAI/bge-base-zh-v1.5, BAAI/bge-small-en-v1.5, BAAI/bge-small-zh-v1.5 | bge-large-en-v1.5, bge-large-zh-v1.5, bge-base-en-v1.5, bge-base-zh-v1.5, bge-small-en-v1.5, bge-small-zh-v1.5 |
| | gte | thenlper | thenlper/gte-large-zh, thenlper/gte-base-zh | gte-base-zh, gte-large-zh |

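4. Version history

| Release date | bert4vector version | Release notes |
| --- | --- | --- |
| 20240628 | 0.0.3 | Added several literal (lexical) recall methods; added API deployment |
| 20240131 | 0.0.2.post2 | Removed the version dependency on bert4torch |
| 20231228 | 0.0.2 | Initial release; supports in-memory and faiss modes |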

5. Changelog

  • 20240628: Added several literal (lexical) recall methods; added API deployment
  • 20231228: Initial release; supports in-memory and faiss modes

