A customized PyTorch version of a pre-trained BERT model.

Project description

torchKbert

  • Our customized version of BERT for PyTorch

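Notes

This is a model library that I built by partially customizing Meelfy's pytorch_pretrained_BERT library.

The project was started to make my own experiments more convenient, so it will not be updated frequently.

Features

Usage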

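  • Install:

    pip install torchKbert

  • For typical usage examples, see the official examples directory.

  • To use hierarchical position decomposition so that BERT can handle long texts, pass is_hierarchical=True when calling the model. For example: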

    model = BertModel(config)
    encoder_outputs, _ = model(input_ids, token_ids, input_mask, is_hierarchical=True)
    
  • To use the word-level Chinese WoBERT, pass a new argument when constructing the BertTokenizer object:

    import jieba
    from torchKbert.tokenization import BertTokenizer

    # vocab_path: path to the WoBERT vocabulary file
    tokenizer = BertTokenizer(
        vocab_file=vocab_path,
        pre_tokenizer=lambda s: jieba.cut(s, HMM=False))
    

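    If it is not passed, pre_tokenizer defaults to None. When a pre_tokenizer is set, tokenization is word-level by default; to fall back to character-level tokenization, pass pre_tokenize=False when calling tokenize:

    tokenizer.tokenize(text, pre_tokenize=False)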
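
The word-level behavior described above can be illustrated with a minimal sketch. It assumes the fallback rule used by bert4keras's tokenizer (a pre-tokenized word is kept whole if it is in the vocabulary, otherwise it is split further); torchKbert's internals are not shown in this README, so treat that as an assumption. The names tokenize_with_pre_tokenizer, toy_vocab, and toy_pre_tokenizer are hypothetical, the toy segmenter stands in for jieba.cut, and the per-character split stands in for the WordPiece step.

```python
def tokenize_with_pre_tokenizer(text, vocab, pre_tokenizer=None, pre_tokenize=True):
    """Hypothetical helper: word-level tokens with a character-level fallback."""
    if pre_tokenize and pre_tokenizer is not None:
        tokens = []
        for word in pre_tokenizer(text):
            if word in vocab:
                # The whole word is a vocabulary entry: keep it as one token.
                tokens.append(word)
            else:
                # Unknown word: fall back to the character-level path.
                tokens.extend(
                    tokenize_with_pre_tokenizer(word, vocab, pre_tokenize=False)
                )
        return tokens
    # Stand-in for WordPiece: one token per character, [UNK] if unseen.
    return [ch if ch in vocab else "[UNK]" for ch in text]


def toy_pre_tokenizer(text, words=("今天", "天气", "很好"), max_len=2):
    """Toy forward-maximum-matching segmenter (stand-in for jieba.cut)."""
    i = 0
    while i < len(text):
        for size in range(max_len, 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in words:
                yield piece
                i += len(piece)
                break


toy_vocab = {"今天", "天气", "今", "天", "气", "很", "好"}
text = "今天天气很好"

word_level = tokenize_with_pre_tokenizer(text, toy_vocab, toy_pre_tokenizer)
char_level = tokenize_with_pre_tokenizer(
    text, toy_vocab, toy_pre_tokenizer, pre_tokenize=False
)
print(word_level)
print(char_level)
```

Here "很好" is produced by the segmenter but is missing from the toy vocabulary, so it is split into characters, while "今天" and "天气" survive as single tokens.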
    

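Background

I had long been using Meelfy's pytorch_pretrained_BERT, which already makes loading pre-trained models and fine-tuning them very convenient. Later, for my own needs, I wanted to write a version that supports hierarchical position decomposition.

Su Jianlin's bert4keras already implements this feature. However, I am used to PyTorch and have not used Keras for a long time, so I decided to write my own version.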
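
For readers unfamiliar with the technique, the following is a minimal sketch of hierarchical position decomposition as it is described for bert4keras. A trained table of n position vectors p_0 ... p_{n-1} is extended to n*n positions by writing position k = i*n + j and combining two base vectors: q_k = alpha*u_i + (1 - alpha)*u_j, where u_i = (p_i - alpha*p_0) / (1 - alpha). The function name, the plain-list representation, and alpha = 0.4 are assumptions made for illustration; this is not torchKbert's actual code.

```python
def hierarchical_position_embeddings(table, seq_len, alpha=0.4):
    """Hypothetical helper: extend n trained position vectors to seq_len <= n*n."""
    n = len(table)
    if seq_len > n * n:
        raise ValueError("seq_len must not exceed n * n")
    first = table[0]
    # Base vectors u_i = (p_i - alpha * p_0) / (1 - alpha).
    base = [
        [(x - alpha * f) / (1 - alpha) for x, f in zip(row, first)]
        for row in table
    ]
    extended = []
    for pos in range(seq_len):
        i, j = divmod(pos, n)  # pos = i * n + j
        # q_pos = alpha * u_i + (1 - alpha) * u_j
        extended.append(
            [alpha * a + (1 - alpha) * b for a, b in zip(base[i], base[j])]
        )
    return extended


# Toy "trained" table: n = 3 positions, hidden size 2.
trained = [[0.1, 0.2], [0.3, -0.4], [0.5, 0.6]]
extended = hierarchical_position_embeddings(trained, seq_len=9)
print(len(extended))
```

With this choice of u_i, the first n extended positions reproduce the trained table (up to floating-point rounding), so inputs that fit the original length behave as before. The sketch uses plain Python lists to keep the arithmetic visible; with tensors, the same steps amount to indexing the base table with pos // n and pos % n.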

Updates

  • 2021.03.07: Added hierarchical position decomposition.
  • 2021.05.27: Added the word-level Chinese WoBERT.
  • 2022.03.27: Refactored the BertPretrainedModel implementation, following pytorch_transformers.

References
