
A customized PyTorch version of a pre-trained model.

Project description

torchKbert

  • Our customized version of BERT for PyTorch

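Description

This is a model library that I built by customizing parts of Meelfy's pytorch_pretrained_BERT library.

The project was created for the convenience of my own experiments, so it will not be updated frequently.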

Features

Usage

  • Installation:

    pip install torchKbert
    
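  • For typical usage examples, see the official examples directory.

  • To use hierarchical decomposition position encoding, which lets BERT process long texts, pass is_hierarchical=True when calling the model. For example: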

    model = BertModel(config)
    encoder_outputs, _ = model(input_ids, token_ids, input_mask, is_hierarchical=True)
    
  • To use the word-granularity Chinese WoBERT, pass a new argument when constructing the BertTokenizer object:

    import jieba
    from torchKbert.tokenization import BertTokenizer
    
    # vocab_path: path to the WoBERT vocabulary file
    tokenizer = BertTokenizer(
        vocab_file=vocab_path,
        pre_tokenizer=lambda s: jieba.cut(s, HMM=False))
    

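    If pre_tokenizer is not passed, it defaults to None. Tokenization is word-based by default; to switch back to character-based tokenization, pass the new argument pre_tokenize=False when calling tokenize:

    tokenizer.tokenize(text, pre_tokenize=False)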
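
As a rough illustration of what word-level pre-tokenization with a character fallback can look like, here is a minimal sketch in plain Python. It assumes the tokenizer follows the usual WoBERT design (keep a pre-tokenized word whole if it is in the vocabulary, otherwise fall back to character-level tokens); this is an assumption about the behavior, not code taken from torchKbert. The names toy_cut, toy_vocab and this tokenize function are hypothetical, and toy_cut is only a stand-in for jieba.cut.

```python
# Hypothetical word list used by the toy segmenter (stand-in for jieba).
WORDS = {"今天", "天气", "不错"}

# Toy vocabulary: contains some whole words plus single characters.
toy_vocab = {"今天", "天气", "今", "天", "气", "不", "错"}


def toy_cut(text, max_len=2):
    """Greedy longest-match segmentation against WORDS (illustration only)."""
    i = 0
    while i < len(text):
        for size in range(max_len, 0, -1):
            chunk = text[i:i + size]
            # Accept a known word, or fall through to a single character.
            if size == 1 or chunk in WORDS:
                yield chunk
                i += len(chunk)
                break


def tokenize(text, vocab, pre_tokenizer=None, pre_tokenize=True, unk="[UNK]"):
    """Word-level tokens where possible, character-level tokens otherwise."""
    if pre_tokenize and pre_tokenizer is not None:
        tokens = []
        for word in pre_tokenizer(text):
            if word in vocab:
                tokens.append(word)  # keep the whole word as one token
            else:
                # Word is not in the vocabulary: split it into characters.
                tokens.extend(tokenize(word, vocab, pre_tokenize=False, unk=unk))
        return tokens
    # Character-level path (a real tokenizer would also apply WordPiece here).
    return [ch if ch in vocab else unk for ch in text if not ch.isspace()]


print(tokenize("今天天气不错", toy_vocab, pre_tokenizer=toy_cut))
# ['今天', '天气', '不', '错']
print(tokenize("今天天气不错", toy_vocab, pre_tokenizer=toy_cut, pre_tokenize=False))
# ['今', '天', '天', '气', '不', '错']
```

Here "不错" is segmented as a word but is missing from the toy vocabulary, so it falls back to the two characters "不" and "错".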
    

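Background

I had long been using pytorch_pretrained_BERT, written by Meelfy, which already makes loading pre-trained models and fine-tuning them very convenient. Later, for my own needs, I wanted to write a version that supports hierarchical decomposition position encoding.

Su Jianlin's bert4keras already implements this feature. However, I am used to PyTorch and have not used Keras for a long time, so I decided to write my own version.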
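
The idea behind hierarchical position decomposition can be sketched in a few lines of plain Python. The sketch follows the scheme used in bert4keras as best understood: a trained table of n position vectors is extended to n*n positions by combining a coarse index i and a fine index j, with base vectors chosen so that the first n positions stay unchanged. It is not taken from torchKbert's source; the function name hierarchical_positions, the toy table, and alpha=0.4 are assumptions for illustration.

```python
def hierarchical_positions(table, length, alpha=0.4):
    """Extend an n-row position table to `length` rows (length <= n * n)."""
    n, dim = len(table), len(table[0])
    if length > n * n:
        raise ValueError("hierarchical decomposition covers at most n*n positions")

    # Base vectors u_k = (p_k - alpha * p_0) / (1 - alpha), chosen so that
    # alpha * u_0 + (1 - alpha) * u_j reproduces the trained vector p_j.
    base = [
        [(row[d] - alpha * table[0][d]) / (1 - alpha) for d in range(dim)]
        for row in table
    ]

    extended = []
    for pos in range(length):
        i, j = divmod(pos, n)  # coarse index i, fine index j
        extended.append(
            [alpha * base[i][d] + (1 - alpha) * base[j][d] for d in range(dim)]
        )
    return extended


# Toy table with n = 3 positions and 2 dimensions (made-up numbers).
table = [[0.0, 1.0], [0.5, -0.5], [1.0, 0.25]]
extended = hierarchical_positions(table, length=9)
print(len(extended))  # 9
```

With a real BERT checkpoint the table would have 512 rows, so the same construction would cover up to 512 * 512 positions; in a PyTorch model the two loops would be replaced by tensor indexing.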

Updates

  • 2021.03.07: Added hierarchical decomposition position encoding.
  • 2021.05.27: Added the word-granularity Chinese WoBERT.

Acknowledgements
