Project description

naivenlp

A naive toolkit for NLP.

A tokenizer splits text into tokens. It can convert tokens to ids, and convert ids back to tokens.
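The token/id round trip described above can be illustrated with a minimal sketch. This is not naivenlp's actual API: the class name `ToyVocabTokenizer`, its method names, the `[UNK]` token, and the whitespace splitting are all assumptions made for illustration.

```python
class ToyVocabTokenizer:
    """Hypothetical toy tokenizer; NOT a class shipped by naivenlp."""

    def __init__(self, vocab, unk_token="[UNK]"):
        # Assumption: the vocabulary is a list of tokens and each
        # token's id is its position in that list.
        self.token_to_id = {tok: i for i, tok in enumerate(vocab)}
        self.id_to_token = {i: tok for tok, i in self.token_to_id.items()}
        self.unk_token = unk_token
        self.unk_id = self.token_to_id[unk_token]

    def tokenize(self, text):
        # Whitespace splitting stands in for real segmentation.
        return text.split()

    def encode(self, tokens):
        # Tokens missing from the vocabulary map to the unknown id.
        return [self.token_to_id.get(t, self.unk_id) for t in tokens]

    def decode(self, ids):
        # Ids outside the vocabulary map to the unknown token.
        return [self.id_to_token.get(i, self.unk_token) for i in ids]


tokenizer = ToyVocabTokenizer(["[UNK]", "hello", "world"])
tokens = tokenizer.tokenize("hello brave world")
ids = tokenizer.encode(tokens)
round_trip = tokenizer.decode(ids)
```

Because "brave" is not in the toy vocabulary, it is encoded as the unknown id and does not survive the round trip, which is why a vocabulary-based tokenizer is only as good as its vocabulary.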

Here are some vocab-based tokenizers, which means these tokenizers need a vocabulary.

VocabBasedTokenizer, the base class for vocab-based tokenizers.
JiebaTokenizer, a wrapper for the original fxsjy/jieba.
BasicTokenizer and WordpieceTokenizer, from google-research/bert.
LanguageModelTokenizer, a tokenizer for language models such as Transformer and BERT.
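As a rough illustration of what a WordPiece-style tokenizer does with its vocabulary, the sketch below applies greedy longest-match-first splitting to a single word, in the spirit of google-research/bert. It is not naivenlp's code: the function name `wordpiece_tokenize`, its parameters, and the toy vocabulary are assumptions made for illustration.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", max_chars=100):
    """Hypothetical sketch of greedy longest-match-first WordPiece splitting."""
    # Very long words are mapped straight to the unknown token.
    if len(word) > max_chars:
        return [unk_token]

    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        current = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Non-initial pieces carry the "##" continuation prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                current = candidate
                break
            end -= 1
        if current is None:
            # No piece matched at this position: the whole word is unknown.
            return [unk_token]
        pieces.append(current)
        start = end
    return pieces


# Toy vocabulary, assumed for illustration only.
toy_vocab = {"[UNK]", "un", "##aff", "##able"}
pieces = wordpiece_tokenize("unaffable", toy_vocab)
unknown = wordpiece_tokenize("xyz", toy_vocab)
```

Here `pieces` is `["un", "##aff", "##able"]`, while `"xyz"` has no matching piece and falls back to `["[UNK]"]`.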

0.0.9, Jul 22, 2020
0.0.8, Jul 16, 2020
0.0.7, Jul 16, 2020
0.0.6, Jul 16, 2020
0.0.5, Jul 11, 2020
0.0.4, Jul 5, 2020
0.0.3, Jul 1, 2020 (this version)
0.0.2, Jun 21, 2020
0.0.1, Jun 10, 2020

Uploaded Jul 1, 2020 Source

Uploaded Jul 1, 2020 Python 3

Hashes for naivenlp-0.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`acc3cf6c11f1a76a02f331ce33f993f107d90d11d1e415ee563deb18c1e76a3a`
MD5	`8f69f8debe839514a0e98ae267c9c586`
BLAKE2b-256	`fe00f963e85e19299910c4ac1e671477e18d1659254c7a41a9fa8a6c3a845676`