A tokenizer focused on the Spanish language.
IAR Tokenizer
The IAR (Iván Arias Rodríguez) Tokenizer is a tokenizer developed mainly for Spanish. It divides a text into paragraphs, each paragraph into sentences, and each sentence into a list of tokens.
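The segmentation described above (text → paragraphs → sentences → tokens) can be sketched with Python's standard library. This is an illustrative approximation of the idea only, not the IAR Tokenizer's actual API; the `segment` function and its splitting rules are assumptions for the sake of the example:

```python
import re

def segment(text):
    """Illustrative only: split text into paragraphs, each paragraph
    into sentences, and each sentence into a list of tokens."""
    # Paragraphs: separated by one or more blank lines
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    result = []
    for p in paragraphs:
        # Naive sentence split: whitespace following ., ! or ?
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", p) if s]
        # Tokens: runs of word characters (accented letters included)
        # or single punctuation marks such as ¿ and ¡
        result.append([re.findall(r"\w+|[^\w\s]", s) for s in sentences])
    return result

text = "Hola, ¿qué tal? Esto es una prueba.\n\nSegundo párrafo."
print(segment(text))
```

A real Spanish tokenizer has to handle many cases this sketch ignores, such as abbreviations ("Sr.", "etc.") that end with a period without ending a sentence.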
More information to be added in the future...
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
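Rather than downloading a distribution file manually, the package can be installed with pip; this assumes the project is published on PyPI under the name matching the distribution files below:

```shell
# Install the tokenizer from PyPI (name assumed from the file names below)
pip install iar_tokenizer
```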
Source Distribution
iar_tokenizer-1.0.5.tar.gz (11.2 kB)
Built Distribution
iar_tokenizer-1.0.5-py3-none-any.whl
Hashes for iar_tokenizer-1.0.5-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 9448cf68ceaaa477f4d9d2d4548b14c334aafb3b1fd472e055f6599a8c02a5ce
MD5 | c882cb85c554d04642f75e8b81bde1a3
BLAKE2b-256 | de19414047d25dff6e155ca2fef9fb6bffce9c2de6899cc1136e03c6084e277b