Natural Language Processing in Rust with Python bindings

Project description

vtext

This is a Python wrapper for the Rust vtext crate.

This package aims to provide a high performance toolkit for ingesting textual data for machine learning applications.

The API is currently unstable.

Features

  • Tokenization: Regexp tokenizer, Unicode segmentation + language specific rules
  • Stemming: Snowball (when called from Python, 15-20x faster than NLTK)
  • Analyzers (planned): word and character n-grams, skip grams
  • Token counting: converting token counts to sparse matrices for use in machine learning libraries. Similar to CountVectorizer and HashingVectorizer in scikit-learn.
  • Feature weighting (planned): feature weighting based on document frequency (TF-IDF), feature normalization.
  • String metrics: Levenshtein edit distance; Sørensen-Dice, Jaro, and Jaro-Winkler string similarities
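
To make the tokenization and token counting features above more concrete, here is a minimal pure-Python sketch of the general technique: a regexp tokenizer followed by conversion of per-document token counts into the three arrays of a CSR sparse matrix. It does not use vtext and does not reflect vtext's API, which is unstable; the names `tokenize` and `count_tokens` and the token pattern are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Assumed pattern for illustration: words of two or more word characters.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def tokenize(text):
    """Lowercase the text and split it into tokens with a regular expression."""
    return TOKEN_PATTERN.findall(text.lower())


def count_tokens(documents):
    """Count tokens per document and return CSR-style arrays.

    Returns (vocabulary, data, indices, indptr), where row i of the
    matrix is described by data[indptr[i]:indptr[i + 1]] (the counts)
    and indices[indptr[i]:indptr[i + 1]] (the column of each count).
    """
    vocabulary = {}  # token -> column index, assigned on first sight
    data, indices, indptr = [], [], [0]
    for doc in documents:
        counts = Counter(tokenize(doc))
        for token, count in sorted(counts.items()):
            column = vocabulary.setdefault(token, len(vocabulary))
            indices.append(column)
            data.append(count)
        indptr.append(len(indices))
    return vocabulary, data, indices, indptr


docs = ["The cat sat on the mat", "The dog sat"]
vocabulary, data, indices, indptr = count_tokens(docs)
```

The three arrays can be passed to a sparse matrix constructor such as `scipy.sparse.csr_matrix((data, indices, indptr))` if SciPy is available. A hashing vectorizer differs only in how the column is chosen: it hashes the token instead of looking it up in a vocabulary.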

Installation

vtext requires Python 3.5+ and can be installed with:

pip install --pre vtext

Documentation

Project documentation: vtext.io/doc/latest/index.html

License

vtext is released under the Apache License, Version 2.0.


Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

vtext-0.1.0a2-cp37-cp37m-win_amd64.whl (601.0 kB)

Uploaded CPython 3.7m Windows x86-64

vtext-0.1.0a2-cp37-cp37m-manylinux1_x86_64.whl (3.2 MB)

Uploaded CPython 3.7m

vtext-0.1.0a2-cp37-cp37m-macosx_10_7_x86_64.whl (670.3 kB)

Uploaded CPython 3.7m macOS 10.7+ x86-64

vtext-0.1.0a2-cp36-cp36m-win_amd64.whl (601.0 kB)

Uploaded CPython 3.6m Windows x86-64

vtext-0.1.0a2-cp36-cp36m-manylinux1_x86_64.whl (2.1 MB)

Uploaded CPython 3.6m

vtext-0.1.0a2-cp36-cp36m-macosx_10_7_x86_64.whl (670.3 kB)

Uploaded CPython 3.6m macOS 10.7+ x86-64

vtext-0.1.0a2-cp35-cp35m-win_amd64.whl (600.9 kB)

Uploaded CPython 3.5m Windows x86-64

vtext-0.1.0a2-cp35-cp35m-manylinux1_x86_64.whl (1.1 MB)

Uploaded CPython 3.5m

vtext-0.1.0a2-cp35-cp35m-macosx_10_6_x86_64.whl (670.3 kB)

Uploaded CPython 3.5m macOS 10.6+ x86-64
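
The wheel filenames above encode the interpreter, ABI, and platform each file targets. As a rough sketch, the following splits a filename into those parts, assuming the standard wheel naming layout `name-version[-build]-python-abi-platform.whl`; the `WheelTags` type and `parse_wheel_filename` helper are hypothetical names introduced here, not part of vtext or pip.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    """Components of a wheel filename (hypothetical helper type)."""
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError("unexpected wheel filename layout: %r" % filename)
    return WheelTags(*parts)


tags = parse_wheel_filename("vtext-0.1.0a2-cp37-cp37m-win_amd64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

In practice `pip` performs this matching itself; the sketch is only meant to show how to read the filenames when choosing a file by hand.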

File details

Details for the file vtext-0.1.0a2-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: vtext-0.1.0a2-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 601.0 kB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for vtext-0.1.0a2-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 d0a129445fef9fe59fa4d9cb3dc3ce3ae65145ffd3ba339a32d76a6829881cb4
MD5 1029591d8108965d6c888aecf6cc4798
BLAKE2b-256 2a6f2735fbcc09ea2359b3c45b1c9c1ff5dc5bdabaf0d6db1b84757e9757a476

See more details on using hashes here.

File details

Details for the file vtext-0.1.0a2-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: vtext-0.1.0a2-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 3.2 MB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for vtext-0.1.0a2-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 0d7efe2861ef75c744c4ce7fb12b3c5399f46022d055bb00f92a4006848e638e
MD5 1be61ff5571f1cd8cb2075364572a82b
BLAKE2b-256 44478c3d691b9d4cd05b8a946485149756039722f63373327d34ab3dca739804

File details

Details for the file vtext-0.1.0a2-cp37-cp37m-macosx_10_7_x86_64.whl.

File metadata

  • Download URL: vtext-0.1.0a2-cp37-cp37m-macosx_10_7_x86_64.whl
  • Upload date:
  • Size: 670.3 kB
  • Tags: CPython 3.7m, macOS 10.7+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for vtext-0.1.0a2-cp37-cp37m-macosx_10_7_x86_64.whl
Algorithm Hash digest
SHA256 5d30579f54e1af9fd3bc651550546554a843685d5d6ae236d39280b93ebf1de7
MD5 083cbfd479c9096734a62093ac0913e1
BLAKE2b-256 460a38076aeb57d42a486a9de480ec606f7cab0b9dec15672548787925406677

File details

Details for the file vtext-0.1.0a2-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: vtext-0.1.0a2-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 601.0 kB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for vtext-0.1.0a2-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 0b932842af38ad6112277d6ae0e998835c25e7e03dbf78905728d8ad234c306a
MD5 74c482ecef598f20803f2db51647adee
BLAKE2b-256 1e4988752f8f80eedb1cad10daf83df513a5c0da44bf875e9b3767200a93ac71

File details

Details for the file vtext-0.1.0a2-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: vtext-0.1.0a2-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 2.1 MB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for vtext-0.1.0a2-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 c5d96fef927417d96b7d93e2bfd96b19eb13fbe9f2a92fd5e5c284ea0ab57b97
MD5 00f26a3e49fb566872971dbf8c7ef148
BLAKE2b-256 0d8ae2212f10c90c8fa4379f5cbe3b49097f2b2949e142bb449088cfdc9e1346

File details

Details for the file vtext-0.1.0a2-cp36-cp36m-macosx_10_7_x86_64.whl.

File metadata

  • Download URL: vtext-0.1.0a2-cp36-cp36m-macosx_10_7_x86_64.whl
  • Upload date:
  • Size: 670.3 kB
  • Tags: CPython 3.6m, macOS 10.7+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for vtext-0.1.0a2-cp36-cp36m-macosx_10_7_x86_64.whl
Algorithm Hash digest
SHA256 4369d1619dbd7325023fa544eb74e9690aeead6efbf20254b99aa45f4847424f
MD5 1f54d0a8e2fb48feee4ae5d4ea1b9ee7
BLAKE2b-256 1bbe13efa78c0807050cd7553f8ebc73fbf30a4c7933b01c456650002377781f

File details

Details for the file vtext-0.1.0a2-cp35-cp35m-win_amd64.whl.

File metadata

  • Download URL: vtext-0.1.0a2-cp35-cp35m-win_amd64.whl
  • Upload date:
  • Size: 600.9 kB
  • Tags: CPython 3.5m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for vtext-0.1.0a2-cp35-cp35m-win_amd64.whl
Algorithm Hash digest
SHA256 51c1752691cea938bdb1da2c256829f2e168d3605be52bbc9aec635d9daa5192
MD5 b973e0ef445ab7244a2558a0ff1d53d3
BLAKE2b-256 1703a485e81b5a9a2e30e435152ae997af43d7e0e879f4f447617bc39206eda7

File details

Details for the file vtext-0.1.0a2-cp35-cp35m-manylinux1_x86_64.whl.

File metadata

  • Download URL: vtext-0.1.0a2-cp35-cp35m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 1.1 MB
  • Tags: CPython 3.5m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for vtext-0.1.0a2-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 faa88c73376e8788c827a21656463bb267e17884473c0265b3aaba440314bcce
MD5 b4559af51834512f505732d30a432c7f
BLAKE2b-256 80bdf953e12387ac58211b23b2c8ed06fdbdf00d26335f3812cc651ad9f98295

File details

Details for the file vtext-0.1.0a2-cp35-cp35m-macosx_10_6_x86_64.whl.

File metadata

  • Download URL: vtext-0.1.0a2-cp35-cp35m-macosx_10_6_x86_64.whl
  • Upload date:
  • Size: 670.3 kB
  • Tags: CPython 3.5m, macOS 10.6+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for vtext-0.1.0a2-cp35-cp35m-macosx_10_6_x86_64.whl
Algorithm Hash digest
SHA256 5fcc0ce57ef77632c48e63a73d6f1372dfdc75c34932f71755cc73a1edf32525
MD5 efef3fd3820519e663cad75e4c24ed63
BLAKE2b-256 c400033f605e943a4925bb3fb315b78baeb3cf228fc90019f78f14a7cb00f14b
