Vocabulary management for NLP in Python.

Project description

Vocab is a Python package that provides vocabulary objects for natural language processing.

Installation

pip install vocab
pip install git+https://github.com/vzhong/vocab.git

Usage

>>> from vocab import Vocab, UnkVocab
>>> v = Vocab()
>>> v.word2index('hello', train=True)
0
>>> v.word2index(['hello', 'world'], train=True)
[0, 1]
>>> v.index2word([1, 0])
['world', 'hello']
>>> v.index2word(1)
'world'
>>> small = v.prune_by_count(2)
>>> small.to_dict()
{'counts': {'hello': 2}, 'index2word': ['hello']}
>>> u = UnkVocab()
>>> u.word2index(['hello', 'world'], train=True)
[1, 2]
>>> u.word2index('hello friend !'.split())
[1, 0, 0]
>>> u.index2word(0)
'<unk>'
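
The session above suggests a simple model: words are assigned consecutive indices the first time they are seen with train=True, every training lookup increments a count, and UnkVocab reserves index 0 for '<unk>'. The sketch below reproduces that behaviour in plain Python. It is an illustration only, not the package's actual implementation: the class name SketchVocab, its attribute names, and the choice to raise KeyError for an unseen word when there is no unknown token are all assumptions.

```python
from collections import Counter


class SketchVocab:
    """Hypothetical stand-in that mimics the behaviour shown in the usage session."""

    def __init__(self, unk=None):
        self.words = []          # position in this list is the word's index
        self.indices = {}        # word -> index
        self.counts = Counter()  # how often each word was seen with train=True
        self.unk = unk
        if unk is not None:
            # Assumption: the unknown token always occupies index 0.
            self._add(unk)

    def _add(self, word):
        self.indices[word] = len(self.words)
        self.words.append(word)

    def word2index(self, word, train=False):
        # Lists are mapped element by element, as in the usage session.
        if isinstance(word, (list, tuple)):
            return [self.word2index(w, train=train) for w in word]
        if train:
            self.counts[word] += 1
            if word not in self.indices:
                self._add(word)
        if word in self.indices:
            return self.indices[word]
        if self.unk is None:
            # Assumption: without an unknown token, an unseen word is an error.
            raise KeyError(word)
        return self.indices[self.unk]

    def index2word(self, index):
        if isinstance(index, (list, tuple)):
            return [self.index2word(i) for i in index]
        return self.words[index]

    def prune_by_count(self, cutoff):
        # Keep words seen at least `cutoff` times, preserving their order.
        pruned = SketchVocab(unk=self.unk)
        for word in self.words:
            if word != self.unk and self.counts[word] >= cutoff:
                pruned._add(word)
                pruned.counts[word] = self.counts[word]
        return pruned
```

Creating SketchVocab() and SketchVocab(unk='<unk>') and replaying the calls from the session yields the same indices as shown there, which makes the role of the train flag easy to see: without it, lookups never grow the vocabulary.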


Download files

Source Distribution

vocab-0.0.5.tar.gz (6.9 kB)

Uploaded Source

Built Distribution

vocab-0.0.5-py3-none-any.whl (7.6 kB)

Uploaded Python 3

File details

Details for the file vocab-0.0.5.tar.gz.

File metadata

  • Download URL: vocab-0.0.5.tar.gz
  • Upload date:
  • Size: 6.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.4.0.post20200518 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.7

File hashes

Hashes for vocab-0.0.5.tar.gz
Algorithm    Hash digest
SHA256       cab92e20c13f964c9c1d319267fbdbe523754e80b17ad7b93ce46bce9089e06b
MD5          0d80a787b92e125d45a6e2336adc9286
BLAKE2b-256  a5abd0a7c3dffef6146a3d09796ee195153241f61c054909b0f4169faa670913
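
A downloaded file can be checked against the SHA256 digest listed above using only the standard library. The following is a minimal sketch; the helper names sha256_of_file and matches_published_digest are hypothetical, and the file path is assumed to point at a copy you have already downloaded.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_published_digest("vocab-0.0.5.tar.gz", "cab92e20c13f964c9c1d319267fbdbe523754e80b17ad7b93ce46bce9089e06b") should return True for an unmodified copy of the source distribution.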

File details

Details for the file vocab-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: vocab-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 7.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.4.0.post20200518 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.7

File hashes

Hashes for vocab-0.0.5-py3-none-any.whl
Algorithm    Hash digest
SHA256       0bba440212204f0427576434264605c118cfea4274342ef1c03a2620014bbe33
MD5          1e659dd112ec325a6bca3a9ba3e52b64
BLAKE2b-256  a263c3f14ca498f1f811eaf8f2c6817e1bdc9724cd6294183066cd43fa124aaf
