Simple text analysis from the command line

Homepage: http://learntextvis.github.io/textkit/

About

textkit is a series of small, unix-style tools that provide a suite of capabilities for dealing with text as data.

Think of textkit as basic natural language processing, available from the command line.

textkit Features

Here are some of the cool things you can do with textkit.

Convert a document to a set of word tokens and remove all punctuation from the tokens:

textkit text2words input.txt | textkit filterpunc

Count the most frequently used words in a text:

textkit text2words alice.txt | textkit count --limit 20

Do the same, but with punctuation removed:

textkit text2words alice.txt | textkit filterpunc | textkit count --limit 20
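For readers who want a feel for what this pipeline computes, here is a rough approximation built only from standard Unix tools. It is a sketch, not textkit's implementation: the function name `top_words` is made up for illustration, tokens are split on whitespace only, and textkit's own tokenizer and output format may well differ.

```shell
# Rough approximation of: text2words | filterpunc | count --limit N
# Assumption: whitespace-only tokenization; this is NOT how textkit is implemented.
top_words() {
    # $1: input file, $2: number of words to show (defaults to 20)
    tr -s '[:space:]' '\n' < "$1" |   # put one whitespace-separated token on each line
        tr -d '[:punct:]' |           # strip punctuation characters from every token
        grep -v '^$' |                # drop tokens that consisted only of punctuation
        sort | uniq -c |              # count identical tokens
        sort -rn |                    # most frequent tokens first
        head -n "${2:-20}"            # keep the top N
}
```

Calling `top_words alice.txt 20` prints up to 20 lines, each holding a count followed by a word. Each stage does one small job and passes plain text to the next, which is the same design the textkit commands follow.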

Installation

$ pip install -U textkit
$ textkit --help

Dev install

To test locally, clone the repo:

git clone git@github.com:learntextvis/textkit.git

Create a local virtual environment or conda environment.

Here is how I created my local conda environment for installing and testing textkit:

conda create --name textkit nltk

source activate textkit

Then I went into the textkit directory to install its requirements:

cd textkit

pip install -r requirements.txt

Finally, I installed the local version of textkit using the --editable flag:

pip install --editable .

Examples

See more examples in the Quickstart guide.

Requirements

  • Python >= 2.6 or >= 3.3
