Simple text analysis from the command line

Homepage: http://learntextvis.github.io/textkit/

About

textkit is a set of small, Unix-style tools for working with text as data.

Think of textkit as basic natural language processing capabilities - from the command line.

textkit Features

Here are some of the cool things you can do with textkit.

Convert a document to a set of word tokens and remove all punctuation from the tokens:

textkit text2words input.txt | textkit filterpunc
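
To make the pipeline above concrete, here is a minimal Python sketch of the same two steps. It is not textkit's implementation: the function names `text_to_words` and `filter_punc` are hypothetical, the regex tokenizer is an assumption standing in for whatever tokenizer textkit actually uses, and "removing punctuation" is interpreted here as dropping tokens that consist only of punctuation.

```python
import re
import string

def text_to_words(text):
    """Split text into word tokens and single-character punctuation tokens."""
    # \w+ matches a run of word characters; [^\w\s] matches one punctuation character.
    return re.findall(r"\w+|[^\w\s]", text)

def filter_punc(tokens):
    """Drop tokens made up entirely of ASCII punctuation (an assumed interpretation)."""
    punct = set(string.punctuation)
    return [t for t in tokens if not all(ch in punct for ch in t)]

tokens = text_to_words("Hello, world! It works.")
words = filter_punc(tokens)
print(words)  # ['Hello', 'world', 'It', 'works']
```

Like the command-line version, each step takes the output of the previous one, so the two functions can be tested and swapped independently.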

Count the 20 most frequently used words in a text:

textkit text2words alice.txt | textkit count --limit 20

Do the same, but with punctuation removed:

textkit text2words alice.txt | textkit filterpunc | textkit count --limit 20
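
For comparison, the counting step can be sketched in a few lines of Python with `collections.Counter`. This is an illustration under stated assumptions, not how textkit works internally: `top_words` is a hypothetical helper, the `\w+` regex is a stand-in tokenizer that skips punctuation, and counting is case-sensitive.

```python
import re
from collections import Counter

def top_words(text, limit=20):
    """Return the `limit` most frequent word tokens as (word, count) pairs."""
    # \w+ keeps only runs of word characters, so punctuation never becomes a token.
    words = re.findall(r"\w+", text)
    return Counter(words).most_common(limit)

sample = "the cat and the hat and the bat"
for word, count in top_words(sample, limit=2):
    print(word, count)
# the 3
# and 2
```

To run this on a file such as alice.txt, read the file's contents into a string and pass that string to `top_words`.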

Installation

$ pip install -U textkit
$ textkit --help

Dev install

To test locally, clone the repo:

git clone git@github.com:learntextvis/textkit.git

Create a local virtual environment or conda environment.

Here is how I created my local conda environment for installing and testing textkit:

conda create --name textkit nltk

source activate textkit

Then I went into the textkit directory to install its requirements:

cd textkit

pip install -r requirements.txt

Finally, I installed the local version of textkit using the --editable flag:

pip install --editable .

Examples

See more examples at the Quickstart guide.

Requirements

  • Python >= 2.6 or >= 3.3
