Simple text analysis from the command line

Homepage: http://learntextvis.github.io/textkit/

About

textkit is a set of small, Unix-style command-line tools for working with text as data.

Think of textkit as basic natural language processing from the command line.

textkit Features

Here are some of the cool things you can do with textkit.

Convert a document to a set of word tokens and remove all punctuation from the tokens:

textkit text2words input.txt | textkit filterpunc
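
To make the pipeline above more concrete, here is a rough Python sketch of what tokenizing a text and then dropping punctuation tokens amounts to. It is not textkit's implementation, and textkit's own tokenizer may split text differently. The function names `text_to_words` and `filter_punctuation` and the regular expression are assumptions chosen purely for illustration.

```python
import re
import string


def text_to_words(text):
    """Split text into word tokens and single-character punctuation tokens.

    Illustrative rule only: textkit's real tokenizer may behave differently.
    """
    # \w+ matches a run of word characters; [^\w\s] matches one punctuation character.
    return re.findall(r"\w+|[^\w\s]", text)


def filter_punctuation(tokens):
    """Drop every token that consists solely of ASCII punctuation characters."""
    punct = set(string.punctuation)
    return [tok for tok in tokens if not all(ch in punct for ch in tok)]


tokens = text_to_words("Alice was beginning to get very tired, you know.")
words = filter_punctuation(tokens)
print(words)
```

Each function takes a list or string and returns a new list, mirroring the way each textkit command reads tokens from the previous one through a pipe.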

Count the most frequently used words in a text:

textkit text2words alice.txt | textkit count --limit 20

Do the same, but with punctuation removed:

textkit text2words alice.txt | textkit filterpunc | textkit count --limit 20
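
For readers who want to see the idea behind the counting step, the following is a minimal Python sketch of a "top N words" count with punctuation excluded. It is an illustration under stated assumptions, not how textkit works internally: the helper name `top_words` is hypothetical, and lowercasing the text before counting is an assumption that textkit may not share.

```python
import re
from collections import Counter


def top_words(text, limit=20):
    """Return the `limit` most common word tokens with their counts.

    Assumption: text is lowercased first, so "The" and "the" count together.
    """
    # \w+ keeps only runs of word characters, so punctuation never becomes a token.
    words = re.findall(r"\w+", text.lower())
    return Counter(words).most_common(limit)


sample = "The cat and the hat, and the bat."
for word, count in top_words(sample, limit=2):
    print(word, count)
```

The `limit` parameter plays the same role as the `--limit 20` option in the commands above: it caps how many of the most frequent words are reported.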

Installation

$ pip install -U textkit
$ textkit --help

Dev install

To test locally, clone the repo:

git clone git@github.com:learntextvis/textkit.git

Create a local virtual environment or conda environment.

Here is how I created my local conda environment for installing and testing textkit:

conda create --name textkit nltk

source activate textkit

Then I went into the textkit directory to install its requirements:

cd textkit

pip install -r requirements.txt

Finally, I installed the local version of textkit using the --editable flag:

pip install --editable .

Examples

See more examples in the Quickstart guide.

Requirements

  • Python >= 2.6 or >= 3.3

Download files

Files for textkit, version 0.2.3

  • textkit-0.2.3-py2.py3-none-any.whl (20.6 kB): wheel, Python version py2.py3
  • textkit-0.2.3.tar.gz (11.6 kB): source distribution
