Simple text analysis from the command line
Homepage: http://learntextvis.github.io/textkit/
About
textkit is a set of small, Unix-style tools for working with text as data.
Think of it as basic natural language processing from the command line.
textkit Features
Here are some of the cool things you can do with textkit.
Convert a document to a set of word tokens and remove all punctuation from the tokens:
textkit text2words input.txt | textkit filterpunc
Count the most frequently used words in a text:
textkit text2words alice.txt | textkit count --limit 20
Do the same, but with punctuation removed:
textkit text2words alice.txt | textkit filterpunc | textkit count --limit 20
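For readers who want to see what this pipeline does conceptually, here is a minimal Python sketch of the same three steps: tokenize, drop punctuation tokens, count. It is not textkit's implementation. The function names (text_to_words, filter_punctuation, count_top) are hypothetical, and the regex tokenizer is an assumption chosen for illustration, so textkit's own tokenizer may split text differently.

```python
import re
import string
from collections import Counter


def text_to_words(text):
    """Split text into word tokens and single punctuation tokens.

    Assumption: a simple regex stands in for a real tokenizer.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def filter_punctuation(tokens):
    """Drop tokens that consist only of ASCII punctuation characters."""
    return [t for t in tokens if not all(ch in string.punctuation for ch in t)]


def count_top(tokens, limit=20):
    """Return up to `limit` (token, count) pairs, most frequent first."""
    return Counter(tokens).most_common(limit)


if __name__ == "__main__":
    sample = "The cat sat. The cat ran!"
    words = filter_punctuation(text_to_words(sample))
    for token, count in count_top(words, limit=2):
        print(token, count)
```

Each function takes a list and returns a list, which mirrors the way the command-line tools pass one token stream to the next through a pipe.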
Installation
$ pip install -U textkit
$ textkit --help
Dev install
To test locally, clone the repo:
git clone git@github.com:learntextvis/textkit.git
Create a local virtual environment or conda environment.
Here is how I created my local conda environment for installing and testing textkit:
conda create --name textkit nltk
source activate textkit
Then I went into the textkit directory to install its requirements:
cd textkit
pip install -r requirements.txt
Finally, I installed the local version of textkit using the --editable flag:
pip install --editable .
Examples
See more examples in the Quickstart guide.
Requirements
Python >= 2.6 or >= 3.3