Simple text analysis from the command line
Homepage: http://learntextvis.github.io/textkit/
About
textkit is a series of small, unix-style tools that provide a suite of capabilities for dealing with text as data.
Think of textkit as basic natural language processing capabilities - from the command line.
textkit Features
Here are some of the cool things you can do with textkit.
Convert a document to a set of word tokens and remove all punctuation from the tokens:
textkit text2words input.txt | textkit filterpunc
Count top used words in a text:
textkit text2words alice.txt | textkit count --limit 20
Do the same, but with punctuation removed:
textkit text2words alice.txt | textkit filterpunc | textkit count --limit 20
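To make the stages of that pipeline concrete, here is a rough stand-in built only from standard Unix tools. It is a sketch, not how textkit is implemented: the function name top_words is hypothetical, and it assumes naive whitespace tokenization, so its counts will likely differ from textkit's output on real text.

```shell
# Approximate stand-in for:
#   textkit text2words FILE | textkit filterpunc | textkit count --limit N
# Assumption: tokens are split on whitespace only, unlike textkit's tokenizer.
top_words() {
  # $1: input file, $2: number of words to show (defaults to 20)
  tr -s '[:space:]' '\n' < "$1" |  # put one whitespace-separated token per line
    tr -d '[:punct:]' |            # strip punctuation characters from tokens
    grep -v '^$' |                 # drop tokens that were only punctuation
    sort | uniq -c |               # count identical tokens
    sort -k1,1nr -k2 |             # most frequent first, ties by token
    head -n "${2:-20}"             # keep the top N
}
```

Calling top_words alice.txt 20 prints one count and token per line, most frequent first.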
Installation
$ pip install -U textkit
$ textkit --help
Dev install
To test locally, clone the repo:
git clone git@github.com:learntextvis/textkit.git
Create a local virtual environment or conda environment.
Here is how I created my local conda environment for installing and testing textkit:
conda create --name textkit nltk
source activate textkit
Then I went into the textkit directory and installed its requirements:
cd textkit
pip install -r requirements.txt
Finally, I installed the local version of textkit using the --editable flag:
pip install --editable .
Examples
See more examples at the Quickstart guide.
Requirements
Python >= 2.6 or >= 3.3
Download files
File details
Details for the file textkit-0.2.3.tar.gz.
File metadata
- Download URL: textkit-0.2.3.tar.gz
- Upload date:
- Size: 11.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | ab7d0c0f0809959ea3aaf9bcd629a8c40c8886451688fe480c64abc0a1cd2345
MD5 | 16955800aac3e4701cdbe8f84f49310b
BLAKE2b-256 | eee6815f92435c124813f2864a116944d682187d35aa062ab12b661ba7550789
File details
Details for the file textkit-0.2.3-py2.py3-none-any.whl.
File metadata
- Download URL: textkit-0.2.3-py2.py3-none-any.whl
- Upload date:
- Size: 20.6 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2395afeaae7b99a87b57af8a79d9463dbef3908b76d3958ca43f50aaac0a656e
MD5 | f1f24bdd32e23767f1ff1799b21402eb
BLAKE2b-256 | 1d16ab1d10c152affc6221146a83fcd237218eb75dbb7b819dac16d46b3a2422