
Simple text analysis from the command line


Homepage: http://learntextvis.github.io/textkit/

About

textkit is a set of small, Unix-style tools for working with text as data.

Think of textkit as basic natural language processing, available from the command line.

textkit Features

Here are some of the cool things you can do with textkit.

Convert a document to a set of word tokens and remove all punctuation from the tokens:

textkit text2words input.txt | textkit filterpunc
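To make the two steps above concrete, here is a rough Python sketch of the idea behind tokenizing and then filtering punctuation. It is not textkit's implementation: the function names `text_to_words` and `filter_punctuation` are hypothetical, and the regular-expression tokenizer is an assumed stand-in for whatever tokenizer textkit uses internally.

```python
import re
import string


def text_to_words(text):
    """Split text into word tokens and single punctuation tokens.

    Assumption: a simple regex tokenizer, used here only for illustration.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def filter_punctuation(tokens):
    """Drop every token made up entirely of ASCII punctuation characters."""
    return [
        tok for tok in tokens
        if not all(ch in string.punctuation for ch in tok)
    ]


tokens = text_to_words("Hello, world!")
print(tokens)                      # ['Hello', ',', 'world', '!']
print(filter_punctuation(tokens))  # ['Hello', 'world']
```

Chaining the two functions mirrors the pipe in the command: the output of the first step is the input of the second.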

Count the most frequently used words in a text:

textkit text2words alice.txt | textkit count --limit 20

Do the same, but with punctuation removed:

textkit text2words alice.txt | textkit filterpunc | textkit count --limit 20
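As a rough picture of what the counting step does, here is a minimal Python sketch built on `collections.Counter`. It is an illustration, not textkit's code: the function name `count_top` is hypothetical, and its `limit` parameter is only meant to play the role of the `--limit` option shown above.

```python
from collections import Counter


def count_top(tokens, limit):
    """Return the `limit` most frequent tokens as (token, count) pairs."""
    return Counter(tokens).most_common(limit)


tokens = ["the", "cat", "the", "hat", "the", "cat"]
print(count_top(tokens, 2))  # [('the', 3), ('cat', 2)]
```

Filtering punctuation before counting matters because punctuation tokens such as commas tend to be among the most frequent tokens in a text and would otherwise crowd out real words.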

Installation

$ pip install -U textkit
$ textkit --help

Dev install

To test locally, clone the repo:

git clone git@github.com:learntextvis/textkit.git

Create a local virtual environment or conda environment.

Here is how I created my local conda environment for installing and testing textkit:

conda create --name textkit nltk

source activate textkit

Then I went into the textkit directory to install its requirements:

cd textkit

pip install -r requirements.txt

Finally, I installed the local version of textkit using the --editable flag:

pip install --editable .
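One way to check that the install was picked up by the active environment is to ask Python for the package's installed version. This is a small sketch, assuming Python 3.8 or newer (for the standard-library `importlib.metadata` module); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if textkit is installed in this environment,
# otherwise prints None.
print(installed_version("textkit"))
```

Running `textkit --help` from the same environment is an equally good check.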

Examples

See more examples in the Quickstart guide.

Requirements

  • Python >= 2.6 or >= 3.3

