
CLI tool that outputs the N (100 by default) most common n-word (n is 3 by default) sequences in a text.

Project description

Word Count

CLI tool that outputs the N (100 by default) most common n-word (n is 3 by default) sequences in a text, along with a count of how many times each occurred in the text.

The CLI can read the text from stdin, using the default parameters:

>$ cat text_file.txt | words-count
...

Or by passing files as arguments:

>$ words-count --files text_file.txt
...
>$ words-count --files text_file.txt --number-of-words 4 --top 5
...

Important

  • It is not case sensitive, and punctuation and line breaks are ignored (e.g. “I love\nsandwiches.” is treated the same as "(I LOVE SANDWICHES!!)").
  • When more than one file is passed as an argument, each file is processed independently, but the word sequences are counted together across all files (see the sketch below).
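
For reference, the behavior described above could be sketched roughly like this in Python. This is only an illustration of the idea (lowercasing, stripping punctuation, counting n-word sequences across files), not the package's actual implementation:

# Illustrative sketch, not the actual words-count implementation.
import re
import sys
from collections import Counter

def count_sequences(texts, number_of_words=3, top=100):
    counter = Counter()
    for text in texts:  # each file/text is processed independently...
        # lowercase and keep only word characters, so case and punctuation are ignored
        words = re.findall(r"[a-z']+", text.lower())
        for i in range(len(words) - number_of_words + 1):
            # ...but all sequences are accumulated in the same counter
            counter[" ".join(words[i:i + number_of_words])] += 1
    return dict(counter.most_common(top))

if __name__ == "__main__":
    print(count_sequences([sys.stdin.read()]))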

How to install

Install the CLI using pip:

>$ pip install words-count
...

Then, it will be available to use:

>$ words-count --help
usage: words-count [-h] [-f [FILES ...]] [-n NUMBER_OF_WORDS] [-t TOP]

CLI tool for that outputs the N (N by default 100) most common n-word (n by default is 3) sequence in text, along with a count of how many times each
occurred in the text.

optional arguments:
  -h, --help            show this help message and exit
  -f [FILES ...], --files [FILES ...]
                        Files path to read
  -n NUMBER_OF_WORDS, --number-of-words NUMBER_OF_WORDS
                        Number of words to group
  -t TOP, --top TOP     Max number of groups of words to output

Examples of use

Pass the text via stdin:

>$ cat pg2009.txt | words-count
{
    "of the same": 320,
    "the same species": 126,
    "conditions of life": 125,
    "in the same": 116,
    "of natural selection": 107,
    "from each other": 103,
    "species of the": 98,
    "on the other": 89,
    "the other hand": 81,
    "the case of": 78,
    "the theory of": 75,
...

Use arguments to adjust the options:

>$ words-count --files pg2009.txt --number-of-words 6 --top 5
{
    "the individuals of the same species": 31,
    "the species of the same genus": 19,
    "we can understand how it is": 13,
    "can understand how it is that": 13,
    "the project gutenberg literary archive foundation": 13
}

Process multiple files:

>$ words-count --files pg2009.txt pg2009.txt --number-of-words 6 --top 5
{
    "the individuals of the same species": 62,
    "the species of the same genus": 38,
    "we can understand how it is": 26,
    "can understand how it is that": 26,
    "the project gutenberg literary archive foundation": 26
}
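
Since the output is plain JSON, the CLI is also easy to drive from a script. A minimal sketch (assuming words-count is installed on PATH and pg2009.txt exists in the working directory):

# Call the CLI and parse its JSON output.
import json
import subprocess

result = subprocess.run(
    ["words-count", "--files", "pg2009.txt", "--number-of-words", "6", "--top", "5"],
    capture_output=True,
    text=True,
    check=True,
)

top_sequences = json.loads(result.stdout)
for sequence, count in top_sequences.items():
    print(f"{count:5d}  {sequence}")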

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

words-count-1.0.0.tar.gz (3.7 kB)

Uploaded Source

Built Distribution

words_count-1.0.0-py3-none-any.whl (4.1 kB)

Uploaded Python 3

File details

Details for the file words-count-1.0.0.tar.gz.

File metadata

  • Download URL: words-count-1.0.0.tar.gz
  • Upload date:
  • Size: 3.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.58.0 CPython/3.9.1

File hashes

Hashes for words-count-1.0.0.tar.gz
  • SHA256: 3e51aa158aa9d5d988b8ab144bf3e60b4099880aeecdf32b0d500344153b159f
  • MD5: 789ee5636eb14577216d4389fa834859
  • BLAKE2b-256: 655e593d0c6b792ed28565bb37aecd70606ac98f1a33e8e0511e40a0d44c0a1f

See more details on using hashes here.
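
If you want to verify a download against the hashes above, here is a minimal sketch (assuming the file was saved as words-count-1.0.0.tar.gz in the current directory):

# Compare the SHA256 of the downloaded sdist with the digest listed on this page.
import hashlib

expected = "3e51aa158aa9d5d988b8ab144bf3e60b4099880aeecdf32b0d500344153b159f"

with open("words-count-1.0.0.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print("OK" if digest == expected else "MISMATCH")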

File details

Details for the file words_count-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: words_count-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 4.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.58.0 CPython/3.9.1

File hashes

Hashes for words_count-1.0.0-py3-none-any.whl
  • SHA256: 7199c3224a8a737b8c23cef2d046afce401602a47457d30caf95ca0a345fbc12
  • MD5: d9ce03c54b2ebf336b2f3ffbdd5023a1
  • BLAKE2b-256: 0d9ba0197beea844fbc192b75f74567b5989a4de1bd96e4a31f9e31be7750b52

See more details on using hashes here.
