boodle oodle noodle poodle doodle moo

Project description

Python package


Text Processing Functions

This project includes several text processing functions in tokenize_text.py:

  • clean_text(input_text): Converts text to lowercase and removes punctuation.
  • tokenize(input_text): Splits text into individual words.
  • count_words(input_text): Counts the occurrences of each word in the text.
  • count_lines(filename): Counts the number of lines in a file.
  • count_total_lines(filenames): Counts the total number of lines across multiple files.
  • count_total_words(filenames): Counts the total number of words across multiple files.
  • count_raven_occurrences(filename): Counts occurrences of the word 'raven' (case insensitive) in a file.
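As a rough guide to what these helpers do, here is a minimal sketch of how the in-memory functions might be implemented; the actual `tokenize_text` module may differ in details such as punctuation handling:

```python
import re
from collections import Counter


def clean_text(input_text):
    # Lowercase the text and strip punctuation, keeping word
    # characters and whitespace.
    return re.sub(r"[^\w\s]", "", input_text.lower())


def tokenize(input_text):
    # Split cleaned text on whitespace into individual words.
    return clean_text(input_text).split()


def count_words(input_text):
    # Map each word to its number of occurrences.
    return dict(Counter(tokenize(input_text)))
```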

Example Usage

Here's a simple example of how to use the clean_text and count_words functions:

```python
from tokenize_text import clean_text, count_words

text = "The Raven, by Edgar Allan Poe"
cleaned_text = clean_text(text)
word_counts = count_words(cleaned_text)

print(f"Cleaned text: {cleaned_text}")
print(f"Word counts: {word_counts}")

# Output:
# Cleaned text: the raven by edgar allan poe
# Word counts: {'the': 1, 'raven': 1, 'by': 1, 'edgar': 1, 'allan': 1, 'poe': 1}
```
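The file-oriented helpers listed above might look roughly like the following; this is a hypothetical sketch, and the real implementations in `tokenize_text.py` may count words differently (e.g. after cleaning):

```python
def count_lines(filename):
    # Number of lines in a single file.
    with open(filename, encoding="utf-8") as f:
        return sum(1 for _ in f)


def count_total_lines(filenames):
    # Total number of lines across multiple files.
    return sum(count_lines(name) for name in filenames)


def count_total_words(filenames):
    # Total whitespace-separated words across multiple files.
    total = 0
    for name in filenames:
        with open(name, encoding="utf-8") as f:
            total += len(f.read().split())
    return total
```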


### Using `make all`

To automate the setup, text download, statistics generation, and testing processes, you can use the `make all` command. This command will:

1. Set up the Python virtual environment and install the required dependencies.
2. Download the specified text files from Project Gutenberg.
3. Generate statistics about the downloaded texts, including line and word counts for "The Raven" and total counts across all downloaded texts.
4. Run the test suite to ensure all functionalities are working correctly.
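A Makefile backing these steps could be sketched along the following lines; the target names, script names, and paths here are illustrative assumptions, not taken from the project:

```make
# Illustrative sketch only; the project's actual Makefile may differ.
all: setup download stats test

setup:
	python3 -m venv venv && venv/bin/pip install -r requirements.txt

download:
	venv/bin/python download_texts.py

stats:
	venv/bin/python generate_stats.py

test:
	venv/bin/python -m pytest
```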

To use `make all`, simply run the following command in your terminal:

```bash
make all
```

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

qje5vf-0.0.1.tar.gz (13.3 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

qje5vf-0.0.1-py3-none-any.whl (7.9 kB)

Uploaded Python 3

File details

Details for the file qje5vf-0.0.1.tar.gz.

File metadata

  • Download URL: qje5vf-0.0.1.tar.gz
  • Upload date:
  • Size: 13.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for qje5vf-0.0.1.tar.gz
| Algorithm   | Hash digest |
| ----------- | ----------- |
| SHA256      | bac8c3a417f172222d987fad7150853eb71a78e6dfb7a45e5c40642b8b2c96e5 |
| MD5         | 4c18229aa41a37c7d390b0ed051c842a |
| BLAKE2b-256 | cb323c8bcbea8394a6be5ce4e3d96494b0545554ff1810d2e80142d4502667c1 |

See more details on using hashes here.
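To verify a downloaded file against the digests above, one standard-library approach (a sketch; the filename is whatever you downloaded) is:

```python
import hashlib


def sha256_of(path):
    # Stream the file in chunks so large archives don't load into memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


# Compare the result against the published SHA256 digest, e.g.:
# sha256_of("qje5vf-0.0.1.tar.gz") == "bac8c3a417f17222..."
```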

File details

Details for the file qje5vf-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: qje5vf-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 7.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for qje5vf-0.0.1-py3-none-any.whl
| Algorithm   | Hash digest |
| ----------- | ----------- |
| SHA256      | d39d7cdfef0602ed6f3a5527e8fd858b44ae00394459a7d5cb985b2628b9c026 |
| MD5         | 252ac0c17913d43a7fe46d88acd87334 |
| BLAKE2b-256 | 7be571d788f582adaaf17b476b90b2be12ffae422b51d3c99328bfd5b3edab87 |

