Skip to main content

boodle oodle noodle poodle doodle moo

Project description

Python package

Python package

Text Processing Functions

This project includes several text processing functions in tokenize_text.py:

  • clean_text(input_text): Converts text to lowercase and removes punctuation.
  • tokenize(input_text): Splits text into individual words.
  • count_words(input_text): Counts the occurrences of each word in the text.
  • count_lines(filename): Counts the number of lines in a file.
  • count_total_lines(filenames): Counts the total number of lines across multiple files.
  • count_total_words(filenames): Counts the total number of words across multiple files.
  • count_raven_occurrences(filename): Counts occurrences of the word 'raven' (case insensitive) in a file.

Example Usage

Here's a simple example of how to use the clean_text and count_words functions:

from tokenize_text import clean_text, count_words

text = "The Raven, by Edgar Allan Poe"
cleaned_text = clean_text(text)
word_counts = count_words(text)

print(f"Cleaned text: {cleaned_text}")
print(f"Word counts: {word_counts}")

# Output:
# Cleaned text: the raven by edgar allan poe
# Word counts: {'the': 1, 'raven': 1, 'by': 1, 'edgar': 1, 'allan': 1, 'poe': 1}


### Using `make all`

To automate the setup, text download, statistics generation, and testing processes, you can use the `make all` command. This command will:

1. Set up the Python virtual environment and install the required dependencies.
2. Download the specified text files from Project Gutenberg.
3. Generate statistics about the downloaded texts, including line and word counts for "The Raven" and total counts across all downloaded texts.
4. Run the test suite to ensure all functionalities are working correctly.

To use `make all`, simply run the following command in your terminal:

```bash
make all

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

qje5vf-0.0.1.tar.gz (13.3 kB view hashes)

Uploaded Source

Built Distribution

qje5vf-0.0.1-py3-none-any.whl (7.9 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page