TextPrettify
A lightweight Python library for text cleaning and formatting. TextPrettify provides simple, intuitive functions to manipulate and format text strings for common use cases.
Features
- Remove Extra Whitespace: Clean up text by removing leading/trailing spaces and normalizing multiple spaces
- Slugify: Convert text to URL-friendly slugs
- Reading Time Estimation: Calculate estimated reading time for text
- Capitalize Words: Apply title case with customizable exceptions
- Truncate Text: Shorten text to a maximum length with word-boundary awareness
- Remove Punctuation: Strip punctuation with optional character preservation
- Word Count: Count total or unique words in text
Installation
# Clone the repository
git clone https://github.com/yourusername/TextPrettify.git
cd TextPrettify
# Install in development mode
pip install -e .
Quick Start
from textprettify import (
remove_extra_whitespace,
slugify,
get_reading_time,
capitalize_words,
truncate_text,
remove_punctuation,
count_words
)
# Clean up messy whitespace
text = "  Hello    World  "
clean_text = remove_extra_whitespace(text)
print(clean_text) # "Hello World"
# Create URL-friendly slugs
title = "My Awesome Post!"
slug = slugify(title)
print(slug) # "my-awesome-post"
# Estimate reading time
article = "Lorem ipsum " * 200
reading_time = get_reading_time(article)
print(reading_time) # "2 mins read"
# Capitalize with exceptions
title = "a tale of two cities"
formatted = capitalize_words(title, exceptions=['a', 'of'])
print(formatted) # "A Tale of Two Cities"
# Truncate long text
text = "The quick brown fox jumps over the lazy dog"
short = truncate_text(text, max_length=20)
print(short) # "The quick brown..."
# Remove punctuation
text = "Hello, World!"
clean = remove_punctuation(text)
print(clean) # "Hello World"
# Count words
text = "Hello world hello"
total = count_words(text)
unique = count_words(text, unique=True)
print(f"Total: {total}, Unique: {unique}") # "Total: 3, Unique: 2"
API Reference
remove_extra_whitespace(text: str) -> str
Remove extra whitespace from text, including leading/trailing spaces and multiple consecutive spaces.
Parameters:
- text (str): The input text to clean
Returns:
- str: Text with normalized whitespace
Example:
remove_extra_whitespace("  Hello    World  ")
# "Hello World"
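The behavior described above can be approximated in plain Python. The following is a minimal sketch, not TextPrettify's actual source; the name `collapse_whitespace` is hypothetical, and it assumes tabs and newlines count as whitespace to be collapsed, which the description does not confirm.

```python
def collapse_whitespace(text: str) -> str:
    """Strip leading/trailing whitespace and collapse internal runs to one space."""
    # str.split() with no arguments splits on any run of whitespace and
    # discards empty strings, so re-joining with a single space normalizes
    # the text in one pass.
    return " ".join(text.split())


print(collapse_whitespace("  Hello    World  "))  # Hello World
```

If newlines should be preserved, this approach would need to process the text line by line instead.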
slugify(text: str, separator: str = '-', lowercase: bool = True) -> str
Convert text to a URL-friendly slug.
Parameters:
- text (str): The input text to slugify
- separator (str): Character to use as separator (default: '-')
- lowercase (bool): Convert to lowercase (default: True)
Returns:
- str: URL-friendly slug
Example:
slugify("My Awesome Post!")
# "my-awesome-post"
slugify("Hello, World!", separator='_')
# "hello_world"
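To illustrate how the three parameters interact, here is a rough sketch of one common way to build slugs with a regular expression. It is not the library's implementation; `sketch_slugify` is a hypothetical name, and treating every non-ASCII-alphanumeric character as a word break is an assumption.

```python
import re


def sketch_slugify(text: str, separator: str = "-", lowercase: bool = True) -> str:
    """Illustrative slug builder; ASCII-only by assumption."""
    if lowercase:
        text = text.lower()
    # Replace each run of characters that are not ASCII letters or digits
    # with the separator. A function is used as the replacement so the
    # separator is inserted literally, never parsed as a regex template.
    slug = re.sub(r"[^A-Za-z0-9]+", lambda _match: separator, text)
    # Trim separator characters left at either end by leading or
    # trailing punctuation.
    return slug.strip(separator)


print(sketch_slugify("My Awesome Post!"))              # my-awesome-post
print(sketch_slugify("Hello, World!", separator="_"))  # hello_world
```

Accented or non-Latin characters are dropped by this sketch; the source does not say how TextPrettify handles them.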
get_reading_time(text: str, words_per_minute: int = 200, include_unit: bool = True) -> str | int
Calculate estimated reading time for text.
Parameters:
- text (str): The input text to analyze
- words_per_minute (int): Average reading speed (default: 200)
- include_unit (bool): Return formatted string with unit (default: True)
Returns:
- str | int: Reading time as formatted string or integer (minutes)
Example:
get_reading_time("Hello world " * 100)
# "1 min read"
get_reading_time("Hello world " * 100, include_unit=False)
# 1
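The arithmetic behind an estimate like this is word count divided by reading speed. The sketch below is an assumption-laden illustration rather than the library's code: `sketch_reading_time` is a hypothetical name, and rounding up with a one-minute minimum is a guess that happens to agree with the documented examples.

```python
import math


def sketch_reading_time(text: str, words_per_minute: int = 200,
                        include_unit: bool = True):
    """Estimate reading time in whole minutes (rounding rule is assumed)."""
    word_count = len(text.split())
    # Assumption: round up, and never report less than one minute.
    minutes = max(1, math.ceil(word_count / words_per_minute))
    if not include_unit:
        return minutes
    unit = "min" if minutes == 1 else "mins"
    return f"{minutes} {unit} read"


print(sketch_reading_time("Hello world " * 100))                      # 1 min read
print(sketch_reading_time("Lorem ipsum " * 200))                      # 2 mins read
print(sketch_reading_time("Hello world " * 100, include_unit=False))  # 1
```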
capitalize_words(text: str, exceptions: Optional[list[str]] = None) -> str
Capitalize the first letter of each word (title case).
Parameters:
- text (str): The input text to capitalize
- exceptions (list[str], optional): List of words to keep lowercase
Returns:
- str: Text with capitalized words
Example:
capitalize_words("the quick brown fox")
# "The Quick Brown Fox"
capitalize_words("a tale of two cities", exceptions=['a', 'of'])
# "A Tale of Two Cities"
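The second example shows that the first word is capitalized even when it appears in `exceptions`. A minimal sketch consistent with that behavior follows; `sketch_capitalize_words` is a hypothetical name, and the case-insensitive matching of exceptions is an assumption, not something the source states.

```python
from typing import Optional


def sketch_capitalize_words(text: str,
                            exceptions: Optional[list[str]] = None) -> str:
    """Title-case each word, leaving listed exceptions lowercase."""
    skip = {word.lower() for word in (exceptions or [])}
    result = []
    for index, word in enumerate(text.split()):
        # The first word is always capitalized, even if it is an exception.
        if index > 0 and word.lower() in skip:
            result.append(word.lower())
        else:
            # Uppercase only the first character; leave the rest untouched.
            result.append(word[:1].upper() + word[1:])
    return " ".join(result)


print(sketch_capitalize_words("the quick brown fox"))
# The Quick Brown Fox
print(sketch_capitalize_words("a tale of two cities", exceptions=["a", "of"]))
# A Tale of Two Cities
```

Because it splits on whitespace and re-joins with single spaces, this sketch also collapses repeated spaces, which the real function may not do.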
truncate_text(text: str, max_length: int, suffix: str = '...', whole_words: bool = True) -> str
Truncate text to a maximum length.
Parameters:
- text (str): The input text to truncate
- max_length (int): Maximum length of output text
- suffix (str): String to append to truncated text (default: '...')
- whole_words (bool): Only break at word boundaries (default: True)
Returns:
- str: Truncated text
Example:
truncate_text("The quick brown fox jumps", 15)
# "The quick..."
truncate_text("The quick brown fox jumps", 15, whole_words=False)
# "The quick br..."
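Both examples produce output no longer than `max_length` including the suffix, which suggests the suffix counts toward the limit. The sketch below works under that assumption; `sketch_truncate` is a hypothetical name, and the handling of edge cases (very small limits, a single long word) is a guess rather than documented behavior.

```python
def sketch_truncate(text: str, max_length: int, suffix: str = "...",
                    whole_words: bool = True) -> str:
    """Shorten text so that the result, suffix included, fits max_length."""
    if len(text) <= max_length:
        return text
    budget = max_length - len(suffix)  # characters left for the text itself
    if budget <= 0:
        return suffix[:max_length]
    cut = text[:budget]
    # If the cut landed in the middle of a word, back up to the last space.
    if whole_words and not text[budget].isspace():
        head, found_space, _tail = cut.rpartition(" ")
        if found_space:
            cut = head
    return cut.rstrip() + suffix


print(sketch_truncate("The quick brown fox jumps", 15))
# The quick...
print(sketch_truncate("The quick brown fox jumps", 15, whole_words=False))
# The quick br...
```

When the kept portion contains no space at all, this sketch falls back to a hard cut rather than returning only the suffix.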
remove_punctuation(text: str, keep: Optional[str] = None) -> str
Remove punctuation from text.
Parameters:
- text (str): The input text to clean
- keep (str, optional): String of punctuation characters to keep
Returns:
- str: Text without punctuation
Example:
remove_punctuation("Hello, World!")
# "Hello World"
remove_punctuation("user@example.com", keep='@.')
# "user@example.com"
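One straightforward way to get this behavior uses the standard library's `string.punctuation`. This is a sketch under the assumption that "punctuation" means the ASCII punctuation set; `sketch_remove_punctuation` is a hypothetical name and the library's real definition may be broader.

```python
import string


def sketch_remove_punctuation(text: str, keep=None) -> str:
    """Drop ASCII punctuation, except characters listed in keep."""
    # Build the set of characters to remove, minus those the caller keeps.
    to_remove = set(string.punctuation) - set(keep or "")
    return "".join(ch for ch in text if ch not in to_remove)


print(sketch_remove_punctuation("Hello, World!"))                # Hello World
print(sketch_remove_punctuation("user@example.com", keep="@."))  # user@example.com
```

Typographic punctuation such as curly quotes is outside `string.punctuation` and passes through this sketch unchanged.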
count_words(text: str, unique: bool = False) -> int
Count words in text.
Parameters:
- text (str): The input text to analyze
- unique (bool): Count only unique words (default: False)
Returns:
- int: Word count
Example:
count_words("Hello world hello")
# 3
count_words("Hello world hello", unique=True)
# 2
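The second example counts "Hello" and "hello" as one word, so unique counting appears to be case-insensitive. A minimal sketch consistent with that follows; `sketch_count_words` is a hypothetical name, and splitting on whitespace without stripping punctuation is an assumption.

```python
def sketch_count_words(text: str, unique: bool = False) -> int:
    """Count whitespace-separated words, optionally only distinct ones."""
    words = text.split()
    if unique:
        # Lowercase before de-duplicating so "Hello" and "hello" match.
        return len({word.lower() for word in words})
    return len(words)


print(sketch_count_words("Hello world hello"))               # 3
print(sketch_count_words("Hello world hello", unique=True))  # 2
```

With this approach "hello" and "hello," would count as different words; removing punctuation first avoids that.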
Running Tests
# Run all tests
python -m pytest tests/
# Run with coverage
python -m pytest tests/ --cov=textprettify
# Run specific test file
python -m pytest tests/test_core.py
Or using unittest:
python -m unittest discover tests
Examples
Check out the examples/ directory for more detailed usage examples:
- examples/basic_usage.py: Basic usage examples for all functions
- examples/blog_post_formatter.py: Format blog post metadata
- examples/url_generator.py: Generate clean URLs from titles
Contributing
Contributions are welcome! Feel free to submit a pull request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Author
Sajith
Version
0.1.0