A comprehensive string utility library for Python.
py-text-toolkit
A lightweight, dependency-minimal Python library for everyday string operations — cleaning, validation, analysis, case conversion, and generation.
Installation
pip install py-text-toolkit
Requires: Python 3.8+
Optional dependency: emoji (required only for cleaning.remove_emojis)
Modules at a Glance
| Module | What it does |
|---|---|
| strutils.cleaning | Strip, replace, and normalize raw text |
| strutils.validation | Validate emails, URLs, passwords, and character sets |
| strutils.analysis | Count, compare, and measure strings |
| strutils.format_cases | Convert between naming conventions and formatting styles |
| strutils.generation | Generate slugs, masks, ciphers, and reversed strings |
Quick Start
from strutils.cleaning import remove_html_tags, remove_urls
from strutils.validation import is_email, is_strong_password
from strutils.analysis import word_count, is_palindrome
from strutils.format_cases import to_snake_case, to_camel_case
from strutils.generation import generate_slug, mask_range
# Clean
remove_html_tags("<p>Hello <b>world</b></p>") # "Hello world"
remove_urls("Visit https://example.com today") # "Visit today"
# Validate
is_email("user@example.com") # True
is_strong_password("Passw0rd!") # True
# Analyse
word_count("Hello, world!") # 2
is_palindrome("A man a plan a canal Panama") # True
# Convert case
to_snake_case("camelCaseText") # "camel_case_text"
to_camel_case("hello_world") # "helloWorld"
# Generate
generate_slug("Hello World!") # "hello-world"
mask_range("1234-5678-9012", 5, 9, "*") # "1234-****-9012"
Module Reference
strutils.cleaning
Functions for sanitising and normalising raw text.
| Function | Signature | Description |
|---|---|---|
| normalize_whitespace | (text) → str | Collapse all whitespace runs to a single space and strip ends |
| remove_punctuation | (text, replace="") → str | Remove or replace all punctuation characters |
| remove_digits | (text, replace="") → str | Remove or replace all digit characters |
| remove_html_tags | (text, replace="") → str | Strip or replace HTML tags |
| remove_urls | (text, replace="") → str | Remove or replace HTTP/HTTPS and www. URLs |
| remove_emojis | (text, replace="") → str | Remove or replace emoji characters (requires emoji) |
| collapse_spaces | (text) → str | Remove all whitespace (not just collapse) |
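Every remove_* function accepts an optional replace argument: the string substituted in place of each removed element (defaults to ""). normalize_whitespace and collapse_spaces take only text. After replacement, whitespace is always normalized, so a replacement such as " " does not leave doubled or trailing spaces.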
from strutils.cleaning import remove_punctuation, remove_html_tags, remove_emojis
remove_punctuation("Hello, world!") # "Hello world"
remove_punctuation("Hello, world!", replace=" ") # "Hello world"
remove_html_tags("<p>Hello <b>world</b></p>") # "Hello world"
remove_html_tags("<br/>line1<br/>line2", replace=" ") # "line1 line2"
remove_emojis("Great job! 🎉") # "Great job!"
remove_emojis("Hello 😊", replace="[emoji]") # "Hello [emoji]"
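The replace-then-normalize behaviour described above can be sketched as follows. This is a minimal illustration, not the library's source: the names normalize_whitespace_sketch and remove_digits_sketch are hypothetical, and the regular expressions are assumptions chosen to match the documented results.

```python
import re


def normalize_whitespace_sketch(text: str) -> str:
    """Collapse each run of whitespace to one space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def remove_digits_sketch(text: str, replace: str = "") -> str:
    """Substitute every digit with `replace`, then normalize whitespace.

    Hypothetical stand-in for a remove_* function; the real pattern
    used by the library is not documented here.
    """
    return normalize_whitespace_sketch(re.sub(r"\d", replace, text))


print(remove_digits_sketch("abc 123 def"))        # abc def
print(remove_digits_sketch("a1b", replace=" "))   # a b
```

Running the normalization last is what keeps the output tidy even when the replacement itself introduces extra spaces.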
strutils.validation
Boolean predicates for common string formats.
| Function | Signature | Description |
|---|---|---|
| is_email | (text) → bool | Check for a valid email address |
| is_url | (text) → bool | Check for a valid HTTP or HTTPS URL |
| contains_only | (text, allowed_chars) → bool | Check that every character is in the allowed set |
| is_strong_password | (text) → bool | Check that a password meets strength requirements |
Password requirements (is_strong_password):
- Minimum 8 characters
- At least one lowercase letter
- At least one uppercase letter
- At least one digit
- At least one special character from @$!%*?&
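The five requirements listed above can be expressed as plain checks. The sketch below is one possible reading, not the library's implementation: is_strong_password_sketch is a hypothetical name, and it assumes that characters outside the listed sets are tolerated, which the README does not state.

```python
SPECIAL_CHARS = set("@$!%*?&")  # the special characters listed above


def is_strong_password_sketch(text: str) -> bool:
    """Return True when all five documented requirements hold.

    Assumption: any other characters are allowed as long as the
    minimum requirements are met.
    """
    return (
        len(text) >= 8
        and any(ch.islower() for ch in text)
        and any(ch.isupper() for ch in text)
        and any(ch.isdigit() for ch in text)
        and any(ch in SPECIAL_CHARS for ch in text)
    )


print(is_strong_password_sketch("Passw0rd!"))  # True
print(is_strong_password_sketch("weakpass"))   # False
```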
from strutils.validation import is_email, is_url, contains_only, is_strong_password
is_email("user@example.com") # True
is_email("not-an-email") # False
is_url("https://api.service.io/v1") # True
is_url("ftp://files.example.com") # False
contains_only("12345", "0123456789") # True
contains_only("hello!", "a-z") # False (literal chars only, not a range)
is_strong_password("Passw0rd!") # True
is_strong_password("weakpass") # False
Note on contains_only: allowed_chars is treated as a set of literal characters. Special regex characters are escaped automatically, so "a-z" matches only the three characters a, -, and z, not a range.
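To make the literal-character behaviour concrete, here is a small sketch built on re.escape. The function name contains_only_sketch is hypothetical, and the handling of empty inputs is an assumption; the library's actual behaviour for those cases is not documented above.

```python
import re


def contains_only_sketch(text: str, allowed_chars: str) -> bool:
    """Check that every character of `text` appears in `allowed_chars`.

    re.escape neutralizes regex metacharacters, so "a-z" becomes the
    three literal characters a, -, z rather than a range.
    """
    if not allowed_chars:
        # Assumption: with nothing allowed, only the empty string passes.
        return text == ""
    pattern = "[" + re.escape(allowed_chars) + "]*"
    return re.fullmatch(pattern, text) is not None


print(contains_only_sketch("12345", "0123456789"))  # True
print(contains_only_sketch("hello!", "a-z"))        # False
print(contains_only_sketch("a-z-a", "a-z"))         # True
```

The same check could be written without regular expressions as `set(text) <= set(allowed_chars)`; the regex form is shown only because the note mentions escaping.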
strutils.analysis
Functions that measure and compare strings.
| Function | Signature | Description |
|---|---|---|
| word_count | (text) → int | Count words using regex word-boundary matching |
| char_frequency | (text, char) → int | Count non-overlapping occurrences of a character or substring |
| count_vowels | (text) → int | Count English vowels (a e i o u), case-insensitive |
| longest_word | (text) → int | Return the length of the longest whitespace-delimited word |
| is_palindrome | (text, case_sensitive=False, ignore_formatting=True) → bool | Check if a string is a palindrome |
| is_anagram | (word1, word2) → bool | Check if two strings are anagrams (case-insensitive, ignores spaces) |
from strutils.analysis import word_count, is_palindrome, is_anagram, char_frequency
word_count("Hello, world!") # 2
word_count(" spaces everywhere ") # 2
char_frequency("banana", "an") # 2
is_palindrome("racecar") # True
is_palindrome("A man a plan a canal Panama") # True
is_palindrome("Racecar", case_sensitive=True) # False
is_anagram("listen", "silent") # True
is_anagram("Astronomer", "Moon starer") # True
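The interaction of case_sensitive and ignore_formatting can be sketched as below. The function names are hypothetical, and the sketch assumes that ignore_formatting drops every non-alphanumeric character; the library may define "formatting" differently.

```python
def is_palindrome_sketch(text: str, case_sensitive: bool = False,
                         ignore_formatting: bool = True) -> bool:
    """Compare the (optionally cleaned) string with its reverse."""
    if ignore_formatting:
        # Assumption: "formatting" means anything that is not a letter or digit.
        text = "".join(ch for ch in text if ch.isalnum())
    if not case_sensitive:
        text = text.lower()
    return text == text[::-1]


def is_anagram_sketch(word1: str, word2: str) -> bool:
    """Compare sorted letters, ignoring case and spaces."""
    def letters(word: str) -> list:
        return sorted(word.replace(" ", "").lower())
    return letters(word1) == letters(word2)


print(is_palindrome_sketch("A man a plan a canal Panama"))   # True
print(is_palindrome_sketch("Racecar", case_sensitive=True))  # False
print(is_anagram_sketch("Astronomer", "Moon starer"))        # True
```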
strutils.format_cases
Convert strings between naming conventions and apply text formatting.
| Function | Signature | Description |
|---|---|---|
| to_snake_case | (text) → str | Convert to snake_case |
| to_camel_case | (text) → str | Convert to camelCase |
| to_pascal_case | (text) → str | Convert to PascalCase |
| to_kebab_case | (text) → str | Convert to kebab-case |
| to_title_case | (text) → str | Convert to Title Case |
| truncate | (text, max_length, suffix="...") → str | Truncate to a maximum length with a suffix |
| pad_center | (text, width, fillchar=" ") → str | Center-pad to a given width |
All case converters handle mixed input (camelCase, PascalCase, snake_case, kebab-case, spaces).
from strutils.format_cases import to_snake_case, to_camel_case, truncate, pad_center
to_snake_case("camelCaseText") # "camel_case_text"
to_snake_case("Hello World!") # "hello_world"
to_camel_case("hello_world") # "helloWorld"
to_camel_case("PascalCaseText") # "pascalCaseText"
to_pascal_case("kebab-case-text") # "KebabCaseText"
to_kebab_case("camelCaseText") # "camel-case-text"
to_title_case("hello_world") # "Hello World"
truncate("Hello, World!", 8) # "Hello..."
truncate("Hi", 10) # "Hi"
pad_center("hello", 11) # "   hello   "
pad_center("hi", 10, "-") # "----hi----"
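One way to handle mixed input is to split any string into lowercase words first and then join the words in the target style. The sketch below illustrates that idea under stated assumptions: split_words_sketch and the two converters are hypothetical names, and the splitting rule (break at a lowercase-or-digit to uppercase boundary, treat every other non-alphanumeric character as a separator) is a guess that reproduces the examples above, not the library's documented algorithm.

```python
import re


def split_words_sketch(text: str) -> list:
    """Split camelCase, PascalCase, snake_case, kebab-case, or spaced text."""
    # Insert a space wherever a lowercase letter or digit meets an uppercase letter.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    # Every remaining run of letters and digits is one word.
    return [word.lower() for word in re.findall(r"[A-Za-z0-9]+", spaced)]


def to_snake_case_sketch(text: str) -> str:
    return "_".join(split_words_sketch(text))


def to_camel_case_sketch(text: str) -> str:
    words = split_words_sketch(text)
    if not words:
        return ""
    return words[0] + "".join(word.capitalize() for word in words[1:])


print(to_snake_case_sketch("camelCaseText"))   # camel_case_text
print(to_snake_case_sketch("Hello World!"))    # hello_world
print(to_camel_case_sketch("PascalCaseText"))  # pascalCaseText
```

Splitting once and joining per style keeps each converter to a single line, which is why the remaining styles (PascalCase, kebab-case, Title Case) would differ only in the join step.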
strutils.generation
Functions that produce new strings from existing ones.
| Function | Signature | Description |
|---|---|---|
| generate_slug | (text) → str | Convert to a URL-friendly slug |
| reverse_word | (text) → str | Reverse all characters |
| mask_range | (text, start_index, end_index, placeholder="X") → str | Mask a character range with a placeholder |
| ceasar_cipher | (text, shift) → str | Encrypt/decrypt with the Caesar cipher |
from strutils.generation import generate_slug, mask_range, ceasar_cipher, reverse_word
generate_slug("Hello World!") # "hello-world"
generate_slug("Python 3.11 -- Release Notes") # "python-3-11-release-notes"
reverse_word("hello") # "olleh"
mask_range("1234-5678-9012", 5, 9, "*") # "1234-****-9012"
mask_range("secret", -3, -1) # "secXXt"
ceasar_cipher("Hello, World!", 3) # "Khoor, Zruog!"
ceasar_cipher("Khoor, Zruog!", -3) # "Hello, World!" (decrypt)
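The examples above suggest that mask_range treats end_index as exclusive and accepts negative indices the way Python slices do, and that the cipher shifts only letters. The sketch below reproduces those results under exactly these assumptions; all three function names are hypothetical and the code is not taken from the library.

```python
import re


def mask_range_sketch(text: str, start_index: int, end_index: int,
                      placeholder: str = "X") -> str:
    """Mask text[start_index:end_index], using slice-style index rules."""
    # slice.indices resolves negative and out-of-range values for this length.
    start, end, _ = slice(start_index, end_index).indices(len(text))
    end = max(start, end)  # an empty or inverted range masks nothing
    return text[:start] + placeholder * (end - start) + text[end:]


def caesar_sketch(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions; leave other characters alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def generate_slug_sketch(text: str) -> str:
    """Lowercase, turn non-alphanumeric runs into hyphens, trim hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


print(mask_range_sketch("1234-5678-9012", 5, 9, "*"))  # 1234-****-9012
print(mask_range_sketch("secret", -3, -1))             # secXXt
print(caesar_sketch("Hello, World!", 3))               # Khoor, Zruog!
print(generate_slug_sketch("Python 3.11 -- Release Notes"))  # python-3-11-release-notes
```

Because the modulo keeps the result in 0 to 25 for negative shifts too, decryption is simply the same call with the shift negated.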
Dependencies
| Package | Required | Used by |
|---|---|---|
| re (stdlib) | Always | All modules |
| string (stdlib) | Always | cleaning |
| emoji | Optional | cleaning.remove_emojis only |
License
MIT License — see LICENSE for details.