cursword
Simple functions to move by words in a string.
- Useful for word-based cursors.
- Uses Unicode character classification to detect word boundaries.
Installation
pip install cursword
Quick Start
from cursword import get_next_word_end_position, get_previous_word_start_position
text = "Hello world, how are you?"
# Find the end of the first word starting from position 0
next_pos = get_next_word_end_position(text, 0)
print(f"First word ends at position {next_pos}: '{text[:next_pos]}'")
# Output: First word ends at position 5: 'Hello'
# Find the start of the last run from the end of the text (the trailing "?" counts as its own run)
prev_pos = get_previous_word_start_position(text, len(text))
print(f"Last word starts at position {prev_pos}: '{text[prev_pos:]}'")
# Output: Last word starts at position 24: '?'
API Reference
get_next_word_end_position(text: str, start: int) -> int
Returns the position of the end of the current or next word in the given text.
Parameters:
- text (str): The text to search in
- start (int): The position to start searching from
Returns:
- int: The position after the end of the current/next word, or len(text) if no word is found
Example:
from cursword import get_next_word_end_position
text = "abc def ghi"
pos = get_next_word_end_position(text, 0) # Returns 3 (end of "abc")
pos = get_next_word_end_position(text, pos) # Returns 7 (end of "def")
get_previous_word_start_position(text: str, start: int) -> int
Returns the position of the start of the previous word in the given text.
Parameters:
- text (str): The text to search in
- start (int): The position to start searching from
Returns:
- int: The position of the start of the previous word, or 0 if no word is found
Example:
from cursword import get_previous_word_start_position
text = "abc def ghi"
pos = get_previous_word_start_position(text, len(text)) # Returns 8 (start of "ghi")
pos = get_previous_word_start_position(text, pos) # Returns 4 (start of "def")
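Because both functions take a position and return a new position, a cursor can be moved word by word by feeding each result back in. The following is a minimal sketch of that loop, not part of cursword: `iter_stops` is a hypothetical helper, and `toy_next_word_end` is a hypothetical whitespace-only stand-in used here so the example runs without the package installed. Passing `get_next_word_end_position` instead should work the same way, assuming it returns its input position once no further word is found.

```python
from typing import Callable, Iterator


def iter_stops(text: str, step: Callable[[str, int], int], start: int = 0) -> Iterator[int]:
    """Yield each position produced by repeatedly applying `step` until it stops moving."""
    pos = start
    while True:
        nxt = step(text, pos)
        if nxt == pos:  # no progress, e.g. the cursor is already at the end
            return
        yield nxt
        pos = nxt


def toy_next_word_end(text: str, start: int) -> int:
    """Hypothetical stand-in: splits on whitespace only, unlike cursword's categories."""
    pos = start
    while pos < len(text) and text[pos].isspace():  # skip leading whitespace
        pos += 1
    while pos < len(text) and not text[pos].isspace():  # consume the word
        pos += 1
    return pos


print(list(iter_stops("abc def ghi", toy_next_word_end)))
# Output: [3, 7, 11]
```

The same helper can walk backwards if it is given a function with the signature of `get_previous_word_start_position` and `start=len(text)`.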
How It Works
The library categorizes characters into different types:
- Word characters: Letters, numbers, and underscores
- Punctuation: Various punctuation marks and mathematical symbols
- Currency: Currency symbols
- Space: Whitespace characters
- Other: All other characters (including CJK ideographs)
Word boundaries are detected at transitions between character categories, so mixed content such as "you?" is split into separate runs ("you" and "?") that a cursor can stop between.
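The category idea can be illustrated with the standard library's `unicodedata` module. This is only a sketch under stated assumptions, not cursword's actual implementation: the `classify` and `split_runs` names are hypothetical, the mapping from Unicode general categories to the five types is a guess, and CJK detection is limited to the CJK Unified Ideographs block (U+4E00 to U+9FFF).

```python
import unicodedata
from itertools import groupby


def classify(ch: str) -> str:
    """Map one character to a coarse category (illustrative mapping, an assumption)."""
    if ch.isspace():
        return "space"
    if "\u4e00" <= ch <= "\u9fff":  # CJK Unified Ideographs block only (assumption)
        return "other"
    cat = unicodedata.category(ch)  # two-letter general category, e.g. "Lu", "Nd", "Sc"
    if ch == "_" or cat[0] in ("L", "N"):  # letters, numbers, underscore
        return "word"
    if cat == "Sc":  # currency symbols
        return "currency"
    if cat[0] == "P" or cat == "Sm":  # punctuation and math symbols
        return "punctuation"
    return "other"


def split_runs(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters that share a category into (category, run) pairs."""
    return [(kind, "".join(group)) for kind, group in groupby(text, key=classify)]


print(split_runs("price_1: $42 你好"))
# Output: [('word', 'price_1'), ('punctuation', ':'), ('space', ' '), ('currency', '$'), ('word', '42'), ('space', ' '), ('other', '你好')]
```

Each place where one pair ends and the next begins is a category transition, which is where a boundary would fall in this sketch. The two ideographs stay together as one run, matching the CJK behavior described in this README.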
Known limitations
- CJK Text: Chinese, Japanese, and Korean characters are currently treated as single blocks rather than individual word units.
Development
# Best suited for uv
uv sync
# Run tests
uv run pytest
# Run tests with coverage
uv run pytest --cov=cursword --cov-report=html --cov-report=term-missing
# Lint code
uv run ruff check
# Format code
uv run ruff format