
toolkitx

A personal Python toolkit for common tasks. This package provides various utility functions to simplify common development workflows.

Features

  • Text Utilities (toolkitx.text_utils):
    • truncate_text_smart: Smartly truncates text by characters or words, with options for suffix and tolerance, attempting to preserve sentence or word boundaries.
    • split_text_by_word_count: Splits long text into overlapping chunks based on word count.
  • Task Utilities (toolkitx.task_utils):
    • with_resilience: A decorator for API resilience with rate limiting (QPS), exponential backoff retry, and jitter to prevent the thundering-herd problem.
    • PersistentTaskQueue: A persistent task queue with an SQLite backend, supporting concurrent processing, automatic retry, crash recovery, and graceful shutdown.
  • Experimental Translator (toolkitx.lab.translator):
    • Translator: A class providing translation via the Baidu or Tencent translation APIs, with disk-based caching for performance. (Requires API credentials.)

Installation

  1. Clone the repository:
    git clone https://github.com/ider-zh/toolkitx.git
    cd toolkitx
    
  2. Install the package. For development, you can install it in editable mode with development dependencies:
    pip install -e ".[dev]"
    
    For regular installation:
    pip install .
    

Usage

Text Utilities

from toolkitx import truncate_text_smart, split_text_by_word_count

# Smart Truncation
text = "This is a very long sentence that needs to be truncated."
truncated_char = truncate_text_smart(text, limit=20, mode="char", suffix="...")
print(f"Char truncated: {truncated_char}")

truncated_word = truncate_text_smart(text, limit=5, mode="word", suffix="...")
print(f"Word truncated: {truncated_word}")

# Split Text
long_text = "This is a long piece of text that we want to split into several smaller chunks with some overlap between them for context."
chunks = split_text_by_word_count(long_text, max_words=10, overlap=2)
for i, chunk in enumerate(chunks):
    print(f"Chunk {i+1}: {chunk}")

Task Utilities

with_resilience Decorator

from toolkitx.task_utils import with_resilience
import requests

@with_resilience(qps=5.0, max_retries=3, base_delay=1.0, max_delay=60.0)
def call_api_with_retry(url: str) -> dict:
    """Call API with automatic retry and rate limiting"""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()

# The decorator will automatically:
# - Limit requests to 5 per second (QPS)
# - Retry up to 3 times on failure with exponential backoff
# - Add random jitter to prevent thundering herd
result = call_api_with_retry("https://api.example.com/data")
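
For intuition, retry schemes like this commonly compute each delay as capped exponential growth plus randomized jitter. The sketch below is illustrative only and assumes nothing about toolkitx's actual internals; backoff_delay is a hypothetical helper that mirrors the decorator's base_delay and max_delay arguments.

import random

def backoff_delay(attempt: int, base_delay: float = 1.0, max_delay: float = 60.0) -> float:
    """Illustrative capped exponential backoff with full jitter (not toolkitx's internals)."""
    capped = min(max_delay, base_delay * (2 ** attempt))
    # Sampling uniformly over [0, capped] spreads retries out so that many
    # clients failing at once do not retry in lockstep (thundering herd).
    return random.uniform(0.0, capped)

for attempt in range(3):
    print(f"retry {attempt + 1}: sleeping {backoff_delay(attempt):.2f}s")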

PersistentTaskQueue

import polars as pl
from pydantic import BaseModel
from toolkitx.task_utils import PersistentTaskQueue
import tempfile

# Define your data model
class EntityModel(BaseModel):
    name: str
    is_company: bool

# Define your processing function
def extract_entity(text: str) -> EntityModel:
    """Extract entity information from text"""
    # Your processing logic here
    return EntityModel(name=text.split()[0], is_company=True)

# Prepare data
df = pl.DataFrame({
    "batch_id": ["batch1", "batch1", "batch2"],
    "input_text": ["Apple Inc.", "Google Corp.", "Microsoft"]
})

# Initialize queue with temporary database
with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as f:
    db_path = f.name

queue = PersistentTaskQueue(db_path=db_path, task_name="entity_extraction", max_retries=3)

# Setup and enqueue data
queue.setup()
queue.enqueue_dataframe(df)

# Process tasks with concurrency control (supports Ctrl+C for graceful shutdown)
queue.process(worker_func=extract_entity, concurrency=10)
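
# Note (illustrative, not verified against the source): because queue state
# lives in the SQLite database, re-creating a PersistentTaskQueue with the
# same db_path and task_name after a crash or Ctrl+C and calling process()
# again should resume unfinished tasks, per the crash-recovery feature above.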

# Get results
results = queue.get_results(response_model=EntityModel)
print(results)
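
Experimental Translator

The Translator class in toolkitx.lab.translator is described above as wrapping the Baidu or Tencent translation APIs with disk-based caching, and it requires API credentials. Its exact interface is not documented in this README, so the sketch below is only a hedged guess: the constructor arguments (provider, app_id, app_key) and the translate method are assumptions, not a confirmed API. Consult the module source before use.

from toolkitx.lab.translator import Translator

# Hypothetical usage: the parameter names and method below are assumptions,
# not the documented API. Check the module source for the real signatures.
translator = Translator(provider="baidu", app_id="YOUR_APP_ID", app_key="YOUR_APP_KEY")
print(translator.translate("Hello, world!", target_lang="zh"))  # assumed method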

Changelog

v0.0.4 (2026-03-07)

  • Added task_utils module with with_resilience decorator for API resilience
  • Added PersistentTaskQueue class for persistent task processing with SQLite backend
  • Added comprehensive documentation for new features
  • Bumped version to 0.0.4
  • Updated dependencies (httpx, tencentcloud-sdk-python, pytest, etc.)
  • Removed hello script and related functionality
  • Added polars, pydantic, and tqdm as dependencies
  • Improved translator module to use tempfile for cache paths

Download files

Download the file for your platform.

Source Distribution

  • toolkitx-0.0.4.tar.gz (43.5 kB)

Built Distribution


  • toolkitx-0.0.4-py3-none-any.whl (18.1 kB)

File details

Details for the file toolkitx-0.0.4.tar.gz.

File metadata

  • Size: 43.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.3

File hashes

Hashes for toolkitx-0.0.4.tar.gz:

  • SHA256: e35835718f2b69b3150ee15b5c80635004178f96602900d9e10a0389e013bd8b
  • MD5: e8d845e68e2ee32fc333907bd0c23206
  • BLAKE2b-256: 5c20e98e8ffc59b5b91cc2e5a2e3a8f18f03b27be1e7437ac0e8efb42101060f

File details

Details for the file toolkitx-0.0.4-py3-none-any.whl.

File metadata

  • Size: 18.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.3

File hashes

Hashes for toolkitx-0.0.4-py3-none-any.whl:

  • SHA256: d37e0fd66cc71e582e546129e804c3f2bc35537ce8eb662880c4f7d9088b3096
  • MD5: dfa9a933454b85130ba9b0e552211cd9
  • BLAKE2b-256: 7310580dcc8ef85a90a2d0b2235e8f6f90c6c33df84653f715884c6d2b35dae7
