Cross-language slug generator for URL-safe and filename-safe strings
post_slug
A consistent, cross-language slug generator for creating URL-safe and filename-safe strings from any text input.
🎯 Overview
post_slug converts any text into clean, readable slugs that are safe for URLs, filenames, and other contexts where only ASCII alphanumeric characters are allowed. With identical implementations in Python, JavaScript, PHP, and Bash, it ensures consistent output across your entire stack.
Key Features
- 🌐 Cross-language consistency - Identical output across Python, JavaScript, PHP, and Bash
- 🛡️ Security-focused - Input sanitization with 255-character limit to prevent DoS attacks
- 🔧 Flexible configuration - Customizable separator, case preservation, and length limits
- 📦 Zero dependencies - Uses only built-in language features
- ⚡ Fast and lightweight - Optimized for performance
- 🧪 Thoroughly tested - Comprehensive test suite with cross-language validation
Quick Example
from post_slug import post_slug
# Convert a complex title to a clean slug
title = "The Ŝtřãņġę (Inner) Life! of the \"Outsider\""
slug = post_slug(title)
# Output: "the-strange-inner-life-of-the-outsider"
📦 Installation
Python
# Install from PyPI (coming soon)
pip install post-slug
# Or install from source
python -m pip install .
JavaScript/Node.js
# Install from npm (coming soon)
npm install post-slug
// Or load the module directly from a local copy
const { post_slug } = require('./post_slug.js');
PHP
# Install via Composer (coming soon)
composer require open-technology-foundation/post-slug
# Or include directly
require_once 'post_slug.php';
Bash
# Source the function
source post_slug.bash
# Or add to your .bashrc
echo 'source /path/to/post_slug.bash' >> ~/.bashrc
🚀 Usage
Basic Usage
All implementations share the same API:
post_slug(input_str, [sep_char], [preserve_case], [max_len])
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| input_str | string | required | The text to convert into a slug |
| sep_char | string | '-' | Character that replaces non-alphanumeric characters |
| preserve_case | bool/int | false/0 | Whether to preserve the original case |
| max_len | int | 0 | Maximum length (0 = no limit beyond 255 chars) |
Language-Specific Examples
Python
from post_slug import post_slug
# Basic usage
slug = post_slug("Hello, World!")
print(slug) # "hello-world"
# With underscore separator
slug = post_slug("Hello, World!", "_")
print(slug) # "hello_world"
# Preserve case
slug = post_slug("Hello, World!", "-", True)
print(slug) # "Hello-World"
# With max length
slug = post_slug("This is a very long title that needs truncation", "-", False, 20)
print(slug) # "this-is-a-very-long"
# HTML entities are replaced by the separator
slug = post_slug("Barnes &amp; Noble")
print(slug) # "barnes-noble"
# Special characters and Unicode
slug = post_slug("Über die Universitäts-Philosophie — Schopenhauer, 1851")
print(slug) # "uber-die-universitats-philosophie-schopenhauer-1851"
JavaScript
const { post_slug } = require('./post_slug.js');
// Basic usage
let slug = post_slug("Hello, World!");
console.log(slug); // "hello-world"
// With underscore separator
slug = post_slug("Hello, World!", "_");
console.log(slug); // "hello_world"
// Preserve case
slug = post_slug("Hello, World!", "-", true);
console.log(slug); // "Hello-World"
// With max length
slug = post_slug("This is a very long title that needs truncation", "-", false, 20);
console.log(slug); // "this-is-a-very-long"
// Works with modern JavaScript
const titles = [
"The Great Gatsby",
"Pride & Prejudice",
"1984"
];
const slugs = titles.map(title => post_slug(title));
// ["the-great-gatsby", "pride-prejudice", "1984"]
PHP
<?php
require_once 'post_slug.php';
// Basic usage
$slug = post_slug("Hello, World!");
echo $slug; // "hello-world"
// With underscore separator
$slug = post_slug("Hello, World!", "_");
echo $slug; // "hello_world"
// Preserve case
$slug = post_slug("Hello, World!", "-", true);
echo $slug; // "Hello-World"
// With max length
$slug = post_slug("This is a very long title that needs truncation", "-", false, 20);
echo $slug; // "this-is-a-very-long"
// In a WordPress context
function my_custom_slug($title) {
return post_slug($title, '-', false, 200);
}
add_filter('sanitize_title', 'my_custom_slug');
Bash
# Source the function
source post_slug.bash
# Basic usage
slug=$(post_slug "Hello, World!")
echo "$slug" # "hello-world"
# With underscore separator
slug=$(post_slug "Hello, World!" "_")
echo "$slug" # "hello_world"
# Preserve case
slug=$(post_slug "Hello, World!" "-" 1)
echo "$slug" # "Hello-World"
# With max length
slug=$(post_slug "This is a very long title that needs truncation" "-" 0 20)
echo "$slug" # "this-is-a-very-long"
# Batch processing files
for file in *.txt; do
new_name="$(post_slug "${file%.txt}").txt"
mv "$file" "$new_name"
done
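One caveat with batch renaming: two different names can map to the same slug, and mv would then silently overwrite the earlier file. The following is a minimal sketch of a pre-check in Python, assuming the caller supplies a slug function. The names plan_renames and demo_slugify are hypothetical and used for illustration only; this does not describe how the bundled utilities work.

```python
from pathlib import PurePath


def plan_renames(filenames, slugify):
    """Return (old, new) pairs; raise ValueError if two files map to one target."""
    planned = []
    targets = {}
    for name in filenames:
        path = PurePath(name)
        # Slugify only the stem so the extension is kept unchanged.
        target = str(path.with_name(slugify(path.stem) + path.suffix))
        if target in targets:
            raise ValueError(
                f"{name!r} and {targets[target]!r} both map to {target!r}"
            )
        targets[target] = name
        if target != str(path):
            planned.append((name, target))
    return planned


# Stand-in slugifier for the demo only; in practice pass post_slug instead.
def demo_slugify(text):
    return "-".join(text.lower().split())


print(plan_renames(["My Notes.txt", "todo.txt"], demo_slugify))
# [('My Notes.txt', 'my-notes.txt')]
```

Nothing is renamed here; the function only returns the plan, so it can be reviewed before any file is moved.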
🛠️ Advanced Features
Batch File Renaming
The included slug-files utility allows batch renaming of files:
# Rename all .txt files in a directory
./slug-files *.txt
# With custom separator and preserved case
./slug-files -s _ -p /path/to/files/*
# Dry run (no actual renaming)
./slug-files -n *.pdf
Command-Line Usage
Create convenient command-line aliases:
# Symlink the scripts into a directory on your PATH (may require sudo)
ln -s /path/to/post_slug.bash /usr/local/bin/post_slug
ln -s /path/to/slug-files /usr/local/bin/slug-files
# Now use directly
post_slug "My Document Title!"
# Output: my-document-title
🔧 How It Works
The slug generation process follows these steps:
- Input validation - Truncates input to 255 characters for filesystem safety
- Character normalization - Applies language-specific transliteration fixes
- HTML entity removal - Replaces entities like &amp; with the separator
- ASCII transliteration - Converts Unicode to closest ASCII equivalents
- Quote removal - Strips quotes, apostrophes, and backticks
- Case conversion - Optionally converts to lowercase
- Character replacement - Replaces non-alphanumeric chars with separator
- Cleanup - Removes duplicate/leading/trailing separators
- Length truncation - Optionally truncates to specified length
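The steps above can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the package's implementation: sketch_slug is a hypothetical name, the language-specific fixes from the normalization step are omitted, and the entity pattern is a simplification.

```python
import re
import unicodedata


def sketch_slug(input_str, sep_char="-", preserve_case=False, max_len=0):
    """Illustrative approximation of the documented steps; NOT the library code."""
    # Input validation: cap the input at 255 characters.
    text = input_str[:255]
    # (Language-specific transliteration fixes are omitted in this sketch.)
    # HTML entity removal: turn entities such as &amp; into a space,
    # which the replacement step below converts into the separator.
    text = re.sub(r"&[#\w]+;", " ", text)
    # ASCII transliteration: NFKD-decompose, then drop non-ASCII code points.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Quote removal: strip quotes, apostrophes, and backticks.
    text = re.sub(r"[\"'`]", "", text)
    # Case conversion.
    if not preserve_case:
        text = text.lower()
    # Character replacement and cleanup in one pass: joining the alphanumeric
    # runs leaves no duplicate, leading, or trailing separators.
    text = sep_char.join(re.findall(r"[A-Za-z0-9]+", text))
    # Length truncation, then drop a separator left dangling at the cut.
    if max_len > 0:
        text = text[:max_len].rstrip(sep_char)
    return text


print(sketch_slug("Hello, World!"))             # hello-world
print(sketch_slug("Hello, World!", "_"))        # hello_world
print(sketch_slug("Hello, World!", "-", True))  # Hello-World
```

The sketch agrees with the documented outputs for these inputs, but it may differ from the real implementations on inputs that rely on the transliteration tables.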
Transliteration Details
Different languages use different transliteration methods:
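- Python/JavaScript: Unicode NFKD normalization (unicodedata.normalize('NFKD', text) in Python; JavaScript strings offer the equivalent normalize('NFKD') method)
- PHP/Bash: iconv('UTF-8', 'ASCII//TRANSLIT')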
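The NFKD route can be illustrated with Python's standard library. In this sketch, nfkd_to_ascii is a hypothetical helper written for illustration, not a function exported by the package.

```python
import unicodedata


def nfkd_to_ascii(text):
    """Decompose with NFKD, then drop every code point outside ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(nfkd_to_ascii("Café"))     # Cafe
print(repr(nfkd_to_ascii("€")))  # ''
print(nfkd_to_ascii("™"))        # TM
```

Characters with no decomposition, such as €, simply vanish on this route, whereas iconv-based transliteration may substitute text for them. That is one way the two families of implementations can diverge.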
To ensure consistency, manual transliteration tables ("kludges") handle edge cases:
# Example kludges
'€' → 'EUR' # Euro symbol
'©' → 'C' # Copyright
'®' → 'R' # Registered trademark
'™' → '-TM' # Trademark
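A lookup table of this kind can be applied before transliteration with str.translate. The sketch below contains only the four example mappings listed above; KLUDGES and apply_kludges are hypothetical names, and the real tables are presumably larger.

```python
# Hypothetical table holding only the four example mappings from this README.
KLUDGES = {"€": "EUR", "©": "C", "®": "R", "™": "-TM"}
KLUDGE_TABLE = str.maketrans(KLUDGES)


def apply_kludges(text):
    """Replace each listed symbol with its fixed ASCII stand-in."""
    return text.translate(KLUDGE_TABLE)


print(apply_kludges("Acme™ €5 ©2024"))  # Acme-TM EUR5 C2024
```

Running the same table first in every language means the later transliteration step never sees these symbols, so differences between NFKD and iconv no longer matter for them.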
🧪 Testing
Running Tests
# Run cross-language validation
cd unittests
./validate_slug_scripts datasets/headlines.txt
# Test with specific parameters
./validate_slug_scripts datasets/booktitles.txt 0 '-,_' '0,1'
# Quiet mode (errors only)
./validate_slug_scripts -q datasets/products.txt
Test Datasets
The package includes extensive test datasets:
- headlines.txt - News headlines with special characters
- booktitles.txt - Book titles with Unicode and punctuation
- products.txt - Product names with symbols and numbers
- edge_cases.txt - Boundary conditions and special cases
Unit Tests
# Python unit tests
python -m pytest unittests/test_post_slug.py
# Run all language tests
python unittests/test_post_slug.py
⚠️ Important Notes
Character Set Limitations
Some character sets cannot be transliterated to ASCII and will result in empty strings:
# Cyrillic text returns empty string
post_slug("Привет мир") # ""
# Use with Latin-based alphabets for best results
post_slug("Café résumé") # "cafe-resume"
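Callers that accept arbitrary text may want to detect this case before relying on the slug. The following is a minimal sketch, assuming NFKD-style folding as used on the Python side; has_ascii_content is a hypothetical helper, and iconv-based implementations may behave differently on the same input.

```python
import unicodedata


def has_ascii_content(text):
    """True if NFKD folding leaves at least one ASCII letter or digit."""
    folded = unicodedata.normalize("NFKD", text)
    folded = folded.encode("ascii", "ignore").decode("ascii")
    return any(ch.isalnum() for ch in folded)


print(has_ascii_content("Привет мир"))   # False
print(has_ascii_content("Café résumé"))  # True
```

When the check returns False, a caller could fall back to something else, such as a numeric ID, rather than storing an empty slug.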
Security Considerations
- Input is automatically limited to 255 characters
- All implementations include error handling
- Safe for user-generated content
- No external command execution (except Bash iconv)
Version Compatibility
| Language | Minimum Version | Tested Version |
|---|---|---|
| Python | 3.10 | 3.12 |
| PHP | 8.0 | 8.3 |
| Bash | 5.1 | 5.2 |
| Node.js | 12.2 | 20.x |
🤝 Contributing
We welcome contributions! Please follow these guidelines:
- Consistency is key - Changes must be applied to all language implementations
- Test thoroughly - Run validate_slug_scripts to ensure cross-language compatibility
- Update kludge tables - Submit PRs for new transliteration cases
- Follow conventions - Match the coding style of each language
Development Setup
# Clone the repository
git clone https://github.com/Open-Technology-Foundation/post_slug.git
cd post_slug
# Run tests
cd unittests
./validate_slug_scripts datasets/headlines.txt
# Make changes and verify
# ... edit files ...
./validate_slug_scripts -q datasets/booktitles.txt
📄 License
This project is licensed under the GNU General Public License v3.0 or later - see the LICENSE file for details.
🙏 Acknowledgments
- Inspired by various slug generation libraries across different languages
- Test datasets compiled from real-world content
- Special thanks to all contributors
📚 See Also
- CLAUDE.md - AI assistant guidelines
- AUDIT-EVALUATE.md - Security and code quality audit
- PURPOSE-FUNCTIONALITY-USAGE.md - Detailed documentation
Repository: https://github.com/Open-Technology-Foundation/post_slug
Author: Gary Dean <garydean@okusi.id>
Version: 1.0.1