Convert the display names of hashtags in a Mastodon instance to PascalCase
Mastodon Tag Case Corrector
A Python script for Mastodon instance admins to convert the display names of hashtags to PascalCase, when possible.
There are two possible strategies:
- Find tagged posts to determine the most used casing, and/or
- Split words with wordninja.
Warning
This tool DOES NOT replace due diligence!!! There is NO GUARANTEE that this tool produces correct splittings of hashtags!!! ALWAYS review the log!!!
(However, the tool will not revisit any tags that have been changed either by the tool or manually by the admins.)
Usage
This tool is developed with Python 3.13. Earlier versions are supported on a best-effort basis.
pip install mastodon-tag-case-corrector
# for all options
mastodon-tag-case-corrector -h
# sample usage
mastodon-tag-case-corrector -i mstdn.example -a MY_SECRET_TOKEN
Configuration
Configuration can be done through command-line options or by setting environment variables.
| Variable name | CLI option | Required | Explanation |
|---|---|---|---|
| MASTODON_INSTANCE | -i, --instance [INSTANCE] | Yes | The domain that the Mastodon API is on, e.g. example.com. |
| MASTODON_API_KEY | -a, --auth [AUTH] | Yes | Mastodon API access token with at least admin:read and admin:write permissions. read:statuses is also needed if tagged post analysis is used. To get one, create a new application in User Settings => Development, then open the application detail page and copy "your access token." |
| MASTODON_CHECK_TAG_LANGUAGE | -l, --languages [LANGUAGES ...] | No | A list of language codes, separated by commas. If supplied, before feeding a hashtag into wordninja, the script checks which language has the most recent tagged posts for that hashtag (the window is set by MASTODON_TAGS_DAYS). If that language is one of the supplied languages, the hashtag is processed; otherwise it is ignored. This does not affect tagged post checking, and has no effect if WORDNINJA_DISABLE is enabled. Setting it to en is recommended for best results, because wordninja's default corpus is intended to cover only English words. Note, however, that the /api/v1/dimensions endpoint used to determine a hashtag's language tends to be quite slow on instances, so omitting this option can shorten execution time at the possible expense of accuracy. |
| MASTODON_DO_ALL_TAGS | -t, --all-tags | No | If set to 1, the script processes all tags (from /api/v1/admin/tags), not just trending tags (from /api/v1/admin/trends/tags). Not recommended: it causes performance issues on the instance side, and most hashtags lack the post statistics that language detection and post analysis need (even if MASTODON_TAGS_OFFSET is used in combination). |
| MASTODON_TAGS_OFFSET | -o, --offset [OFFSET] | No | Offset to pass to the first tag-listing request. 0 by default. |
| MASTODON_TAGS_DAYS | --language-days [LANGUAGE_DAYS] | No | How many days of posts to consider for each hashtag to determine its language. 7 by default. |
| MASTODON_CAP_FIRST_LETTER_WHEN_POSSIBLE | -c, --cap | No | For hashtags that consist of only one word, capitalize the first letter if it is most often capitalized in practice (i.e. enforce strict PascalCase even for one word). |
| MASTODON_DISABLE_POST_ANALYSIS | --no-analysis | No | Disable tagged post analysis. |
| MASTODON_DRY_RUN | -d, --dry-run | No | Do not actually edit tags in Mastodon. Recommended for development and testing. |
| WORDNINJA_DISABLE | --no-wordninja | No | Disable wordninja detection. |
| WORDNINJA_DICTIONARY | --dictionary [DICTIONARY] | No | Relative path to a gzipped text file containing a list of words (must be in lower case) in descending order of importance. If not provided, wordninja's default corpus is used (note that it does not include fediverse jargon). |
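A custom dictionary for WORDNINJA_DICTIONARY could be produced with a few lines of Python. This is a sketch under assumptions: the helper name `write_dictionary`, the file name `my_words.txt.gz`, and the sample words are hypothetical, and the one-word-per-line layout is assumed from the format of wordninja's bundled word list rather than stated by this project.

```python
import gzip
from pathlib import Path


def write_dictionary(words: list[str], path: Path) -> None:
    """Write words, most important first, lower-cased, one per line, gzipped."""
    lines = [w.strip().lower() for w in words if w.strip()]
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")


# Hypothetical list: common words first, fediverse jargon mixed in by importance.
write_dictionary(
    ["the", "of", "mastodon", "fediverse", "Toot"],
    Path("my_words.txt.gz"),
)
```

The resulting file can then be passed with `--dictionary my_words.txt.gz`. A useful dictionary needs far more entries than this sample, since words that are missing from it cannot be recognized when splitting.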
Paths are relative to the working folder. For boolean arguments, the equivalent environment variable should be set to 1 for true.
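For scripted runs, the environment variables can be assembled in Python before launching the tool. The sketch below is illustrative only: `build_env` and `run_corrector` are hypothetical helper names, and it assumes the tool reads every setting from the environment when no command-line options are given.

```python
import os
import subprocess


def build_env(
    instance: str,
    token: str,
    *,
    languages: tuple[str, ...] = (),
    dry_run: bool = True,
) -> dict[str, str]:
    """Return a copy of the current environment with the tool's variables set."""
    env = dict(os.environ)
    env["MASTODON_INSTANCE"] = instance
    env["MASTODON_API_KEY"] = token
    if languages:
        # Language codes are separated by commas.
        env["MASTODON_CHECK_TAG_LANGUAGE"] = ",".join(languages)
    if dry_run:
        # Boolean options are enabled by setting the variable to 1.
        env["MASTODON_DRY_RUN"] = "1"
    return env


def run_corrector(env: dict[str, str]) -> subprocess.CompletedProcess:
    """Launch the installed CLI with the prepared environment."""
    return subprocess.run(["mastodon-tag-case-corrector"], env=env, check=False)


env = build_env("mstdn.example", "MY_SECRET_TOKEN", languages=("en",))
# run_corrector(env)  # uncomment where the tool is installed
```

Keeping `dry_run=True` as the default means a forgotten argument never edits tags by accident.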
License
Copyright 2024-2025 Austin Huang im@austinhuang.me (https://austinhuang.me). Apache License 2.0.