Build fcitx5/RIME dictionaries from MediaWiki sites
[!NOTE] For the pre-built dictionary for Moegirlpedia (zh.moegirl.org.cn), see the wiki.
[!WARNING]
mw2fcitx 0.20.0 contains breaking changes, mainly related to Traditional/Simplified Chinese conversion. See BREAKING_CHANGES.md for more information.
mw2fcitx
Build fcitx5/RIME dictionaries from MediaWiki sites.
pip install mw2fcitx
# or, to install for the current user only
pip install mw2fcitx --user
# or, to run it without installing (requires pipx)
pipx run mw2fcitx
# or, if you need OpenCC for text conversion
pip install "mw2fcitx[opencc]"
CLI Usage
mw2fcitx -c config_script.py
Configuration Script Format
from mw2fcitx.tweaks.moegirl import tweaks
# By default, the configuration is read from a variable
# called "exports".
# You can change this with `-n any_name` in the CLI.
exports = {
    # Source configurations.
    "source": {
        # MediaWiki api.php URL, used when fetching titles online.
        "api_path": "https://zh.moegirl.org.cn/api.php",
        # Title file path, used when reading titles from local files. (optional)
        # Can be a path or a list of paths.
        "file_path": ["titles.txt"],
        "kwargs": {
            # Maximum number of titles to fetch. (optional)
            "title_limit": 120,
            # Maximum number of titles to fetch via the API. (optional)
            # Overrides title_limit.
            "api_title_limit": 120,
            # Maximum number of titles to read from each file. (optional)
            # Overrides title_limit.
            "file_title_limit": 60,
            # Partial session file, used on exception. (optional)
            "partial": "partial.json",
            # Title list export path. (optional)
            "output": "titles.txt",
            # Delay between MediaWiki API requests, in seconds. (optional)
            "request_delay": 2,
            # Deprecated. Please use `source.kwargs.api_params.aplimit` instead. (optional)
            "aplimit": "max",
            # Override ALL parameters while calling the MediaWiki API.
            "api_params": {
                # Results per API request; same as `aplimit` in the MediaWiki docs. (optional)
                "aplimit": "max"
            },
            # User-Agent used while requesting the API. (optional)
            "user_agent": "MW2Fcitx/development"
        }
    },
    # Tweak configurations, as a list.
    # Every tweak function accepts a list of titles and returns
    # a list of titles.
    "tweaks":
        tweaks,
    # Converter configurations.
    "converter": {
        # pypinyin is a built-in converter.
        # For a custom converter function, pass the function itself.
        "use": "pypinyin",
        "kwargs": {
            # Replace "m" with "mu" and "n" with "en". Default: False.
            # See https://github.com/outloudvi/mw2fcitx/issues/29 for details.
            "disable_instinct_pinyin": False,
            # Pinyin results to replace. (optional)
            # Format: { "汉字": "pin'yin" }
            # The result is passed to `pypinyin` as a phrase, so words containing this phrase are also affected.
            "fixfile": "fixfile.json",
            # Characters to omit during pinyin conversion. (optional)
            # These characters are removed automatically before converting to pinyin.
            # As a result, words containing these characters will not be skipped in the dictionary.
            "characters_to_omit": ["·"],
        }
    },
    # Generator configurations.
    "generator": [{
        # rime is a built-in generator.
        # For a custom generator function, pass the function itself.
        "use": "rime",
        "kwargs": {
            # Destination dictionary filename. (optional)
            "output": "moegirl.dict.yml"
        }
    }, {
        # pinyin is a built-in generator.
        # This generator depends on `libime`.
        "use": "pinyin",
        "kwargs": {
            # Destination dictionary filename. (mandatory)
            "output": "moegirl.dict"
        }
    }]
}
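The `tweaks` entry is described as a list of functions, each taking a list of titles and returning a list of titles. As a rough sketch of what a hand-written tweak list might look like, the two functions below are hypothetical and not part of mw2fcitx. Running them in list order is also an assumption made for this illustration.

```python
import re
from typing import List


def strip_parenthetical(titles: List[str]) -> List[str]:
    """Hypothetical tweak: drop a trailing "(...)" or "(...)" suffix."""
    pattern = re.compile(r"\s*[((][^()()]*[))]\s*$")
    return [pattern.sub("", title) for title in titles]


def drop_short_titles(titles: List[str]) -> List[str]:
    """Hypothetical tweak: discard titles shorter than two characters."""
    return [title for title in titles if len(title.strip()) >= 2]


# A list like this could be passed as the "tweaks" value in `exports`.
my_tweaks = [strip_parenthetical, drop_short_titles]

# Apply the tweaks one after another (the order is an assumption of this sketch).
titles = ["Foo (game)", "A", "初音未来(歌手)"]
for tweak in my_tweaks:
    titles = tweak(titles)
print(titles)  # ['Foo', '初音未来']
```

Keeping each tweak small and pure (list in, list out) makes it easy to test on a few sample titles before running a full fetch.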
A sample config file is here: sample_config.py
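The `fixfile` option expects a JSON file in the form `{ "汉字": "pin'yin" }`. The snippet below is a minimal sketch of producing such a file with the standard library. The two entries are illustrative examples chosen here, not taken from the project.

```python
import json

# Illustrative overrides: each key is a phrase, each value is its pinyin
# with syllables separated by apostrophes, as in the documented format.
fixes = {
    "银行": "yin'hang",
    "重庆": "chong'qing",
}

# ensure_ascii=False keeps the Chinese keys readable in the file.
with open("fixfile.json", "w", encoding="utf-8") as f:
    json.dump(fixes, f, ensure_ascii=False, indent=2)

# Read the file back to confirm it round-trips as expected.
with open("fixfile.json", encoding="utf-8") as f:
    loaded = json.load(f)
print(loaded == fixes)  # True
```

The resulting path can then be given as `"fixfile": "fixfile.json"` in the converter's `kwargs`.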
Advanced mode
Because mw2fcitx can append to and override MediaWiki API parameters, it can collect other kinds of lists besides allpages. Note that if list, action, or format is overridden in api_params, mw2fcitx will not automatically append any default parameter (except format) when sending MediaWiki API requests, so you must supply the required parameters yourself. A configuration in the tests may serve as a reference.
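As an illustration of advanced mode, the configuration below sketches how `api_params` might request category members instead of allpages. The parameter names (`list=categorymembers`, `cmtitle`, `cmlimit`) come from the MediaWiki API. The category name, the output filename, the empty `tweaks` list, and the empty converter `kwargs` are assumptions made for this example, and it has not been verified against mw2fcitx itself.

```python
# Sketch only: since "list" is overridden, every required parameter
# is spelled out here rather than relying on defaults.
exports = {
    "source": {
        "api_path": "https://zh.moegirl.org.cn/api.php",
        "kwargs": {
            "title_limit": 120,
            "request_delay": 2,
            "api_params": {
                "action": "query",
                "format": "json",
                "list": "categorymembers",
                # Hypothetical category name; replace with a real one.
                "cmtitle": "Category:Example",
                "cmlimit": "max",
            },
        },
    },
    # Assumption: an empty list means no tweaks are applied.
    "tweaks": [],
    "converter": {"use": "pypinyin", "kwargs": {}},
    "generator": [{
        "use": "rime",
        # Hypothetical output filename.
        "kwargs": {"output": "category.dict.yml"},
    }],
}
```

Starting with a small `title_limit` is a cheap way to check that the overridden parameters return the expected titles before a full run.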
Breaking changes across versions
Read BREAKING_CHANGES.md for details.
License