Python utilities for SignWriting.
SignWriting
Installation
pip install git+https://github.com/sign-language-processing/signwriting
Or with Docker:
docker build --platform linux/amd64 --tag signwriting:python .
docker run --platform linux/amd64 --rm -p 9090:8080 -e PORT=8080 signwriting:python
Utilities
signwriting.formats
This module provides utilities for converting between different formats of SignWriting. We include a few examples:
- To parse an FSW string into a Sign object, representing the sign as a dictionary:
from signwriting.formats.fsw_to_sign import fsw_to_sign
fsw_to_sign("M123x456S1f720487x492")
# {'box': {'symbol': 'M', 'position': (123, 456)}, 'symbols': [{'symbol': 'S1f720', 'position': (487, 492)}]}
- To convert a SignWriting string in SWU format to FSW format:
from signwriting.formats.swu_to_fsw import swu2fsw
swu2fsw('𝠃𝤟𝤩𝣵𝤐𝤇𝣤𝤐𝤆𝣮𝣭')
# M525x535S2e748483x510S10011501x466S2e704510x500S10019476x475
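To make the layout of an FSW string concrete, here is a minimal standalone sketch that reproduces the dictionary from the first example using only the standard library. It is not the package's implementation: the name parse_fsw and both regular expressions are assumptions made for illustration. The sketch handles a single sign (one box marker followed by positioned symbols) and skips over any 'A' prefix that precedes the box.

```python
import re

# Illustrative patterns (assumptions, not taken from the signwriting package).
# A box is one marker letter (B, L, M or R) followed by "XXXxYYY" coordinates.
BOX_RE = re.compile(r"([BLMR])(\d{3})x(\d{3})")
# A positioned symbol is "S" plus five hex digits, then "XXXxYYY" coordinates.
SYMBOL_RE = re.compile(r"(S[0-9a-f]{5})(\d{3})x(\d{3})")


def parse_fsw(fsw: str) -> dict:
    """Hypothetical helper: split one FSW sign into its box and symbols."""
    box = BOX_RE.search(fsw)
    if box is None:
        raise ValueError(f"no box marker found in {fsw!r}")
    # Only look for positioned symbols after the box marker.
    symbols = [
        {"symbol": m.group(1), "position": (int(m.group(2)), int(m.group(3)))}
        for m in SYMBOL_RE.finditer(fsw, box.end())
    ]
    return {
        "box": {
            "symbol": box.group(1),
            "position": (int(box.group(2)), int(box.group(3))),
        },
        "symbols": symbols,
    }


print(parse_fsw("M123x456S1f720487x492"))
# {'box': {'symbol': 'M', 'position': (123, 456)}, 'symbols': [{'symbol': 'S1f720', 'position': (487, 492)}]}
```

For real use, prefer fsw_to_sign from the package, which is the supported entry point.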
signwriting.tokenizer
This module provides utilities for tokenizing SignWriting strings for use in NLP tasks[^1]. We include a few non-exhaustive usage examples:
- To tokenize a SignWriting string into a list of tokens:
from signwriting.tokenizer import SignWritingTokenizer
tokenizer = SignWritingTokenizer()
fsw = 'M123x456S1f720487x492S1f720487x492'
tokens = list(tokenizer.text_to_tokens(fsw, box_position=True))
# ['M', 'p123', 'p456', 'S1f7', 'c2', 'r0', 'p487', 'p492', 'S1f7', 'c2', 'r0', 'p487', 'p492']
- To convert a list of tokens back to a SignWriting string:
tokenizer.tokens_to_text(tokens)
# M123x456S1f720487x492S1f720487x492
- For machine learning purposes, we can convert the tokens to a list of integers:
tokenizer.tokenize(fsw, bos=False, eos=False)
# [6, 932, 932, 255, 678, 660, 919, 924, 255, 678, 660, 919, 924]
- To remove the 'A' (temporal sequence) prefix and separate signs with spaces:
from signwriting.tokenizer import normalize_signwriting
normalize_signwriting(fsw)
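The token list in the first example suggests how a sign is factored: the box marker, each coordinate as a p token, and each symbol split into its base (S plus three hex digits), a c token for the fifth character (the fill), and an r token for the sixth (the rotation). The following standalone sketch reproduces that factoring and its inverse for a single sign. It is an illustration only: the function names and regular expressions are assumptions, not the package's code, and it does not cover the integer vocabulary used by tokenize.

```python
import re

# Illustrative patterns (assumptions, not the package's own code).
BOX_RE = re.compile(r"([BLMR])(\d{3})x(\d{3})")
SYMBOL_RE = re.compile(r"(S[0-9a-f]{3})([0-9a-f])([0-9a-f])(\d{3})x(\d{3})")


def sketch_text_to_tokens(fsw: str) -> list:
    """Hypothetical helper: split one FSW sign into fine-grained tokens."""
    box = BOX_RE.match(fsw)
    if box is None:
        raise ValueError(f"expected a box marker at the start of {fsw!r}")
    tokens = [box.group(1), f"p{box.group(2)}", f"p{box.group(3)}"]
    for m in SYMBOL_RE.finditer(fsw, box.end()):
        base, fill, rotation, x, y = m.groups()
        tokens += [base, f"c{fill}", f"r{rotation}", f"p{x}", f"p{y}"]
    return tokens


def sketch_tokens_to_text(tokens: list) -> str:
    """Hypothetical inverse: drop the one-letter prefixes and re-join."""
    text = ""
    expecting_y = False
    for token in tokens:
        if token.startswith("p"):
            # Positions come in (x, y) pairs; FSW writes them as "XXXxYYY".
            text += ("x" if expecting_y else "") + token[1:]
            expecting_y = not expecting_y
        elif token[0] in "cr":
            text += token[1:]  # fill or rotation digit
        else:
            text += token  # box marker or symbol base, kept verbatim
    return text


fsw = "M123x456S1f720487x492S1f720487x492"
tokens = sketch_text_to_tokens(fsw)
print(tokens)
print(sketch_tokens_to_text(tokens) == fsw)
```

Splitting a symbol into base, fill, and rotation keeps the vocabulary small, since the same c, r, and p tokens are shared across all symbols.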
signwriting.visualizer
This module is used to visualize SignWriting strings as images. Unlike sutton-signwriting/font-db, on which it is based, this module does not support custom styling. Benchmarks show that this module is ~5000x faster than the original implementation.
from signwriting.visualizer.visualize import signwriting_to_image
fsw = "AS10011S10019S2e704S2e748M525x535S2e748483x510S10011501x466S20544510x500S10019476x475"
signwriting_to_image(fsw)
To use the visualizer with the server, you can hit: https://signwriting-sxie2r74ua-uc.a.run.app//visualizer?fsw=M525x535S2e748483x510S10011501x466S2e704510x500S10019476x475
signwriting.utils
This module includes general utilities that were not covered in the other modules.
join_signs joins a list of signs into a single sign. This is useful, for example, for fingerspelling words out of individual character signs.
from signwriting.utils.join_signs import join_signs_vertical
char_a = 'M507x507S1f720487x492'
char_b = 'M507x507S14720493x485'
result_sign = join_signs_vertical(char_a, char_b)
# M510x518S1f720490x481S14720496x496
signwriting.fingerspelling
This module is used to generate spelling data from a list of characters.
from signwriting.fingerspelling.fingerspelling import spell
word = "Hello" # any string of characters
language = "en-us-ase-asl" # long language code, as defined in the fingerspelling README
spell(word, language)
# M515x563S11502477x437S14a20492x457S1dc20484x477S1dc20484x512S17620492x547
To use the fingerspelling with the server, you can hit: https://signwriting-sxie2r74ua-uc.a.run.app//fingerspelling?text=hello&signed_language=ase
signwriting.mouthing
This module is used to generate SpeechWriting from spoken words.
from signwriting.mouthing.mouthing import mouth
word = "Hello" # any string of characters, preferably valid words
language = "eng-Latn" # supported languages under "Language Support" at https://pypi.org/project/epitran/
mouth(word, language)
# M557x518S34700443x482S35c00469x482S34400495x482S34d00521x482
Note: Installing English support for epitran requires extra steps; see "Install flite" in mouthing/README.md.
To use the mouthing with the server, you can hit: https://signwriting-sxie2r74ua-uc.a.run.app//mouthing?text=hello&spoken_language=eng-Latn
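The three server URLs in this README (visualizer, fingerspelling, and mouthing) share one shape: an endpoint name followed by query parameters. The sketch below builds them with the standard library so that parameter values are percent-encoded correctly. The helper name build_server_url is hypothetical, and the prefix is copied verbatim from the URLs above, doubled slash included. The sketch only constructs URLs; it sends no request and makes no assumption about the response format.

```python
from urllib.parse import urlencode

# Prefix copied verbatim from the URLs in this README, including the
# doubled slash that precedes the endpoint name.
SERVER_PREFIX = "https://signwriting-sxie2r74ua-uc.a.run.app//"


def build_server_url(endpoint: str, **params: str) -> str:
    """Hypothetical helper: build a GET URL for one of the endpoints."""
    return f"{SERVER_PREFIX}{endpoint}?{urlencode(params)}"


visualizer_url = build_server_url(
    "visualizer",
    fsw="M525x535S2e748483x510S10011501x466S2e704510x500S10019476x475",
)
fingerspelling_url = build_server_url(
    "fingerspelling", text="hello", signed_language="ase"
)
mouthing_url = build_server_url(
    "mouthing", text="hello", spoken_language="eng-Latn"
)

for url in (visualizer_url, fingerspelling_url, mouthing_url):
    print(url)
```

Using urlencode matters once the text contains spaces or non-ASCII characters, which cannot be pasted into a query string unescaped.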
Cite
@misc{moryossef2024-signwriting,
title={Utilities for SignWriting},
author={Moryossef, Amit},
howpublished={\url{https://github.com/sign-language-processing/signwriting}},
year={2024}
}
References
[^1]: Amit Moryossef, Zifan Jiang.