Groningen Translation Environment
GroTE: Groningen Translation Environment 🐮
Demo example
An online GroTE demo is available at https://gsarti-grote.hf.space. Use admin as the login code and upload one of the files in assets/examples to try the editing interface. The demo logs events to the repository grote/grote-logs.
https://github.com/user-attachments/assets/e31d0841-a480-4013-9f9f-2ee8f3885fc5
Running GroTE locally
- Install requirements: pip install -r requirements.txt
- Make sure you have a local npm installation available to run the front-end.
- Edit the GroTE config to set your custom login_codes and event_logs_hf_dataset_id. By default, you will be able to access the demo using the admin code, and logs will be written to a local logs directory and synchronized with a private grote-logs dataset on your user profile in the Hugging Face Hub.
- Run grote in your command line to start the server. You will need a Hugging Face token with Write permissions to log edits.
- Visit http://127.0.0.1:7860 to access the demo.
- Enter your login code and load an example document from assets/examples.
- Press "📝 Start" to begin editing the document.
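Before starting the server, it can help to confirm the two prerequisites listed above. The following is a minimal sketch, assuming the write token is exposed through an HF_TOKEN environment variable (the name used for the Spaces secret; whether a local run reads the token from there or from a cached Hub login is not confirmed here). The check_prerequisites helper is hypothetical and not part of GroTE.

```python
import os
import shutil


def check_prerequisites(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # The front-end needs npm to be reachable on the PATH.
    if which("npm") is None:
        problems.append("npm not found on PATH")
    # Assumption: the write token is provided via the HF_TOKEN variable.
    if not env.get("HF_TOKEN"):
        problems.append("HF_TOKEN is not set")
    return problems


for problem in check_prerequisites():
    print(f"Missing prerequisite: {problem}")
```

The env and which parameters exist only so the check can be exercised without touching the real environment.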
Setting up a new GroTE instance on HF Spaces
- Use the "Duplicate this space" option from the original GroTE demo to create a copy in your user/organization profile.
- In Settings > Variables and secrets, change the default values of EVENT_LOGS_HF_DATASET_ID, HF_TOKEN and LOGIN_CODES to your desired values (see GroTE config for more details).
- Upon running the app and starting the editing, you should see the logs being written to the dataset whose id is specified in EVENT_LOGS_HF_DATASET_ID.
Use or modify the following code to create multiple copies of the app programmatically:
    from huggingface_hub import duplicate_space, SpaceHardware

    NUM_TRANSLATORS = 5
    USER_OR_ORG = "<your_username_or_organization>"
    YOUR_HF_TOKEN = "hf_<your_token>"

    # One Space per translator: translator-1 ... translator-N
    names = [f"translator-{idx}" for idx in range(1, NUM_TRANSLATORS + 1)]

    for name in names:
        duplicate_space(
            from_id="gsarti/grote",
            to_id=f"{USER_OR_ORG}/grote-{name}",
            private=False,
            token=YOUR_HF_TOKEN,
            hardware=SpaceHardware.CPU_BASIC,
            # Secrets are hidden in the Space settings after creation.
            secrets=[
                {
                    "key": "HF_TOKEN",
                    "value": YOUR_HF_TOKEN,
                    "description": "Hugging Face token for logging purposes",
                },
                {
                    "key": "LOGIN_CODES",
                    "value": f"{name.lower()},admin",
                    "description": "List of login codes for the users",
                },
            ],
            # Variables are public configuration values of the Space.
            variables=[
                {
                    "key": "MAX_NUM_SENTENCES",
                    "value": "50",
                },
                {
                    "key": "EVENT_LOGS_SAVE_FREQUENCY",
                    "value": "50",
                },
                {
                    "key": "EVENT_LOGS_HF_DATASET_ID",
                    "value": f"{USER_OR_ORG}/grote-{name}",
                },
                {
                    "key": "EVENT_LOGS_LOCAL_DIR",
                    "value": "logs",
                },
                {
                    "key": "ALLOWED_TAGS",
                    "value": "minor,major",
                },
                {
                    "key": "TAG_LABLES",
                    "value": "Minor,Major",
                },
                {
                    "key": "TAG_COLORS",
                    "value": "#ffedd5,#fcd29a",
                },
            ],
        )

    # Print the URL and personal login code to hand out to each translator.
    for name in names:
        print(f"URL: https://{USER_OR_ORG}-grote-{name}.hf.space\nLogin code: {name.lower()}")
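Before contacting the Hub, the naming scheme used by the script above can be previewed offline. The sketch below only rebuilds the Space ids, URLs, and login codes from the same f-strings. plan_spaces is a hypothetical helper and "my-org" is a placeholder; the URL pattern is the one printed by the script and is assumed to hold without further normalization of the names.

```python
def plan_spaces(user_or_org, num_translators):
    """Preview Space ids, URLs and login codes without contacting the Hub."""
    plan = []
    for idx in range(1, num_translators + 1):
        name = f"translator-{idx}"
        plan.append(
            {
                "to_id": f"{user_or_org}/grote-{name}",
                "url": f"https://{user_or_org}-grote-{name}.hf.space",
                "login_code": name.lower(),
            }
        )
    return plan


# "my-org" is a placeholder, not a real organization.
for entry in plan_spaces("my-org", 2):
    print(entry["to_id"], entry["url"], entry["login_code"])
```

Reviewing this list first makes it easier to catch a wrong user or organization name before any Space is created.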
Editing flow with GroTE
1. Open the webpage of the GroTE interface.
2. Insert the provided login code.
3. Load one of the provided files.
4. Press “📝 Start”.
5. Perform the editing. If needed, use green checkmarks to remove highlights from a segment.
6. When all segments for the file are finished, click “✅ Done”.
7. A message “Saving trial information. Don't close the tab until the download button is available!” will appear. Do not close the tab.
8. When the message “Saving complete! Download the output file by clicking the 'Download translations' button below.” appears, click “📥 Download translations” to download the edited files. The file will have the name <LOGIN CODE>_<FILENAME>_output.txt
9. Click “⬅️ Back to data loading” to return to the file loading page.
10. If needed, pause and take a break.
Steps 2-9 are repeated for each file, which represents a standalone document with ordered segments.
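When collecting results from several translators, the output naming pattern can be used to group the downloaded files. This is a minimal sketch, assuming that login codes contain no underscore (as with the translator-N codes generated earlier) and that <FILENAME> is kept verbatim in the output name. parse_output_name is a hypothetical helper, not a GroTE function.

```python
SUFFIX = "_output.txt"


def parse_output_name(output_name):
    """Split '<LOGIN CODE>_<FILENAME>_output.txt' into (login_code, filename)."""
    if not output_name.endswith(SUFFIX):
        raise ValueError(f"not a GroTE output file name: {output_name!r}")
    stem = output_name[: -len(SUFFIX)]
    # Assumption: the login code is everything before the first underscore.
    login_code, sep, filename = stem.partition("_")
    if not sep or not login_code or not filename:
        raise ValueError(f"cannot split login code and file name: {output_name!r}")
    return login_code, filename


print(parse_output_name("translator-1_doc_a_output.txt"))
```

If login codes may contain underscores, the split would need a list of known codes instead of the first-underscore rule.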
Future developments
While the current version of GroTE is functional, there are several improvements that could be made to enhance the user experience and functionality. I am unlikely to implement these changes in the near future, but I am happy to provide guidance and support to anyone interested in contributing to the project.
- Separate rendering logic for loading/editing tabs (see ICLR 2024 Papers interface for an example)
- Use the latest Gradio version to integrate features like multi-page structure, client-side functions, and dynamic rendering of components.
- Enable restoring the previous state of edited sentences if matching filename and user are found in the logs in the past 24 hours (with a modal to enable starting from scratch).
- Possibly rethink logging format to reduce redundancy and improve readability.
- Add optional tab to visualize the editing process (e.g., highlighted diffs between original and edited sentences, replay of editing process by looping .then with time.sleep, download scoped logs for single text).
- Change saving logic to use BackgroundScheduler.
- Change transition from editing to loading to preserve login code and possibly allow the pre-loading of several files for editing (would require a custom FileExplorer component to mark done documents).
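As a possible starting point for the diff visualization idea, the standard library's difflib can mark word-level changes between an original and an edited sentence. This is only a sketch, assuming whitespace tokenization is good enough; the [-...-] and {+...+} markers and the word_diff name are illustrative choices, not part of GroTE.

```python
from difflib import SequenceMatcher


def word_diff(original, edited):
    """Mark deleted words as [-x-] and inserted words as {+x+}."""
    src, tgt = original.split(), edited.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(src[i1:i2])
            continue
        # A "replace" is rendered as a deletion followed by an insertion.
        if op in ("delete", "replace"):
            out.append("[-" + " ".join(src[i1:i2]) + "-]")
        if op in ("insert", "replace"):
            out.append("{+" + " ".join(tgt[j1:j2]) + "+}")
    return " ".join(out)


print(word_diff("The cat sat on the mat", "The cat sits on the mat"))
```

In an interface, the same opcodes could drive colored spans instead of text markers.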
Questions and feedback
If you have any questions or feedback, please feel free to reach out to me at gabriele.sarti996@gmail.com.