LSA-T: The first continuous LSA dataset
LSA-T is the first continuous Argentinian Sign Language (LSA) dataset. It contains 14,880 sentence-level videos of LSA extracted from the CN Sordos YouTube channel, with label and keypoint annotations for each signer. Videos are full HD (1920x1080) at 30 FPS.
- Download link (45GB compressed)
- Visualization notebook
- Presentation paper (TO-DO)
Format
Samples are organized in directories according to the playlist and video they belong to. For each sample i there are four files:
- i.mp4: the clip corresponding to the ith line of subtitles.
- i.json contains:
  - label: the line of subtitles corresponding to the clip.
  - start: time in seconds at which the subtitle starts.
  - end: time in seconds at which the subtitle ends.
  - video: title of the video the clip belongs to.
  - playlist: title of the playlist the clip belongs to.
- i_ap.json: the raw AlphaPose results over the clip, using Halpe keypoints in AlphaPose's default output format.
- i_signer.json contains:
  - scores: for each person in the clip, the amount of "movement" in their hands. It is used to infer who the signer is.
  - roi: the region of interest of the clip (bounding box of the inferred signer).
  - keypoints: list of keypoints of the inferred signer for each frame, in the same format as in i_ap.json.
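The file layout above can be read with the standard library alone. The following is a minimal sketch, assuming that i is the literal file stem (for example 3.json and 3_signer.json) and that keypoints holds one entry per frame. The helper name load_sample and the shape of the returned dictionary are illustrative and not part of the package.

```python
import json
from pathlib import Path


def load_sample(sample_dir, i):
    """Load the annotation files of sample `i` found in `sample_dir`.

    Hypothetical helper: it assumes the files are named i.mp4, i.json
    and i_signer.json, as described in the Format section.
    """
    sample_dir = Path(sample_dir)
    with open(sample_dir / f"{i}.json", encoding="utf-8") as f:
        meta = json.load(f)
    with open(sample_dir / f"{i}_signer.json", encoding="utf-8") as f:
        signer = json.load(f)
    return {
        "clip_path": sample_dir / f"{i}.mp4",
        "label": meta["label"],
        # start and end are given in seconds, so this is the clip length
        "duration": meta["end"] - meta["start"],
        "roi": signer["roi"],
        # assumption: one keypoint entry per frame of the inferred signer
        "n_frames": len(signer["keypoints"]),
    }
```

The raw i_ap.json file is left out on purpose, since its structure is whatever AlphaPose emits by default and is not described here.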
Usage
This repository can be installed via pip and contains the LSA_Dataset class (in the lsat.dataset.LSA_Dataset module). This class inherits from the PyTorch Dataset class and implements all the methods needed to use it with a PyTorch DataLoader. It also manages the downloading and extraction of the dataset.
Useful transforms for the clips and keypoints are also provided in lsat.dataset.transforms.
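The constructor arguments of LSA_Dataset are not documented here, so the sketch below does not call it. Instead it shows a hypothetical stand-in, ClipIndex, that implements the __len__ and __getitem__ protocol which PyTorch map-style datasets follow, over an already extracted copy of the data. The class name, the directory walk and the returned tuple are all assumptions made for illustration.

```python
import json
from pathlib import Path


class ClipIndex:
    """Hypothetical stand-in for a map-style dataset over extracted samples.

    This is NOT lsat's LSA_Dataset. It only illustrates the
    __len__/__getitem__ protocol that map-style datasets implement.
    """

    def __init__(self, root):
        # Keep only the i.json files; skip i_ap.json and i_signer.json.
        self.meta_paths = sorted(
            p for p in Path(root).rglob("*.json")
            if not p.stem.endswith(("_ap", "_signer"))
        )

    def __len__(self):
        return len(self.meta_paths)

    def __getitem__(self, idx):
        meta_path = self.meta_paths[idx]
        with open(meta_path, encoding="utf-8") as f:
            label = json.load(f)["label"]
        # The clip sits next to its metadata file and shares its stem.
        return meta_path.with_suffix(".mp4"), label
```

An object like this returns the clip path rather than decoded frames; decoding and any keypoint handling would be the job of transforms such as those the package provides.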
Statistics and comparison with other DBs
| | LSA-T | PHOENIX* | SIGNUM | CSL | GSL | KETI |
|---|---|---|---|---|---|---|
| language | Spanish | German | German | Chinese | Greek | Korean |
| sign language | LSA | DGS | DGS | CSL | GSL | KSL |
| real life | Yes | Yes | No | No | No | No |
| signers | 103 | 9 | 25 | 50 | 7 | 14 |
| duration (h) | 21.78 | 10.71 | 55.3 | 100+ | 9.51 | 28 |
| # samples | 14,880 | 7096 | 33,210 | 25,000 | 10,295 | 14,672 |
| # unique sentences | 14,254 | 5672 | 780 | 100 | 331 | 105 |
| % unique sentences | 95.79% | 79.93% | 2.35% | 0.4% | 3.21% | 0.71% |
| vocab. size (w) | 14,239 | 2887 | N/A | 178 | N/A | 419 |
| # singletons (w) | 7150 | 1077 | 0 | 0 | 0 | 0 |
| % singletons (w) | 50.21% | 37.3% | 0% | 0% | 0% | 0% |
| vocab. size (gl) | - | 1066 | 450 | - | 310 | 524 |
| # singletons (gl) | - | 337 | 0 | - | 0 | 0 |
| % singletons (gl) | - | 31.61% | 0% | - | 0% | 0% |
| resolution | 1920x1080 | 210x260 | 776x578 | 1920x1080 | 848x480 | 1920x1080 |
| fps | 30 | 25 | 30 | 30 | 30 | 30 |
*Data was not available for the whole PHOENIX dataset, so the table shows its train set statistics.
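The percentage rows follow from the count rows: for LSA-T, 14,254 / 14,880 ≈ 95.79% unique sentences and 7150 / 14,239 ≈ 50.21% singleton words. The sketch below computes the same word-level (w) quantities from a list of subtitle lines. Lowercasing and whitespace tokenization are assumptions, since the tokenization used for the table is not stated, and the function name corpus_stats is illustrative.

```python
from collections import Counter


def corpus_stats(sentences):
    """Sentence- and word-level statistics for a list of subtitle lines.

    Assumption: words are obtained by lowercasing and splitting on
    whitespace; the tokenization behind the table may differ.
    """
    n = len(sentences)
    unique = len(set(sentences))
    counts = Counter(w for s in sentences for w in s.lower().split())
    # A singleton is a word that occurs exactly once in the whole corpus.
    singletons = sum(1 for c in counts.values() if c == 1)
    return {
        "samples": n,
        "unique_sentences": unique,
        "pct_unique": 100 * unique / n if n else 0.0,
        "vocab_size": len(counts),
        "singletons": singletons,
        "pct_singletons": 100 * singletons / len(counts) if counts else 0.0,
    }
```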
Evaluation splits
| | LSA-T | Full version (train) | Full version (test) | Reduced version (train) | Reduced version (test) |
|---|---|---|---|---|---|
| signers | 103 | X | X | X | X |
| duration (h) | 21.78 | 17.49 | 4.29 | 15.85 | 3.89 |
| # sentences | 14,880 | 11,065 | 2735 | 3767 | 910 |
| % unique sentences | 95.79% | 96.64% | 92.78% | 96.88% | 98.35% |
| vocab. size | 14,239 | 12,385 | 5546 | 2694 | 1579 |
| % singletons | 50.21% | 52.01% | 61.9% | 23.2% | 48.83% |
| % sentences with singletons | 34.97% | 40.98% | 67.97% | 14.36% | 54.29% |
| % sentences with words not in train vocabulary | - | - | 59.2% | - | 84.5% |
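The last two rows of the table can be read as sentence-level rates. The sketch below is one possible reading, assuming that singletons are counted within each split separately and that words are lowercased and split on whitespace. Neither assumption is confirmed by the source, and both function names are illustrative.

```python
from collections import Counter


def _tokens(sentence):
    # Assumed tokenization: lowercase, split on whitespace.
    return sentence.lower().split()


def pct_sentences_with_singletons(sentences):
    """Share of sentences containing a word that occurs once in this split."""
    counts = Counter(w for s in sentences for w in _tokens(s))
    hits = sum(1 for s in sentences if any(counts[w] == 1 for w in _tokens(s)))
    return 100 * hits / len(sentences) if sentences else 0.0


def pct_sentences_with_oov(train, test):
    """Share of test sentences with a word absent from the train vocabulary."""
    train_vocab = {w for s in train for w in _tokens(s)}
    hits = sum(1 for s in test if any(w not in train_vocab for w in _tokens(s)))
    return 100 * hits / len(test) if test else 0.0
```

Under this reading, a high out-of-vocabulary rate on the test split means most test sentences contain a word that a model trained on the train split has never seen.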
Citation
TO-DO