LSA-T: The first continuous LSA dataset
LSA-T is the first continuous Argentinian Sign Language (LSA) dataset. It contains 14,880 sentence-level videos of LSA extracted from the CN Sordos YouTube channel, with label and keypoint annotations for each signer. Videos are 30 FPS, full HD (1920x1080).
- Download link (45GB compressed)
- Visualization notebook
- Presentation paper (TO-DO)
Format
Samples are organized in directories according to the playlist and video they belong to. For each sample `i` there are four files:
- `i.mp4`: the clip corresponding to the ith line of subtitles.
- `i.json`, which contains:
  - `label`: the line of subtitles corresponding to the clip.
  - `start`: time in seconds where the subtitle starts.
  - `end`: time in seconds where the subtitle ends.
  - `video`: title of the video the clip belongs to.
  - `playlist`: title of the playlist the clip belongs to.
- `i_ap.json`: the raw AlphaPose results over the clip, using Halpe keypoints in AlphaPose's default output format.
- `i_signer.json`, which contains:
  - `scores`: for each person in the clip, the amount of "movement" in their hands. It is used to infer who the signer is.
  - `roi`: the region of interest considered for the clip (bounding box of the inferred signer).
  - `keypoints`: list of keypoints of the inferred signer for each frame, in the same format as in `i_ap.json`.
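As an illustration of the layout above, here is a minimal sketch that reads the annotation files of one sample with the standard library only. The helper name `load_sample` and the returned dictionary are assumptions made for this example, not part of the package, and the contents of `roi` and `keypoints` are passed through untouched because their exact structure is defined by AlphaPose.

```python
import json
from pathlib import Path


def load_sample(sample_dir, i):
    """Load the annotation files of sample `i` stored in `sample_dir`.

    Hypothetical helper: it relies only on the file names and JSON keys
    listed in the Format section.
    """
    sample_dir = Path(sample_dir)
    with open(sample_dir / f"{i}.json", encoding="utf-8") as f:
        meta = json.load(f)
    with open(sample_dir / f"{i}_signer.json", encoding="utf-8") as f:
        signer = json.load(f)
    return {
        "clip_path": sample_dir / f"{i}.mp4",
        "label": meta["label"],
        # start and end are given in seconds, so this is the clip length.
        "duration": meta["end"] - meta["start"],
        "roi": signer["roi"],
        "keypoints": signer["keypoints"],
    }
```

Calling `load_sample(path_to_video_dir, 0)` would return the subtitle, duration and signer keypoints of the first clip in that directory.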
Usage
This repository can be installed via `pip` and contains the `LSA_Dataset` class (in the `lsat.dataset.LSA_Dataset` module). This class inherits from the PyTorch `Dataset` class and implements all the methods needed to use it with a PyTorch `DataLoader`. It also manages the downloading and extraction of the dataset.
Useful transforms for the clips and keypoints are also provided in `lsat.dataset.transforms`.
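The constructor arguments of `LSA_Dataset` are not documented here, so the sketch below does not use it. Instead it shows, under stated assumptions, the map-style protocol (`__len__` and `__getitem__`) that a PyTorch `DataLoader` consumes, applied to an already extracted copy of the data. The class name `ClipFolder`, its `root` and `transform` parameters, and the `(clip path, label)` item shape are all hypothetical.

```python
import json
from pathlib import Path


class ClipFolder:
    """Hypothetical map-style dataset over an extracted copy of the data.

    This is NOT the package's LSA_Dataset; it only illustrates the
    __len__/__getitem__ protocol that a PyTorch DataLoader relies on.
    """

    def __init__(self, root, transform=None):
        # Assumption: every clip i.mp4 sits next to its i.json metadata file.
        self.clips = sorted(Path(root).rglob("*.mp4"))
        self.transform = transform

    def __len__(self):
        return len(self.clips)

    def __getitem__(self, idx):
        clip = self.clips[idx]
        with open(clip.with_suffix(".json"), encoding="utf-8") as f:
            label = json.load(f)["label"]
        item = (str(clip), label)  # decoding the video is left to the caller
        return self.transform(item) if self.transform else item
```

Because the class follows the map-style protocol, an instance should be usable as the `dataset` argument of `torch.utils.data.DataLoader`; the optional `transform` callable plays the role that the transforms in `lsat.dataset.transforms` play for the real class.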
Statistics and comparison with other DBs
| | LSA-T | PHOENIX* | SIGNUM | CSL | GSL | KETI |
|---|---|---|---|---|---|---|
| language | Spanish | German | German | Chinese | Greek | Korean |
| sign language | LSA | GSL | GSL | CSL | GSL | KSL |
| real life | Yes | Yes | No | No | No | No |
| signers | 103 | 9 | 25 | 50 | 7 | 14 |
| duration (h) | 21.78 | 10.71 | 55.3 | 100+ | 9.51 | 28 |
| # samples | 14,880 | 7,096 | 33,210 | 25,000 | 10,295 | 14,672 |
| # unique sentences | 14,254 | 5,672 | 780 | 100 | 331 | 105 |
| % unique sentences | 95.79% | 79.93% | 2.35% | 0.4% | 3.21% | 0.71% |
| vocab. size (w) | 14,239 | 2,887 | N/A | 178 | N/A | 419 |
| # singletons (w) | 7,150 | 1,077 | 0 | 0 | 0 | 0 |
| % singletons (w) | 50.21% | 37.3% | 0% | 0% | 0% | 0% |
| vocab. size (gl) | - | 1,066 | 450 | - | 310 | 524 |
| # singletons (gl) | - | 337 | 0 | - | 0 | 0 |
| % singletons (gl) | - | 31.61% | 0% | - | 0% | 0% |
| resolution | 1920x1080 | 210x260 | 776x578 | 1920x1080 | 848x480 | 1920x1080 |
| fps | 30 | 25 | 30 | 30 | 30 | 30 |
*Data was not available for the whole PHOENIX dataset, so the table shows the statistics of its training set.
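The percentage rows are consistent with simple ratios of the count rows; for LSA-T, 14,254 unique sentences out of 14,880 samples gives 95.79%. The sketch below shows one way such statistics could be computed from a list of subtitle labels. The function name `corpus_stats` is hypothetical, and the tokenization (lowercased, whitespace-separated words) is an assumption, since the tokenization behind the table is not stated.

```python
from collections import Counter


def corpus_stats(sentences):
    """Sentence- and word-level statistics for a non-empty list of labels."""
    # Assumption: words are lowercased, whitespace-separated tokens.
    words = Counter(w for s in sentences for w in s.lower().split())
    # A singleton is a word that occurs exactly once in the whole list.
    singletons = [w for w, n in words.items() if n == 1]
    unique = len(set(sentences))
    return {
        "samples": len(sentences),
        "unique_sentences": unique,
        "pct_unique": round(100 * unique / len(sentences), 2),
        "vocab_size": len(words),
        "singletons": len(singletons),
        "pct_singletons": round(100 * len(singletons) / len(words), 2),
    }
```

With a different tokenizer (for example one that strips punctuation) the vocabulary and singleton counts would change, so the figures in the table should not be expected to match this sketch exactly.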
Evaluation splits
| | LSA-T | Full version (train) | Full version (test) | Reduced version (train) | Reduced version (test) |
|---|---|---|---|---|---|
| signers | 103 | X | X | X | X |
| duration (h) | 21.78 | 17.49 | 4.29 | 15.85 | 3.89 |
| # sentences | 14,880 | 11,065 | 2,735 | 3,767 | 910 |
| % unique sentences | 95.79% | 96.64% | 92.78% | 96.88% | 98.35% |
| vocab. size | 14,239 | 12,385 | 5,546 | 2,694 | 1,579 |
| % singletons | 50.21% | 52.01% | 61.9% | 23.2% | 48.83% |
| % sentences with singletons | 34.97% | 40.98% | 67.97% | 14.36% | 54.29% |
| % sentences with words not in train vocabulary | - | - | 59.2% | - | 84.5% |
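The last two rows of the table describe how hard a split is at the vocabulary level. As a rough sketch, they could be computed as follows. The function names are hypothetical, the tokenization is again assumed to be lowercased whitespace splitting, and singletons are counted within the list passed in, which may differ from how the table was produced.

```python
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercased, whitespace-separated tokens.
    return sentence.lower().split()


def pct_sentences_with_oov(train, test):
    """Percentage of test sentences with at least one word unseen in train."""
    train_vocab = {w for s in train for w in tokenize(s)}
    hits = sum(any(w not in train_vocab for w in tokenize(s)) for s in test)
    return round(100 * hits / len(test), 2)


def pct_sentences_with_singletons(sentences):
    """Percentage of sentences containing a word that occurs only once."""
    counts = Counter(w for s in sentences for w in tokenize(s))
    hits = sum(any(counts[w] == 1 for w in tokenize(s)) for s in sentences)
    return round(100 * hits / len(sentences), 2)
```

A high out-of-vocabulary percentage on a test split means that many of its sentences cannot be fully produced by a model whose vocabulary comes from the training labels alone.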
Citation
TO-DO