Persian Natural Language Inference DataSet
FarsTail: A Persian Natural Language Inference Dataset
Natural Language Inference (NLI), also called Textual Entailment, is an important task in NLP whose goal is to determine the inference relationship between a premise p and a hypothesis h. It is a three-class problem, where each pair (p, h) is assigned to one of these classes: "ENTAILMENT" if the hypothesis can be inferred from the premise, "CONTRADICTION" if the hypothesis contradicts the premise, and "NEUTRAL" if neither relation holds.
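As a minimal illustration of the three-class setup described above, the sketch below models an NLI sample as a (premise, hypothesis, label) record. The class names `Label` and `NLIExample` and the English sentences are hypothetical and chosen only for this example; they are not taken from FarsTail or from the farstail package.

```python
from enum import Enum
from typing import NamedTuple


class Label(Enum):
    """The three NLI classes named in the description above."""
    ENTAILMENT = "ENTAILMENT"
    CONTRADICTION = "CONTRADICTION"
    NEUTRAL = "NEUTRAL"


class NLIExample(NamedTuple):
    """Hypothetical container for one (premise, hypothesis) pair and its class."""
    premise: str
    hypothesis: str
    label: Label


# Illustrative English pairs (assumptions, not FarsTail samples).
premise = "A man is playing a guitar on stage."
examples = [
    # The hypothesis follows from the premise.
    NLIExample(premise, "A man is playing an instrument.", Label.ENTAILMENT),
    # The hypothesis cannot be true if the premise is true.
    NLIExample(premise, "The stage is empty.", Label.CONTRADICTION),
    # The premise neither supports nor rules out the hypothesis.
    NLIExample(premise, "The man is a professional musician.", Label.NEUTRAL),
]

for example in examples:
    print(example.label.value, "|", example.hypothesis)
```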
Large NLI datasets such as SNLI, MNLI, and SciTail exist for English, but few datasets are available for low-resource languages like Persian.
Persian (Farsi) is a pluricentric language spoken by around 110 million people in countries such as Iran, Afghanistan, and Tajikistan. Here, we present FarsTail, the first relatively large-scale Persian dataset for the NLI task. A total of 10,367 samples are generated from a collection of 3,539 multiple-choice questions. The train, validation, and test portions include 7,266, 1,537, and 1,564 instances, respectively.
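The split sizes quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in this description; the variable names are chosen for illustration.

```python
# Split sizes as stated in the description above.
splits = {"train": 7266, "validation": 1537, "test": 1564}
num_questions = 3539  # multiple-choice questions the samples were generated from

total = sum(splits.values())

# Share of each split as a percentage of all samples, rounded to one decimal.
shares = {name: round(100 * size / total, 1) for name, size in splits.items()}

# Average number of samples generated per source question.
samples_per_question = round(total / num_questions, 2)

print(total)                 # 10367
print(shares)                # {'train': 70.1, 'validation': 14.8, 'test': 15.1}
print(samples_per_question)  # 2.93
```

The three portions add up to the stated total of 10,367, which corresponds to roughly a 70/15/15 split.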
Getting started with the package
We provide an API in the form of a Python package that makes FarsTail easier to read and use for both Persian-speaking and non-Persian-speaking researchers. The following sections explain how to use this package.
You'll need Python 3.6 or higher.
Installation
pip install farstail
Usage
- Loading the original FarsTail dataset.
from farstail.datasets import farstail
train_data, val_data, test_data = farstail.load_original_data()
- Loading the indexed FarsTail dataset.
from farstail.datasets import farstail
train_ind, val_ind, test_ind, dictionary = farstail.load_indexed_data()
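This description does not document the format of the indexed data or of the returned `dictionary`. As a general illustration of what indexing a text dataset usually involves, the sketch below maps whitespace-separated tokens to integer ids. The helper names `build_dictionary` and `index_sentence`, the special tokens, and the whitespace tokenization are all assumptions made for this example and are not part of the farstail API.

```python
from typing import Dict, List

# Hypothetical special tokens; the farstail package may use different ones.
PAD, UNK = "<pad>", "<unk>"


def build_dictionary(sentences: List[str]) -> Dict[str, int]:
    """Assign an integer id to every token, in order of first appearance."""
    dictionary = {PAD: 0, UNK: 1}
    for sentence in sentences:
        for token in sentence.split():
            if token not in dictionary:
                dictionary[token] = len(dictionary)
    return dictionary


def index_sentence(sentence: str, dictionary: Dict[str, int]) -> List[int]:
    """Replace each token with its id; unseen tokens map to the UNK id."""
    return [dictionary.get(token, dictionary[UNK]) for token in sentence.split()]


# Toy training sentences (placeholders, not FarsTail data).
toy_train = ["a b", "b c"]
toy_dictionary = build_dictionary(toy_train)

print(toy_dictionary)                           # {'<pad>': 0, '<unk>': 1, 'a': 2, 'b': 3, 'c': 4}
print(index_sentence("a c z", toy_dictionary))  # [2, 4, 1]
```

Building the dictionary from the training portion only, and mapping unseen validation or test tokens to a single unknown id, is a common convention; whether the package follows it should be checked against the values that `load_indexed_data()` actually returns.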