Persian Natural Language Inference DataSet

FarsTail: A Persian Natural Language Inference Dataset



Natural Language Inference (NLI), also called Textual Entailment, is an important NLP task whose goal is to determine the inference relationship between a premise p and a hypothesis h. It is a three-class problem in which each pair (p, h) is assigned to one of the following classes: "ENTAILMENT" if the hypothesis can be inferred from the premise, "CONTRADICTION" if the hypothesis contradicts the premise, and "NEUTRAL" if neither of the above holds.
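To make the three-class setup concrete, here is a minimal sketch of how a (p, h) pair and its label could be represented. The names `Label` and `NLIPair` are hypothetical and are not part of the farstail package; the sentences are toy English examples written for illustration, not samples from the dataset.

```python
from enum import Enum
from typing import NamedTuple


class Label(Enum):
    """The three NLI classes described above."""
    ENTAILMENT = "ENTAILMENT"
    CONTRADICTION = "CONTRADICTION"
    NEUTRAL = "NEUTRAL"


class NLIPair(NamedTuple):
    """Hypothetical container for one (premise, hypothesis, label) sample."""
    premise: str
    hypothesis: str
    label: Label


# Toy examples written for illustration; they are NOT taken from FarsTail.
examples = [
    NLIPair("A man is playing a guitar on stage.",
            "A person is making music.", Label.ENTAILMENT),
    NLIPair("A man is playing a guitar on stage.",
            "Nobody is on the stage.", Label.CONTRADICTION),
    NLIPair("A man is playing a guitar on stage.",
            "The concert is sold out.", Label.NEUTRAL),
]

for ex in examples:
    print(f"{ex.label.value:13} | p: {ex.premise} | h: {ex.hypothesis}")
```

The same premise appears with all three labels, which shows that the label depends on the pair rather than on either sentence alone.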
Large NLI datasets such as SNLI, MNLI, and SciTail exist for English, but few are available for low-resource languages like Persian.
Persian (Farsi) is a pluricentric language spoken by around 110 million people in countries such as Iran, Afghanistan, and Tajikistan. Here, we present FarsTail, the first relatively large-scale Persian dataset for the NLI task. A total of 10,367 samples were generated from a collection of 3,539 multiple-choice questions. The train, validation, and test portions contain 7,266, 1,537, and 1,564 instances, respectively.
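As a quick sanity check on the figures above, the short sketch below sums the three portions and prints each one's share of the total. It uses only the numbers stated in this description; the variable names are illustrative.

```python
# Split sizes as stated in the description above.
splits = {"train": 7266, "validation": 1537, "test": 1564}

total = sum(splits.values())
# The three portions should add up to the reported 10,367 samples.
assert total == 10367

for name, size in splits.items():
    print(f"{name:10} {size:5d}  {size / total:.1%}")
```

The portions work out to roughly a 70/15/15 split.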

Getting started with the package

We provide a Python package that makes FarsTail easier to load and use for both Persian-speaking and non-Persian-speaking researchers. The following sections explain how to use it.

You'll need Python 3.6 or higher.

Installation

pip install farstail

Usage

  • Loading the original FarsTail dataset.
from farstail.datasets import farstail
train_data, val_data, test_data = farstail.load_original_data()
  • Loading the indexed FarsTail dataset.
from farstail.datasets import farstail
train_ind, val_ind, test_ind, dictionary = farstail.load_indexed_data()
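This description does not say how the indexed data or the returned dictionary are structured. As a rough illustration of what "indexed" usually means, the sketch below maps whitespace-separated tokens to integer ids and encodes a sentence with that mapping. The special tokens, the tokenization, and the helper names `build_dictionary` and `encode` are all assumptions made for this example and may differ from what `load_indexed_data()` actually returns.

```python
from typing import Dict, List

# Hypothetical special tokens; the package's own conventions may differ.
PAD, UNK = "<pad>", "<unk>"


def build_dictionary(sentences: List[str]) -> Dict[str, int]:
    """Assign an integer id to each distinct token, in order of first appearance."""
    dictionary = {PAD: 0, UNK: 1}
    for sentence in sentences:
        for token in sentence.split():
            # setdefault only inserts the token if it has not been seen yet.
            dictionary.setdefault(token, len(dictionary))
    return dictionary


def encode(sentence: str, dictionary: Dict[str, int]) -> List[int]:
    """Replace each token by its id, falling back to the UNK id for unseen tokens."""
    return [dictionary.get(token, dictionary[UNK]) for token in sentence.split()]


# Toy corpus for illustration only (not FarsTail data).
corpus = ["the cat sat", "the dog sat down"]
dictionary = build_dictionary(corpus)
encoded = encode("the bird sat", dictionary)

print(dictionary)
print(encoded)  # "bird" is not in the corpus, so it maps to the UNK id
```

An indexed form like this lets non-Persian-speaking researchers feed the data to a model without handling Persian text processing themselves.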
