Cross-lingual QASem parser for predicate-centered question-answer structures.

XQASem

XQASem is a Python package for extracting predicate-centered question-answer structures from text. It currently includes presets for French, Russian, and Hebrew models hosted on Hugging Face.

This package accompanies the paper "Effective QA-Driven Annotation of Predicate-Argument Relations Across Languages" (EACL 2026).

Installation

pip install xqasem

Install the spaCy pipelines you plan to use:

python -m spacy download fr_core_news_md
python -m spacy download ru_core_news_sm

Hebrew uses spacy-stanza; on first use, Stanza may need to download its Hebrew resources.

Environment

This project was tested with the following setup:

  • Python 3.10.20
  • torch 2.6.0 (CUDA 12.4)
  • transformers 4.57.1
  • spaCy 3.7.5

Requirements

  • Python 3.10+
  • transformers >= 4.50
  • spaCy >= 3.7
  • torch >= 2.0

Note: GPU is recommended for efficient inference.
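
One way to check an existing environment against the minimums listed above is sketched here. It relies only on the standard library (importlib.metadata). The helper names parse_version and check_requirements are illustrative and not part of xqasem, and the version parsing is deliberately naive: it keeps the leading numeric components and ignores suffixes such as "+cu124".

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the Requirements list.
MINIMUMS = {"transformers": "4.50", "spacy": "3.7", "torch": "2.0"}


def parse_version(text):
    """Turn '2.6.0+cu124' into (2, 6, 0); local and pre-release suffixes are ignored."""
    public = text.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad to three components so that '2' and '2.0' compare as equal.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Return {package: status string} for each required package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = f"{installed} ({'ok' if ok else 'below ' + minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```

For stricter comparisons (release candidates, epochs), the third-party packaging library would be a more robust choice than this hand-rolled parser.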

Installation from source

git clone https://github.com/JohnnieDavidov/xqasem.git
cd xqasem
pip install -e .

Basic Usage

Example

from xqasem import XQasemParser

# Load the built-in French preset.
parser = XQasemParser.from_language("fr")

sentences = [
    "Les experts ont souligné que le nouvel algorithme accélère considérablement le traitement des requêtes complexes."
]

# Parse the sentences; the result has one row per question-answer pair.
df = parser(sentences)

print(df)

Example output:

| sentence | predicate | predicate_type | question | answer |
| Les experts ont souligné que le nouvel algorithme accélère considérablement le traitement des requêtes complexes. | souligné | verb | qui a souligné quelque chose? | Les experts |
| Les experts ont souligné que le nouvel algorithme accélère considérablement le traitement des requêtes complexes. | accélère | verb | qu'est-ce qui accélère quelque chose? | le nouvel algorithme |
| Les experts ont souligné que le nouvel algorithme accélère considérablement le traitement des requêtes complexes. | accélère | verb | qu'est-ce que quelque chose accélère? | le traitement des requêtes complexes |
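
Because each row holds a single question-answer pair, it is often convenient to regroup the rows per predicate. The following is a minimal sketch that works on plain dictionaries with the column names shown in the example output. The function group_by_predicate is a hypothetical helper, not part of xqasem. If the parser returns a pandas DataFrame (the variable name df suggests this, but it is an assumption here), df.to_dict("records") would produce rows of this shape.

```python
from collections import defaultdict

# Rows copied from the example output; the sentence column is omitted for brevity.
rows = [
    {"predicate": "souligné", "predicate_type": "verb",
     "question": "qui a souligné quelque chose?",
     "answer": "Les experts"},
    {"predicate": "accélère", "predicate_type": "verb",
     "question": "qu'est-ce qui accélère quelque chose?",
     "answer": "le nouvel algorithme"},
    {"predicate": "accélère", "predicate_type": "verb",
     "question": "qu'est-ce que quelque chose accélère?",
     "answer": "le traitement des requêtes complexes"},
]


def group_by_predicate(records):
    """Map each predicate to its list of (question, answer) pairs, in input order."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["predicate"]].append((record["question"], record["answer"]))
    return dict(grouped)


for predicate, qa_pairs in group_by_predicate(rows).items():
    print(predicate)
    for question, answer in qa_pairs:
        print(f"  {question} -> {answer}")
```

Grouping on the predicate string alone merges repeated occurrences of the same word; when processing several sentences, keying on the (sentence, predicate) pair avoids mixing them.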

The built-in language presets are:

parser_fr = XQasemParser.from_language("fr")
parser_ru = XQasemParser.from_language("ru")
parser_he = XQasemParser.from_language("he")

You can also load an explicit Hugging Face model:

parser = XQasemParser.from_pretrained(
    "YonatanDavidov/qasem-fr-claire-lora",
    spacy_lang="fr_core_news_md",
    is_adapter=True,
)

Command Line

After installation, you can run:

xqasem-parse --lang fr --output outputs/fr.csv "Les experts ont souligné que le nouvel algorithme accélère le traitement."

You can also run the script directly:

python scripts/run_parser.py --lang fr --output outputs/fr.csv
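
The CSV written via --output can then be read back with the standard library. The sketch below assumes the file carries the same five columns as the example output (sentence, predicate, predicate_type, question, answer); that header is an assumption, so check it against an actual output file. The function load_qa_rows is a hypothetical helper, and an in-memory string stands in for outputs/fr.csv.

```python
import csv
import io

# Stand-in for the contents of outputs/fr.csv. The header row is an assumption
# based on the columns of the example output; the data row is taken from it.
SAMPLE_CSV = (
    "sentence,predicate,predicate_type,question,answer\n"
    '"Les experts ont souligné que le nouvel algorithme accélère considérablement '
    'le traitement des requêtes complexes.",souligné,verb,'
    "qui a souligné quelque chose?,Les experts\n"
)


def load_qa_rows(handle):
    """Read question-answer rows from an open text handle into a list of dicts."""
    return list(csv.DictReader(handle))


# For a real file, open it with:
#   open("outputs/fr.csv", newline="", encoding="utf-8")
rows = load_qa_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(f'{row["predicate"]}: {row["question"]} -> {row["answer"]}')
```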

Model Presets

| Language | Model | spaCy pipeline |
| French | YonatanDavidov/qasem-fr-claire-lora | fr_core_news_md |
| Russian | YonatanDavidov/qasem-ru-sambalingo-lora | ru_core_news_sm |
| Hebrew | YonatanDavidov/qasem-he-dictalm2-lora | he via spacy-stanza |
| Hebrew full model | YonatanDavidov/qasem-he-dictalm2-full | he via spacy-stanza |

Citation

If you use this code or the released models, please cite:

@inproceedings{davidov-etal-2026-effective,
    title = "Effective {QA}-Driven Annotation of Predicate{--}Argument Relations Across Languages",
    author = "Davidov, Jonathan  and
      Slobodkin, Aviv  and
      Klein, Shmuel Tomi  and
      Tsarfaty, Reut  and
      Dagan, Ido  and
      Klein, Ayal",
    editor = "Demberg, Vera  and
      Inui, Kentaro  and
      Marquez, Llu{\'i}s",
    booktitle = "Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 1: Long Papers)",
    month = mar,
    year = "2026",
    address = "Rabat, Morocco",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2026.eacl-long.112/",
    doi = "10.18653/v1/2026.eacl-long.112",
    pages = "2484--2502",
    ISBN = "979-8-89176-380-7"
}

Acknowledgments

This implementation builds on ideas and code structure from Paul Roit's qasem_parser, the original English QASem parsing framework that inspired this multilingual extension.
