Italian ATS Evaluator

italian-ats-evaluator

This is an open-source project for evaluating the performance of an Italian ATS (Automatic Text Simplifier) on a set of texts.

You can analyze a single text and extract the following features:

  • Overall:
    • Number of tokens
    • Number of tokens (including punctuation)
    • Number of characters
    • Number of characters (including punctuation)
    • Number of words
    • Number of syllables
    • Number of unique lemmas
    • Number of sentences
  • Part of Speech (POS) distribution
  • Verbs distribution
    • Active Verbs
    • Passive Verbs
    • Reflexive Verbs
  • Lexicon:
  • Readability:
    • Type-Token Ratio (TTR)
    • Gulpease Index
    • Flesch-Vacca Index
    • Lexical Density
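
To make two of the readability features above concrete, here is a minimal sketch of the Gulpease Index and the Type-Token Ratio. It is not the package's implementation: the regex-based tokenization and sentence splitting are simplifying assumptions (the package relies on a spaCy model, and may compute TTR on lemmas rather than surface forms), and the function names are hypothetical.

```python
import re

# Assumption: a "word" is a maximal run of Unicode letters.
WORD_RE = re.compile(r"[^\W\d_]+")


def gulpease_index(text: str) -> float:
    """Gulpease = 89 + (300 * sentences - 10 * letters) / words."""
    words = WORD_RE.findall(text)
    if not words:
        raise ValueError("text contains no words")
    # Assumption: sentences end with '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    letters = sum(len(word) for word in words)
    return 89 + (300 * len(sentences) - 10 * letters) / len(words)


def type_token_ratio(text: str) -> float:
    """Ratio of distinct lowercase word forms to total word tokens."""
    tokens = [word.lower() for word in WORD_RE.findall(text)]
    if not tokens:
        raise ValueError("text contains no words")
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    sample = "Il gatto mangia il topo."
    print(gulpease_index(sample))
    print(type_token_ratio(sample))
```

Higher Gulpease values indicate easier text. Very short inputs such as the sample sentence can score above 100, so the index is more informative on longer passages.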

You can also compare two texts and get the following metrics:

  • Semantic:
    • Semantic Similarity
  • Character diff:
    • Edit Distance
  • Token diff:
    • Number of tokens added
    • Number of tokens removed
    • Number of VdB (Vocabolario di Base, the Italian basic vocabulary) tokens removed
    • Number of VdB tokens added
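
The character and token diff metrics above can be illustrated with a short sketch. It assumes a plain Levenshtein distance for the edit distance and a whitespace, case-insensitive tokenization for the token diff; the package itself may tokenize with spaCy and count differently, and the function names here are hypothetical.

```python
from collections import Counter
from typing import Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution
                )
            )
        previous = current
    return previous[-1]


def token_diff(reference_text: str, simplified_text: str) -> Tuple[int, int]:
    """Return (tokens added, tokens removed) as multiset differences."""
    reference = Counter(reference_text.lower().split())
    simplified = Counter(simplified_text.lower().split())
    added = simplified - reference    # tokens only in the simplified text
    removed = reference - simplified  # tokens only in the reference text
    return sum(added.values()), sum(removed.values())


if __name__ == "__main__":
    reference = "Il felino mangia il roditore"
    simplified = "Il gatto mangia il topo"
    print(edit_distance(reference, simplified))
    print(token_diff(reference, simplified))
```

Counting VdB tokens would follow the same pattern, restricted to tokens that appear in a basic-vocabulary word list, which this sketch does not include.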

Installation

pip install italian-ats-evaluator

Usage

Create the TextAnalyzer and SimplificationAnalyzer objects with the desired models.

from italian_ats_evaluator import TextAnalyzer
from italian_ats_evaluator import SimplificationAnalyzer

text_analyzer = TextAnalyzer(
    spacy_model_name="it_core_news_lg"
)

simplification_analyzer = SimplificationAnalyzer(
    spacy_model_name="it_core_news_lg",
    sentence_transformers_model_name="intfloat/multilingual-e5-base"
)

Call the analyze method on the TextAnalyzer object to extract the features of a single text.

text_evaluation = text_analyzer.analyze("Il gatto mangia il topo.")
print(text_evaluation)

Call the analyze method on the SimplificationAnalyzer object to compare a reference text with its simplified version.

simplification_evaluation = simplification_analyzer.analyze(
    reference_text="Il felino mangia il roditore",
    simplified_text="Il gatto mangia il topo"
)
print(simplification_evaluation)
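
The sentence_transformers_model_name argument suggests that semantic similarity is derived from sentence embeddings. A common way to compare two embeddings is cosine similarity, sketched below in plain Python. Whether the package uses exactly this measure is an assumption, and the vectors shown are made-up placeholders, not real model output.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical 3-dimensional "embeddings" for illustration only.
    reference_embedding = [0.2, 0.7, 0.1]
    simplified_embedding = [0.25, 0.65, 0.05]
    print(cosine_similarity(reference_embedding, simplified_embedding))
```

Values close to 1 indicate that the two vectors point in nearly the same direction, which for sentence embeddings is usually read as similar meaning.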

Development

Create a virtual environment

python3 -m venv venv
source venv/bin/activate

Install the package in editable mode

pip install -e .

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Acknowledgements

This contribution is a result of the research conducted within the framework of the PRIN 2020 (Progetti di Rilevante Interesse Nazionale) “VerbACxSS: on analytic verbs, complexity, synthetic verbs, and simplification. For accessibility” (Prot. 2020BJKB9M), funded by the Italian Ministero dell’Università e della Ricerca.

License

MIT
