Italian ATS Evaluator

italian-ats-evaluator

This is an open-source project for evaluating the performance of an Italian ATS (Automatic Text Simplifier) on a set of texts.

You can analyze a single text to extract the following features:

  • Overall:
    • Number of tokens
    • Number of tokens (including punctuation)
    • Number of characters
    • Number of characters (including punctuation)
    • Number of words
    • Number of syllables
    • Number of unique lemmas
    • Number of sentences
  • Part of Speech (POS) distribution
  • Verbs distribution
    • Active Verbs
    • Passive Verbs
    • Reflexive Verbs
  • Lexicon:
  • Readability:
    • Type-Token Ratio (TTR)
    • Gulpease Index
    • Flesch-Vacca Index
    • Lexical Density
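
As an illustration of two of the readability features listed above, here is a minimal sketch of the Gulpease Index and the Type-Token Ratio. It is not the package's implementation: the function names `gulpease_index` and `type_token_ratio` are hypothetical, and the regex-based word and sentence splitting is a simplifying assumption (the package relies on a spaCy model, and may compute TTR over lemmas rather than surface forms). The Gulpease formula used is the standard one, 89 + (300 * sentences - 10 * letters) / words.

```python
import re


def gulpease_index(text: str) -> float:
    """Raw (unclamped) Gulpease Index: 89 + (300 * sentences - 10 * letters) / words.

    Assumption: words are runs of word characters and sentences are
    delimited by '.', '!' or '?'. A real tokenizer would be more accurate.
    """
    words = re.findall(r"\w+", text)
    if not words:
        raise ValueError("text contains no words")
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    letters = sum(len(word) for word in words)
    return 89 + (300 * len(sentences) - 10 * letters) / len(words)


def type_token_ratio(text: str) -> float:
    """Ratio of distinct lowercased word forms to the total number of words."""
    tokens = [word.lower() for word in re.findall(r"\w+", text)]
    if not tokens:
        raise ValueError("text contains no words")
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    sample = "Il gatto mangia il topo."
    print(gulpease_index(sample))
    print(type_token_ratio(sample))
```

Higher Gulpease values indicate easier text. The index is usually read on a 0 to 100 scale, but the raw formula can exceed 100 on very short inputs such as the sample sentence.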

You can also compare two texts and get the following metrics:

  • Semantic:
    • Semantic Similarity
  • Character diff:
    • Edit Distance
  • Token diff:
    • Number of tokens added
    • Number of tokens removed
    • Number of VdB (Vocabolario di Base) tokens removed
    • Number of VdB (Vocabolario di Base) tokens added
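
To make the character-level and token-level metrics above concrete, here is a minimal sketch of an edit distance and a token diff. It is an assumption-laden illustration, not the package's code: `edit_distance` and `token_diff` are hypothetical names, the distance is plain Levenshtein, and tokens are obtained by lowercasing and splitting on whitespace rather than with a spaCy model.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, keeping two DP rows in memory."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def token_diff(reference: str, simplified: str) -> tuple[list[str], list[str]]:
    """Return (added, removed) tokens as sorted lists, respecting multiplicity.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    reference_counts = Counter(reference.lower().split())
    simplified_counts = Counter(simplified.lower().split())
    added = sorted((simplified_counts - reference_counts).elements())
    removed = sorted((reference_counts - simplified_counts).elements())
    return added, removed


if __name__ == "__main__":
    added, removed = token_diff(
        "Il felino mangia il roditore",
        "Il gatto mangia il topo",
    )
    print(added, removed)
    print(edit_distance("kitten", "sitting"))
```

Using `Counter` subtraction keeps repeated tokens distinct, so a word that appears twice in the reference and once in the simplification counts as one removal.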

Installation

pip install italian-ats-evaluator

Usage

Create the TextAnalyzer and SimplificationAnalyzer objects with the desired models.

from italian_ats_evaluator import TextAnalyzer
from italian_ats_evaluator import SimplificationAnalyzer

text_analyzer = TextAnalyzer(
    spacy_model_name="it_core_news_lg"
)

simplification_analyzer = SimplificationAnalyzer(
    spacy_model_name="it_core_news_lg",
    sentence_transformers_model_name="intfloat/multilingual-e5-base"
)

Call the analyze method on the TextAnalyzer object to evaluate the features of a text.

text_evaluation = text_analyzer.analyze("Il gatto mangia il topo.")
print(text_evaluation)

Call the analyze method on the SimplificationAnalyzer object to compare a reference text with its simplified version.

simplification_evaluation = simplification_analyzer.analyze(
    reference_text="Il felino mangia il roditore",
    simplified_text="Il gatto mangia il topo"
)
print(simplification_evaluation)

Development

Create a virtual environment

python3 -m venv venv
source venv/bin/activate

Install the package in editable mode

pip install -e .

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Acknowledgements

This contribution is a result of the research conducted within the framework of the PRIN 2020 (Progetti di Rilevante Interesse Nazionale) “VerbACxSS: on analytic verbs, complexity, synthetic verbs, and simplification. For accessibility” (Prot. 2020BJKB9M), funded by the Italian Ministero dell’Università e della Ricerca.

License

MIT
