Italian ATS Evaluator

italian-ats-evaluator

This open-source project evaluates the performance of an Italian Automatic Text Simplifier (ATS) on a set of texts.

You can analyze a single text to extract the following features:

  • Overall:
    • Number of tokens
    • Number of tokens (including punctuation)
    • Number of characters
    • Number of characters (including punctuation)
    • Number of words
    • Number of syllables
    • Number of unique lemmas
    • Number of sentences
  • Part of Speech (POS) distribution
  • Verbs distribution
    • Active Verbs
    • Passive Verbs
    • Reflexive Verbs
  • Lexicon:
  • Readability:
    • Type-Token Ratio (TTR)
    • Gulpease Index
    • Flesch-Vacca Index
    • Lexical Density
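As a rough illustration of two of the readability features listed above, the sketch below computes the Gulpease Index (89 + (300 × sentences − 10 × letters) / words) and a Type-Token Ratio. It assumes a naive regex tokenizer rather than the spaCy pipeline the package relies on, and the helper names (`simple_counts`, `gulpease`, `type_token_ratio`) are hypothetical, not part of the library. The library's own values may differ, for example if it counts lemmas or clips the index to the 0-100 range.

```python
import re


def simple_counts(text):
    """Naive counts (assumption): regex words and punctuation-based sentences."""
    words = re.findall(r"\w+", text)
    # Split on sentence-final punctuation and discard empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    letters = sum(len(w) for w in words)
    return words, len(sentences), letters


def gulpease(text):
    """Gulpease Index: 89 + (300 * sentences - 10 * letters) / words."""
    words, n_sentences, n_letters = simple_counts(text)
    if not words:
        return 0.0
    return 89 + (300 * n_sentences - 10 * n_letters) / len(words)


def type_token_ratio(text):
    """Ratio of distinct lowercased word forms to total words."""
    words = [w.lower() for w in simple_counts(text)[0]]
    return len(set(words)) / len(words) if words else 0.0


if __name__ == "__main__":
    sample = "Il gatto mangia il topo."
    print(gulpease(sample))
    print(type_token_ratio(sample))
```

Very short sentences can score above 100 on the unclipped formula, which is why implementations often cap the result.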

You can also compare two texts and get the following metrics:

  • Semantic:
    • Semantic Similarity
  • Character diff:
    • Edit Distance
  • Token diff:
    • Number of tokens added
    • Number of tokens removed
    • Number of VdB (Vocabolario di Base, the basic Italian vocabulary) tokens removed
    • Number of VdB tokens added
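The character and token diff metrics above can be sketched as follows. This is a minimal illustration, assuming Levenshtein distance for the edit distance and a whitespace, case-insensitive tokenization for the token counts; the function names are hypothetical and the package may tokenize and count differently.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two strings, using a two-row table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def token_diff(reference_text, simplified_text):
    """Return (tokens added, tokens removed) as multiset differences."""
    reference = Counter(reference_text.lower().split())
    simplified = Counter(simplified_text.lower().split())
    added = simplified - reference
    removed = reference - simplified
    return sum(added.values()), sum(removed.values())


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(token_diff("Il felino mangia il roditore", "Il gatto mangia il topo"))
```

Counting with multisets rather than sets keeps repeated tokens such as "il" from being reported as added or removed when they occur the same number of times in both texts.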

Installation

pip install italian-ats-evaluator

Usage

Create the TextAnalyzer and SimplificationAnalyzer objects with the desired models.

from italian_ats_evaluator import TextAnalyzer
from italian_ats_evaluator import SimplificationAnalyzer

text_analyzer = TextAnalyzer(
    spacy_model_name="it_core_news_lg"
)

simplification_analyzer = SimplificationAnalyzer(
    spacy_model_name="it_core_news_lg",
    sentence_transformers_model_name="intfloat/multilingual-e5-base"
)

Call the analyze method on the TextAnalyzer object to evaluate the features of a text.

text_evaluation = text_analyzer.analyze("Il gatto mangia il topo.")
print(text_evaluation)

Call the analyze method on the SimplificationAnalyzer object to compare a reference text with its simplified version.

simplification_evaluation = simplification_analyzer.analyze(
    reference_text="Il felino mangia il roditore",
    simplified_text="Il gatto mangia il topo"
)
print(simplification_evaluation)
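This README does not state how the semantic similarity metric is derived from the sentence-transformers model. A common approach, assumed here purely for illustration, is the cosine similarity between the embeddings of the two texts. The sketch below uses toy vectors in place of real embeddings, and `cosine_similarity` is a hypothetical helper, not a function of the package.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        # Similarity is undefined for a zero vector; return 0.0 by convention.
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy vectors standing in for sentence embeddings (assumption).
    reference_embedding = [0.2, 0.7, 0.1]
    simplified_embedding = [0.25, 0.65, 0.05]
    print(cosine_similarity(reference_embedding, simplified_embedding))
```

Values close to 1 indicate that the two vectors point in nearly the same direction, which is usually read as the simplified text preserving the meaning of the reference.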

Development

Create a virtual environment

python3 -m venv venv
source venv/bin/activate

Install the package in editable mode

pip install -e .

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Acknowledgements

This contribution is a result of the research conducted within the framework of the PRIN 2020 (Progetti di Rilevante Interesse Nazionale) “VerbACxSS: on analytic verbs, complexity, synthetic verbs, and simplification. For accessibility” (Prot. 2020BJKB9M), funded by the Italian Ministero dell’Università e della Ricerca.

License

MIT

