The implementation of the ExPerT score.

Project description

ExPerT: Effective and Explainable Evaluation of Personalized Long-Form Text Generation

Evaluating personalized text generated by large language models (LLMs) is challenging: only the LLM user, i.e., the prompt author, can reliably assess the output, yet re-engaging the same individuals across studies is infeasible. This paper addresses that challenge by introducing ExPerT, an explainable, reference-based evaluation framework. ExPerT uses an LLM to extract atomic aspects and their supporting evidence from the generated and reference texts, match the aspects, and evaluate their alignment on content and writing style, two key attributes of personalized text generation. ExPerT also generates detailed, fine-grained explanations for every step of the evaluation process, which makes its decisions more transparent and interpretable. Our experiments show that ExPerT achieves a 7.2% relative improvement in alignment with human judgments over state-of-the-art text generation evaluation methods. Furthermore, human evaluators rated the usability of ExPerT's explanations at 4.7 out of 5.
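The extract, match, and evaluate steps described above can be illustrated with a small sketch. It is not the expert_score package's API: the `Aspect` class, the function names, and the toy judges are all hypothetical, the judges stand in for the LLM calls, and the final aggregation (the share of generated aspects that are matched and judged aligned on both content and style) is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Aspect:
    """One atomic aspect plus the evidence text that supports it (hypothetical type)."""
    name: str
    evidence: str


# Hypothetical judge signature; in ExPerT these decisions are made by an LLM.
Judge = Callable[[Aspect, Aspect], bool]


def match_aspects(generated: List[Aspect], reference: List[Aspect],
                  same_aspect: Judge) -> List[Tuple[Aspect, Aspect]]:
    """Pair each generated aspect with the first unused reference aspect it matches."""
    used = set()
    pairs = []
    for g in generated:
        for i, r in enumerate(reference):
            if i not in used and same_aspect(g, r):
                used.add(i)
                pairs.append((g, r))
                break
    return pairs


def alignment_score(generated: List[Aspect], reference: List[Aspect],
                    same_aspect: Judge, content_ok: Judge, style_ok: Judge) -> float:
    """Assumed aggregation: fraction of generated aspects that are matched
    and judged aligned on both content and writing style."""
    if not generated:
        return 0.0
    pairs = match_aspects(generated, reference, same_aspect)
    aligned = sum(1 for g, r in pairs if content_ok(g, r) and style_ok(g, r))
    return aligned / len(generated)


# Toy judges standing in for LLM decisions.
def same_name(g: Aspect, r: Aspect) -> bool:
    return g.name.lower() == r.name.lower()


def shares_word(g: Aspect, r: Aspect) -> bool:
    return bool(set(g.evidence.lower().split()) & set(r.evidence.lower().split()))


def always_ok(g: Aspect, r: Aspect) -> bool:
    return True


generated = [Aspect("Plot", "the twist ending felt earned"),
             Aspect("Acting", "the lead was wooden")]
reference = [Aspect("plot", "a twist ending that surprised me")]

# Only the "Plot" aspect finds a reference match, so one of two aspects aligns.
score = alignment_score(generated, reference, same_name, shares_word, always_ok)
print(score)
```

Keeping the three judgments as separate callables mirrors the separation between matching, content alignment, and style alignment in the description; each can be swapped for a model call without touching the aggregation.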

Download files

Source Distribution

expert_score-0.0.1.tar.gz (12.1 kB)

Uploaded Source

Built Distribution

expert_score-0.0.1-py3-none-any.whl (13.0 kB)

Uploaded Python 3

File details

Details for the file expert_score-0.0.1.tar.gz.

File metadata

  • Download URL: expert_score-0.0.1.tar.gz
  • Upload date:
  • Size: 12.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for expert_score-0.0.1.tar.gz
Algorithm    Hash digest
SHA256       b528c60106c96be8636cb5ee91b8db352a9249ead04c6f917eab9d94f620ea60
MD5          b5af02c5e4b5b42b009c4b28a4830854
BLAKE2b-256  33aa450780801020371c03b208cddd1981b8d3101d142060819cf3979db0e587
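A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `sha256_of` and `matches_digest` are illustrative, and the local file path assumes the archive was saved to the current directory under its original name.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for expert_score-0.0.1.tar.gz.
EXPECTED_SDIST_SHA256 = "b528c60106c96be8636cb5ee91b8db352a9249ead04c6f917eab9d94f620ea60"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """True if the file's SHA256 equals the expected hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Assumed location of the download; the check is skipped if the file is absent.
sdist = Path("expert_score-0.0.1.tar.gz")
if sdist.exists():
    print("digest matches:", matches_digest(sdist, EXPECTED_SDIST_SHA256))
```

The same helper works for the wheel when given the wheel's file name and its own SHA256 digest.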

File details

Details for the file expert_score-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: expert_score-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 13.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for expert_score-0.0.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       8819ca3067e544e65012770a64b8aa61184e23d91ab66132353e888864fa5d0c
MD5          01422f2dd94c5de0ae2774fd2c90d69e
BLAKE2b-256  989c78e6b3d8ad41ecaa8da5b2f64fc76484fd1bf04e5306b05ad7027922aae4
