Python ROUGE Score Implementation for Chinese Language Task (official rouge score)

Rouge

A full Python library for the ROUGE metric (paper).

Difference

This library is based on the code from pltrdy. Using the original code to compute ROUGE scores for Chinese text runs into several problems: for example, a stack overflow can occur, and Chinese sentences are not split correctly. This code solves these problems and produces more accurate ROUGE scores in Chinese NLP tasks.

Changed the sentence-splitting mechanism. The original code splits sentences only on '.'. rouge-chinese also splits on Chinese punctuation, which gives more sensible sentence boundaries.
Optimized memory usage in the ROUGE-L calculation. The new code does not generate the longest common subsequence itself, since most users do not need it. Generating it is extremely memory-intensive because it relies on a recursive algorithm that creates a large number of stack frames. The new code computes only the length of the longest common subsequence.
More accurate ROUGE scores. The original code replaced the 'official' ROUGE-L scores with union ROUGE-L scores, which gives users different results. Thanks to the memory optimization, the new code can return the 'official' ROUGE scores.
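
The two ideas above can be illustrated with a minimal sketch. This is not the library's actual code: the function names `split_sentences` and `lcs_length` are hypothetical, and the set of sentence-ending marks is an assumption chosen for illustration.

```python
import re

# Assumed sentence-ending marks (Chinese and ASCII); the library's real
# list may differ.
_SENTENCE = re.compile(r"[^。！？!?.]+[。！？!?.]*")


def split_sentences(text):
    """Split text into sentences, keeping each ending mark attached."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]


def lcs_length(x, y):
    """Return the length of the longest common subsequence of two token
    lists without reconstructing the subsequence.

    Only two rows of the dynamic-programming table are kept, and no
    recursion is used, so the call stack cannot overflow.
    """
    if len(y) > len(x):
        x, y = y, x  # keep the rows as short as possible
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


sentences = split_sentences("今天天气很好。我们去公园吧！你去吗？")
# sentences == ["今天天气很好。", "我们去公园吧！", "你去吗？"]

length = lcs_length("我 喜欢 吃 苹果".split(), "我 吃 苹果".split())
# length == 3
```

Keeping only the length is enough for ROUGE-L precision and recall, which divide the LCS length by the hypothesis and reference lengths respectively.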

Quickstart

Clone & Install

git clone https://github.com/Isaac-JL-Chen/rouge_chinese.git
cd rouge_chinese
python setup.py install
# or
pip install -U .

Or install from PyPI with pip:

pip install rouge-chinese

Use it from the shell (JSON Output)

$ rouge -h
usage: rouge [-h] [-f] [-a] hypothesis reference

Rouge Metric Calculator

positional arguments:
  hypothesis  Text of file path
  reference   Text or file path

optional arguments:
  -h, --help  show this help message and exit
  -f, --file  File mode
  -a, --avg   Average mode

e.g.

# Single Sentence
rouge "transcript is a written version of each day 's cnn student" \
      "this page includes the show transcript use the transcript to help students with"

# Scoring using two files (line by line)
rouge -f ./tests/hyp.txt ./ref.txt

# Avg scoring - 2 files
rouge -f ./tests/hyp.txt ./ref.txt --avg

As a library

Score 1 sentence

from rouge import Rouge 

hypothesis = "the #### transcript is a written version of each day 's cnn student news program use this transcript to he    lp students with reading comprehension and vocabulary use the weekly newsquiz to test your knowledge of storie s you     saw on cnn student news"

reference = "this page includes the show transcript use the transcript to help students with reading comprehension and     vocabulary at the bottom of the page , comment for a chance to be mentioned on cnn student news . you must be a teac    her or a student age # # or older to request a mention on the cnn student news roll call . the weekly newsquiz tests     students ' knowledge of even ts in the news"

rouge = Rouge()
scores = rouge.get_scores(hypothesis, reference)

Output:

[
  {
    "rouge-1": {
      "f": 0.4786324739396596,
      "p": 0.6363636363636364,
      "r": 0.3835616438356164
    },
    "rouge-2": {
      "f": 0.2608695605353498,
      "p": 0.3488372093023256,
      "r": 0.20833333333333334
    },
    "rouge-l": {
      "f": 0.44705881864636676,
      "p": 0.5277777777777778,
      "r": 0.3877551020408163
    }
  }
]

Note: "f" stands for f1_score, "p" stands for precision, "r" stands for recall.
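
To make the relationship between the three values concrete, here is a minimal sketch for unigrams. It assumes whitespace-separated tokens and uses clipped counts, as in the standard ROUGE-N definition; the function name `unigram_prf` is hypothetical, and the library's own n-gram counting and rounding may differ, so its numbers need not match this sketch exactly.

```python
from collections import Counter


def unigram_prf(hypothesis, reference):
    """Compute unigram precision, recall and F1 for two token strings."""
    hyp_counts = Counter(hypothesis.split())
    ref_counts = Counter(reference.split())
    # Clipped overlap: each token counts at most as often as it
    # appears in both texts.
    overlap = sum((hyp_counts & ref_counts).values())
    p = overlap / max(sum(hyp_counts.values()), 1)
    r = overlap / max(sum(ref_counts.values()), 1)
    f = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return {"f": f, "p": p, "r": r}


result = unigram_prf("the cat sat", "the cat sat on the mat")
# All 3 hypothesis tokens appear in the 6-token reference,
# so p == 1.0 and r == 0.5.
```

Precision is measured against the hypothesis length and recall against the reference length, which is why a short but accurate hypothesis scores high on "p" and low on "r".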

Score multiple sentences

import json
from rouge import Rouge

# Load some sentences
with open('./tests/data.json') as f:
  data = json.load(f)

hyps, refs = map(list, zip(*[[d['hyp'], d['ref']] for d in data]))
rouge = Rouge()
scores = rouge.get_scores(hyps, refs)
# or
scores = rouge.get_scores(hyps, refs, avg=True)

Output (avg=False): a list of n dicts:

[{"rouge-1": {"f": _, "p": _, "r": _}, "rouge-2": { ... }, "rouge-l": { ... }}]

Output (avg=True): a single dict with average values:

{"rouge-1": {"f": _, "p": _, "r": _}, "rouge-2": { ... }, "rouge-l": { ... }}
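
The relationship between the two output shapes can be sketched as follows. This assumes that the averaged result is the plain arithmetic mean of each per-pair value, which is an assumption about the library's behavior; `average_scores` is a hypothetical helper, not part of the package.

```python
def average_scores(scores):
    """Collapse a list of per-pair score dicts into one dict of means."""
    if not scores:
        raise ValueError("scores must contain at least one entry")
    n = len(scores)
    return {
        metric: {
            key: sum(s[metric][key] for s in scores) / n
            for key in ("f", "p", "r")
        }
        for metric in scores[0]
    }


per_pair = [
    {"rouge-1": {"f": 0.5, "p": 0.5, "r": 0.5}},
    {"rouge-1": {"f": 1.0, "p": 1.0, "r": 1.0}},
]
averaged = average_scores(per_pair)
# averaged == {"rouge-1": {"f": 0.75, "p": 0.75, "r": 0.75}}
```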

Score two files (line by line)

Given two files, hyp_path and ref_path, with the same number (n) of lines, calculate the score for each line, or the average over the whole file.

from rouge import FilesRouge

files_rouge = FilesRouge()
scores = files_rouge.get_scores(hyp_path, ref_path)
# or
scores = files_rouge.get_scores(hyp_path, ref_path, avg=True)
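
The line-by-line pairing can be sketched with the standard library alone. This is an illustration of the requirement that both files have the same number of lines, not the implementation of FilesRouge; `read_line_pairs` is a hypothetical helper and UTF-8 encoding is assumed.

```python
def read_line_pairs(hyp_path, ref_path, encoding="utf-8"):
    """Read two files and pair them up line by line.

    Raises ValueError if the files do not have the same number of lines.
    """
    with open(hyp_path, encoding=encoding) as f:
        hyps = [line.rstrip("\n") for line in f]
    with open(ref_path, encoding=encoding) as f:
        refs = [line.rstrip("\n") for line in f]
    if len(hyps) != len(refs):
        raise ValueError(
            f"line count mismatch: {len(hyps)} hypotheses, {len(refs)} references"
        )
    return list(zip(hyps, refs))
```

Each resulting (hypothesis, reference) pair corresponds to one entry in the per-line score list.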

Release history

1.0.3 (Sep 18, 2022)
1.0.2 (Sep 18, 2022), this version
1.0.1 (Sep 18, 2022)

Download files

Source Distribution

rouge_chinese-1.0.2.tar.gz (18.1 kB)

Uploaded Sep 18, 2022 Source

Built Distribution

rouge_chinese-1.0.2-py3-none-any.whl (20.2 kB)

Uploaded Sep 18, 2022 Python 3

File details

Details for the file rouge_chinese-1.0.2.tar.gz.

File metadata

Download URL: rouge_chinese-1.0.2.tar.gz
Upload date: Sep 18, 2022
Size: 18.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.8.13

File hashes

Hashes for rouge_chinese-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`106d390c173c6ece82dd6a15865b44ab92b5035887c728aef19503b5d4048928`
MD5	`8dc6e9eca471d8f79718b8fd18e67f15`
BLAKE2b-256	`51c6836782f84fb722ba90a209ba23e864d6fb8c3d779c4553c10465a8cfe597`

File details

Details for the file rouge_chinese-1.0.2-py3-none-any.whl.

File metadata

Download URL: rouge_chinese-1.0.2-py3-none-any.whl
Upload date: Sep 18, 2022
Size: 20.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.8.13

File hashes

Hashes for rouge_chinese-1.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0ec2eb9454cd9fb362ec081baa445f76997900ef6fa69455ccb13e87fafe13fa`
MD5	`5065927720fdee75658728254619084d`
BLAKE2b-256	`ee180703555e939880a1a137622b816540027bc1c687eecde5aadd17d47e0ed5`
