A Python Wrapper to calculate standard BLEU scores for NLP
bleu (Python Package)
A Python Wrapper for the standard BLEU evaluation for Natural Language Generation (NLG).
- GitHub project: https://github.com/zhijing-jin/bleu.
- PyPI package: pip install bleu
Installation
Requirement: Python 3
Option 1: Install pip package
pip install --upgrade bleu
Option 2: Build from source
pip install --upgrade git+https://github.com/zhijing-jin/bleu.git
How to Run
The standard way to calculate BLEU is with the Moses script for detokenized BLEU (multi-bleu-detok.perl). This package provides simple Python calls to it.
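To make the metric itself concrete, here is a minimal pure-Python sketch of corpus-level BLEU with multiple references. It is an illustration only, not the Moses script and not this package's code: it splits on whitespace, performs no detokenization or retokenization, and applies no smoothing. The names `ngrams` and `corpus_bleu_sketch` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu_sketch(ref_sets, hyps, max_n=4):
    """Corpus-level BLEU (0-100) on whitespace-split tokens.

    ref_sets: list of reference lists, each parallel to hyps.
    hyps: list of hypothesis sentences.
    """
    matches = [0] * max_n  # clipped n-gram matches per order
    totals = [0] * max_n   # hypothesis n-gram counts per order
    hyp_len = ref_len = 0

    for i, hyp in enumerate(hyps):
        hyp_tok = hyp.split()
        refs_tok = [refs[i].split() for refs in ref_sets]
        hyp_len += len(hyp_tok)
        # Effective reference length: closest to the hypothesis length,
        # ties broken in favour of the shorter reference.
        ref_len += min((abs(len(r) - len(hyp_tok)), len(r)) for r in refs_tok)[1]
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            max_ref = Counter()
            for r in refs_tok:
                max_ref |= ngrams(r, n)  # element-wise maximum over references
            matches[n - 1] += sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    if min(matches) == 0:
        return 0.0  # no smoothing: any empty n-gram order gives zero
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity * math.exp(log_precision)
```

On the first example in this README (`ref` against `hyp`) the sketch comes out at roughly 34.99, in line with the score reported for `list_bleu`. Agreement on other inputs is not guaranteed, because the Moses scripts apply their own tokenization.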
Function 1: Calculate the BLEU for lists
If you want to check only one hypothesis (a list of sentences):
>>> from bleu import list_bleu
>>> ref = ['it is a white cat .', 'wow , this dog is huge .']
>>> ref1 = ['This cat is white .', 'wow , this is a huge dog .']
>>> hyp = ['it is a white kitten .', 'wowww , the dog is huge !']
>>> hyp1 = ["it 's a white kitten .", 'wow , this dog is huge !']
>>> list_bleu([ref], hyp)
34.99
>>> list_bleu([ref, ref1], hyp1)
57.91
If you want to check multiple hypotheses (several lists of sentences):
>>> from bleu import multi_list_bleu
>>> multi_list_bleu([ref, ref1], [hyp, hyp1])
[34.99, 57.91]
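In the calls above, every reference list and every hypothesis list is parallel: item i of each list refers to the same source sentence. A small check along the following lines can catch misaligned inputs before scoring. `check_parallel` is a hypothetical helper, not part of the package.

```python
def check_parallel(ref_lists, hyp_lists):
    """Return the shared length, or raise ValueError if the lists are not parallel."""
    lengths = {len(lst) for lst in list(ref_lists) + list(hyp_lists)}
    if len(lengths) != 1:
        raise ValueError(f"lists differ in length: {sorted(lengths)}")
    return lengths.pop()


ref = ['it is a white cat .', 'wow , this dog is huge .']
ref1 = ['This cat is white .', 'wow , this is a huge dog .']
hyp = ['it is a white kitten .', 'wowww , the dog is huge !']
hyp1 = ["it 's a white kitten .", 'wow , this dog is huge !']

n_sentences = check_parallel([ref, ref1], [hyp, hyp1])
```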
detok=False: Tokenized BLEU (computed by multi-bleu.perl) is not advisable, but if you need it, pass detok=False:
>>> list_bleu([ref], hyp, detok=False)
39.76
>>> # or if you want to test multiple hypotheses
>>> multi_list_bleu([ref, ref1], [hyp, hyp1], detok=False)
[39.76, 47.47]
verbose=True: If there are unexpected errors, you can inspect the intermediate steps by passing verbose=True.
Function 2: Calculate the BLEU for files
If you want to check only one hypothesis file:
>>> # if you already have the following files
>>> from bleu import file_bleu
>>> hyp_file = 'data/hyp0.txt'
>>> ref_files = ['data/ref0.txt', 'data/ref1.txt']
>>> file_bleu(ref_files, hyp_file)
34.99
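If the files do not exist yet, they can be written from sentence lists. The sketch below assumes one sentence per line, which is inferred from the "#lines in each file" message in the verbose log rather than stated by the package. `write_lines` is a hypothetical helper, and a temporary directory stands in for `data/`.

```python
import tempfile
from pathlib import Path


def write_lines(path, sentences):
    """Write one sentence per line (UTF-8) and return the path as a string."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("".join(s + "\n" for s in sentences), encoding="utf-8")
    return str(path)


ref = ['it is a white cat .', 'wow , this dog is huge .']
ref1 = ['This cat is white .', 'wow , this is a huge dog .']
hyp = ['it is a white kitten .', 'wowww , the dog is huge !']

data_dir = Path(tempfile.mkdtemp()) / "data"
ref_files = [
    write_lines(data_dir / "ref0.txt", ref),
    write_lines(data_dir / "ref1.txt", ref1),
]
hyp_file = write_lines(data_dir / "hyp0.txt", hyp)
```

The resulting `ref_files` and `hyp_file` can then be passed to `file_bleu(ref_files, hyp_file)`.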
If you want to check multiple hypothesis files:
>>> from bleu import multi_file_bleu
>>> hyp_file1 = 'data/hyp1.txt'
>>> bleus = multi_file_bleu(ref_files, [hyp_file, hyp_file1])
>>> bleus
[34.99, 57.91]
detok=False: Set it if you want to calculate the (not recommended) tokenized BLEU.
verbose=True: Set it if you want to inspect how the BLEU calculations are made:
>>> bleu = file_bleu(ref_files, hyp_file, verbose=True)
[Info] Valid Reference Files: ['data/ref0.txt', 'data/ref1.txt']
[Info] Valid Hypothesis Files: ['data/hyp0.txt']
[Info] #lines in each file: 2
[cmd] perl detokenizer.perl -l en < data/ref0.txt > data/ref0.detok.txt 2>/dev/null
[cmd] perl detokenizer.perl -l en < data/ref1.txt > data/ref1.detok.txt 2>/dev/null
[cmd] perl detokenizer.perl -l en < data/hyp0.txt > data/hyp0.detok.txt 2>/dev/null
[cmd] perl multi-bleu-detok.perl data/ref0.detok.txt data/ref1.detok.txt < data/hyp0.detok.txt
2-ref bleu for data/hyp0.detok.txt: 34.99
>>> bleu
34.99
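The [cmd] lines in the log show a two-stage pipeline: each file is detokenized first, then the detokenized hypothesis is scored against the detokenized references. The sketch below only rebuilds those command strings so the order of steps is easy to follow. It is a reconstruction from the log, not the package's implementation, and `detok_path` and `build_commands` are hypothetical names.

```python
from pathlib import Path


def detok_path(path):
    """Map data/ref0.txt to data/ref0.detok.txt, mirroring the names in the log."""
    p = Path(path)
    return p.with_name(p.stem + ".detok" + p.suffix).as_posix()


def build_commands(ref_files, hyp_file, lang="en"):
    """Return the shell commands from the verbose log as strings (nothing is executed)."""
    # Stage 1: detokenize every reference file and the hypothesis file.
    cmds = [
        f"perl detokenizer.perl -l {lang} < {f} > {detok_path(f)} 2>/dev/null"
        for f in list(ref_files) + [hyp_file]
    ]
    # Stage 2: score the detokenized hypothesis against all detokenized references.
    refs = " ".join(detok_path(f) for f in ref_files)
    cmds.append(f"perl multi-bleu-detok.perl {refs} < {detok_path(hyp_file)}")
    return cmds


cmds = build_commands(['data/ref0.txt', 'data/ref1.txt'], 'data/hyp0.txt')
```

Running the printed commands by hand, one at a time, is one way to see which stage produces an unexpected result.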
Function 3: Detokenize files
>>> from bleu import detok_files
>>> detok_ref_files = detok_files(ref_files, tmp_dir='./data', file_prefix='ref_dtk', verbose=True)
[cmd] perl ./TMP_DIR/detokenizer.perl -l en < data/ref0.txt > data/ref_dtk0.txt 2>/dev/null
[cmd] perl ./TMP_DIR/detokenizer.perl -l en < data/ref1.txt > data/ref_dtk1.txt 2>/dev/null
>>> detok_ref_files
['data/ref_dtk0.txt', 'data/ref_dtk1.txt']
In Case of Unexpected Outputs
Check the Python file bleu.py and adapt it to your needs.
Contact
If you have more questions, feel free to check out the common Q&A, or raise a new GitHub issue.
In case of really urgent needs, contact the author Zhijing Jin (Miss).