Evaluates the linguistic and structural quality of scientific texts.
# Confopy
Assessing the linguistic and structural quality of scientific texts. Written in Python.
Name origin: Confopy := Conform + Python
# Installation
    sudo apt-get install python-pdfminer
    # lxml==2.3.2
    sudo pip install -U lxml
    sudo pip install -U numpy
    sudo pip install -U pyyaml nltk
    sudo pip install -U pyenchant # spell checking
    sudo pip install -U pattern
    #sudo pip install -U pyparsing # for nltk_contrib

    # Install nltk_contrib:
    #cd confopy/contrib/nltk_contrib
    #python setup.py build
    #sudo python setup.py install
# Python 3
* The package python-pdfminer only works with Python 2 (version 2.4 or newer); it does not work with Python 3.
# Unicode errors
* Configure your terminal to use Unicode.
* For Python developers, see the Unicode HOWTO:
  http://docs.python.org/2/howto/unicode.html#the-unicode-type
* Convert the TIGER Treebank Version 1 file "tiger_release_july03.penn" to UTF-8 encoding before using Confopy.
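
The conversion of the treebank file can be done with any tool; the following is a minimal Python sketch. It assumes the file as distributed is ISO-8859-1 (Latin-1) encoded, which this README does not state, so check the actual encoding first. The function name `convert_to_utf8` and the output file name used below are hypothetical and not part of Confopy.

    import io

    def convert_to_utf8(src_path, dst_path, src_encoding="iso-8859-1"):
        """Re-encode a text file as UTF-8.

        src_encoding is an assumption: adjust it if the source file
        uses a different encoding.
        """
        # io.open behaves the same on Python 2 and Python 3.
        # newline="" leaves the original line endings untouched.
        with io.open(src_path, "r", encoding=src_encoding, newline="") as src:
            with io.open(dst_path, "w", encoding="utf-8", newline="") as dst:
                for line in src:
                    dst.write(line)

For example, `convert_to_utf8("tiger_release_july03.penn", "tiger_release_july03.utf8.penn")` writes a UTF-8 copy next to the original and leaves the original untouched.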