Confopy
=======

Assesses the linguistic and structural quality of scientific texts.

Confopy is a command-line tool that accepts one or more PDF documents and prints textual reports.
Currently it only supports papers written in German.

Name origin: Confopy := Conform + Python

Installation
============

Installation using PyPI (preferred)
-----------------------------------

    sudo pip install -U Confopy

Launch Confopy with

    confopy --help
    confopy -r document your_paper.pdf

Manual installation
-------------------

Dependencies:

    sudo apt-get install python-pdfminer

    sudo pip install -U lxml
    sudo pip install numpy==1.6.2
    sudo pip install pyyaml nltk==3.0.0
    sudo pip install pyenchant==1.6.5
    sudo pip install pattern==2.6

Launch Confopy with

    python confopy/ --help
    python confopy/ -r document your_paper.pdf

Usage
=====

    $ confopy -h
    usage: confopy [-h] [-l LANGUAGE] [-lx] [-ml] [-o OUTFILE] [-r REPORT] [-rl]
                   [-ul] [-vl] [-x]
                   [file [file ...]]

    Language and structure checker for scientific documents.

    positional arguments:
      file                  Document file to analyze (PDF).

    optional arguments:
      -h, --help            show this help message and exit
      -l LANGUAGE, --language LANGUAGE
                            Language to use for PDF extraction and document
                            analysis. Default: de
      -lx, --latex          Tell the specified report to format output as LaTeX
                            (if supported by the report).
      -ml, --metriclist     Lists all available metrics by language and exits.
      -o OUTFILE, --outfile OUTFILE
                            File to write the output too. Default: terminal
                            (stdout).
      -r REPORT, --report REPORT
                            Analyses the given document according to the specified
                            report.
      -rl, --reportlist     Lists all available reports by language and exits.
      -ul, --rulelist       Lists all rules and exits.
      -vl, --validate       Validates a given XML against the XSD for the Confopy
                            data model.
      -x, --xml             Converts the PDF file(s) to Confopy XML (structure
                            orientated).
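
When Confopy is driven from a script, the options above can be assembled programmatically. The following is a minimal sketch, not part of Confopy itself: the helper names `build_confopy_command` and `run_confopy` are hypothetical, and only flags listed in the help text (`-r`, `-l`, `-lx`, `-o`) are used. It assumes the `confopy` executable is on the PATH.

```python
import subprocess


def build_confopy_command(files, report="document", language="de",
                          outfile=None, latex=False):
    """Assemble a confopy command line from options shown in the help text.

    Hypothetical helper: it only builds the argument list and does not
    check that the report name or language is known to Confopy.
    """
    if not files:
        raise ValueError("at least one PDF file is required")
    command = ["confopy", "-r", report, "-l", language]
    if latex:
        command.append("-lx")            # ask the report for LaTeX output
    if outfile is not None:
        command.extend(["-o", outfile])  # write the report to a file
    command.extend(files)                # positional PDF arguments come last
    return command


def run_confopy(files, **options):
    """Run confopy and return its exit status (needs confopy on the PATH)."""
    return subprocess.call(build_confopy_command(files, **options))
```

Calling `run_confopy(["paper_a.pdf", "paper_b.pdf"], outfile="report.txt")` would then analyse both files with the `document` report and write the result to `report.txt`.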

Getting a corpus
================

Confopy needs a corpus (collection of language data) to run.

For German (TIGER treebank):

Automated download:

1. Go to <your python package directory>/confopy/localization/de/corpus_de/
2. Execute the script tiger_dl_patch.py within that folder.
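
If it is unclear where the package directory is, the path from step 1 can be derived in Python. This is a sketch under assumptions: the helper `corpus_dir` is hypothetical, and it guesses that Confopy was installed into the `purelib` directory of the interpreter running the snippet. On some systems (for example Debian-style `dist-packages` layouts) the real location may differ, so pass the directory explicitly in that case.

```python
import os
import sysconfig


def corpus_dir(package_directory=None):
    """Return the expected German corpus folder inside a Confopy install.

    package_directory is the directory that contains the "confopy" package.
    When omitted, the interpreter's "purelib" path is used as a guess.
    """
    if package_directory is None:
        package_directory = sysconfig.get_paths()["purelib"]
    return os.path.join(package_directory, "confopy",
                        "localization", "de", "corpus_de")


if __name__ == "__main__":
    path = corpus_dir()
    print(path)
    print("exists" if os.path.isdir(path) else "not found, check the install location")
```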

Manual download:

1. Go to the license page:
http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/license/htmlicense.html
2. Accept the license and download TIGER-XML Release 2.2:
http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/download/tigercorpus-2.2.xml.tar.gz
3. Unpack the archive into confopy/localization/de/corpus_de/
4. Run the patch script tiger_release_aug07.corrected.16012013_patch.py in the same folder.
5. Verify that the name of the generated file exactly matches the one set in confopy/config.py.
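
Step 3 of the manual download can also be done from Python. The sketch below assumes the archive has already been downloaded after accepting the license; the helpers `unpack_corpus` and `find_xml_files` are hypothetical names introduced for illustration. It does not run the patch script or read confopy/config.py, so steps 4 and 5 remain manual.

```python
import os
import tarfile


def unpack_corpus(archive_path, corpus_dir):
    """Extract a .tar.gz archive into corpus_dir and return the member names.

    Only use this on archives from a trusted source, since extractall()
    writes whatever paths the archive contains.
    """
    if not os.path.isdir(corpus_dir):
        os.makedirs(corpus_dir)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        archive.extractall(corpus_dir)
    return names


def find_xml_files(corpus_dir):
    """List the XML files in corpus_dir, to compare against confopy/config.py."""
    return sorted(name for name in os.listdir(corpus_dir)
                  if name.endswith(".xml"))
```

After unpacking and patching, the output of `find_xml_files` can be compared by eye with the file name configured in confopy/config.py.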

Python 3
========

* The python-pdfminer package requires Python 2.4 or newer and does not work with Python 3.

Known Issues and Workarounds
===============================

enchant.errors.DictNotFoundError: Dictionary for language 'de_DE' could not be found
------------------------------------------------------------------------------------

Install the German aspell package, for example on Ubuntu 16.04:

```
sudo apt install aspell-de
```
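
Whether the dictionary is visible to pyenchant can be checked before running Confopy. This is a small sketch using pyenchant's `enchant.dict_exists`; the wrapper `has_dictionary` is a hypothetical helper, and it returns `None` when pyenchant itself cannot be imported.

```python
def has_dictionary(tag="de_DE"):
    """Return True/False if pyenchant can answer, or None if it is missing."""
    try:
        import enchant
    except ImportError:
        # pyenchant (or the underlying enchant C library) is not installed.
        return None
    return bool(enchant.dict_exists(tag))


if __name__ == "__main__":
    result = has_dictionary()
    if result is None:
        print("pyenchant is not available")
    elif result:
        print("de_DE dictionary found")
    else:
        print("de_DE dictionary missing, install a German dictionary package")
```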

Unicode errors
--------------

* Configure your terminal to use Unicode.
* For Python developers:
http://docs.python.org/2/howto/unicode.html#the-unicode-type
* Convert the TIGER treebank file
"tiger_release_aug07.corrected.16012013.xml"
to UTF-8 encoding before using Confopy.
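
The conversion of the treebank file can be sketched in a few lines of Python. Two assumptions are made here and should be checked against the actual file: that the source is encoded as ISO-8859-1, and that a first-line XML declaration should be updated to match the new encoding. The function name `convert_to_utf8` is hypothetical.

```python
import io
import re


def convert_to_utf8(src_path, dst_path, source_encoding="iso-8859-1"):
    """Re-encode a text file as UTF-8, line by line.

    source_encoding is an assumption; adjust it if the file declares
    a different encoding.
    """
    with io.open(src_path, "r", encoding=source_encoding, newline="") as src:
        with io.open(dst_path, "w", encoding="utf-8", newline="") as dst:
            for index, line in enumerate(src):
                if index == 0 and line.startswith("<?xml"):
                    # Keep the XML declaration consistent with the new encoding.
                    line = re.sub(r'encoding="[^"]*"', 'encoding="utf-8"', line)
                dst.write(line)
```

Writing to a separate destination file keeps the original intact, so the result can be inspected before it replaces the source.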

Release history
===============

* 0.4.11 (Nov 21, 2016, this version)
* 0.4.10 (Nov 21, 2016)
* 0.4.9 (Nov 16, 2015)
* 0.4.8.1 (Oct 8, 2015)
* 0.4.8 (Oct 8, 2015)
* 0.4.7 (Dec 7, 2014)
* 0.4.6 (Nov 24, 2014)
* 0.4.5.3 (Nov 24, 2014)
* 0.4.5.2 (Nov 24, 2014)
* 0.4.5.1 (Nov 24, 2014)
* 0.4.5 (Nov 20, 2014)
* 0.4.4 (Nov 18, 2014)
* 0.4.3 (Nov 14, 2014)
* 0.4.2 (Nov 13, 2014)
* 0.4.1 (Nov 4, 2014)
* 0.4.0 (Nov 3, 2014)
* 0.3.10 (Oct 2, 2014)
* 0.3.9 (Sep 20, 2014)
* 0.3.8 (Sep 20, 2014)
* 0.3.7 (Sep 20, 2014)
* 0.3.6 (Sep 19, 2014)
* 0.3.5 (Sep 9, 2014)
* 0.3.4 (Sep 8, 2014)
* 0.3.3 (Sep 6, 2014)
* 0.3.2 (Aug 25, 2014)
* 0.3.1 (Aug 25, 2014)
* 0.3.0 (Aug 18, 2014)
* 0.2.2 (Aug 11, 2014)
* 0.2.1 (Aug 11, 2014)
* 0.2.0 (Jul 24, 2014)
* 0.1.16 (Jul 23, 2014)
* 0.1.15 (Jul 5, 2014)
* 0.1.14 (Jul 3, 2014)
* 0.1.13 (Jul 3, 2014)
* 0.1.12 (Jul 3, 2014)
* 0.1.11 (Jul 3, 2014)
* 0.1.10 (May 24, 2014)
* 0.1.9 (May 24, 2014)
* 0.1.8 (May 24, 2014)
* 0.1.7 (May 24, 2014)
* 0.1.6 (May 24, 2014)
* 0.1.5 (May 24, 2014)
* 0.1.4 (May 24, 2014)
* 0.1.3 (May 24, 2014)
* 0.1.2 (May 24, 2014)
* 0.1.1 (May 24, 2014)
* 0.1 (May 20, 2014)

Download files
==============

Source distribution: Confopy-0.4.11.tar.gz (48.3 kB), uploaded Nov 21, 2016. Not uploaded using Trusted Publishing.

File hashes for Confopy-0.4.11.tar.gz:

* SHA256: `5409185b0a5bd89837ed75f2c4378b9d889439878a67d4884644b9e7e2c82072`
* MD5: `ead4cbb9facd4ed9ff24ffbf41b6f440`
* BLAKE2b-256: `c3036d429315bd3382c89ce4c78b2bf460cf7db98b18b5225532a6107f1b7628`
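
To check a downloaded archive against the SHA256 digest listed above, a short script with Python's standard `hashlib` module is enough. This is a minimal sketch; the helper names `sha256_of` and `verify` are hypothetical, and the file is assumed to have been saved locally as Confopy-0.4.11.tar.gz.

```python
import hashlib

# SHA256 digest published for Confopy-0.4.11.tar.gz (copied from the list above).
EXPECTED_SHA256 = "5409185b0a5bd89837ed75f2c4378b9d889439878a67d4884644b9e7e2c82072"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

A call such as `verify("Confopy-0.4.11.tar.gz")` returning `False` means the file differs from the published release and should not be installed.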
