A mixed bag of Python3 tools for PMA workflow

PMIX: Questionnaire Language Utilities

A mixed bag of PMA2020 utilities, all of which work with spreadsheets. The main features, each described in its own section below, are Analytics, Borrow, Cascade, Numbering, Workbook, XlsDiff, and Viffer.

Formerly qlang, this package has been renamed and expanded to provide new functionality and new command-line tools. The command line tools are described after installation.

This version requires Python 3 or later. Python 2 is not supported.

Installation

This package is on PyPI! Run:

python3 -m pip install pmix

For developers, to install from GitHub, run:

python3 -m pip install https://github.com/PMA-2020/pmix/zipball/master

Analytics

Usage

python3 -m pmix.analytics FILE1 [FILE2 ...]

creates a JSON file describing the prompts and fields for analytics.

Borrow

The purpose of the Pmix Borrow module is to assist with translation management of ODK forms. It is especially useful for merging translations from one file into another.

Command Line Usage

This module is called with

python3 -m pmix.borrow

and it does two things. Without the -m argument, it simply creates a translation dictionary. The source string is in the first column, and the target languages are in the subsequent columns. With the -m argument, it creates a translation dictionary and then merges those translations into the file specified by -m.

Examples
  1. Without -m,
python3 -m pmix.borrow FILE1 [FILE2 ...]

creates a translation dictionary from FILE1 [FILE2 ...].

  2. With -m,
python3 -m pmix.borrow -m TARGET FILE1 [FILE2 ...]

creates a translation dictionary from FILE1 [FILE2 ...] and then merges into TARGET.

In both examples, a default output filename is used, but one can be specified with the -o argument.
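
The two steps described above (build a translation dictionary, then merge it into a target) can be sketched in plain Python. This is only an illustration of the idea, not pmix's implementation: the function names, the row-as-dict layout, and the "first translation seen wins" rule are all assumptions made for the example.

```python
def build_translation_dict(rows, source_lang, target_langs):
    """Map each source string to its translations, keyed by language."""
    translations = {}
    for row in rows:
        source = (row.get(source_lang) or "").strip()
        if not source:
            continue
        entry = translations.setdefault(source, {})
        for lang in target_langs:
            text = (row.get(lang) or "").strip()
            if text:
                # Assumption for this sketch: the first translation seen wins.
                entry.setdefault(lang, text)
    return translations


def merge_translations(target_rows, translations, source_lang, target_langs):
    """Return copies of target_rows with known translations filled in."""
    merged = []
    for row in target_rows:
        new_row = dict(row)
        source = (row.get(source_lang) or "").strip()
        for lang in target_langs:
            borrowed = translations.get(source, {}).get(lang)
            if borrowed is not None:
                new_row[lang] = borrowed
        merged.append(new_row)
    return merged


# Hypothetical data standing in for FILE1 and TARGET.
source_rows = [
    {"English": "Hello!", "Français": "Bonjour!"},
    {"English": "Thank you", "Français": "Merci"},
]
target_rows = [
    {"English": "Hello!", "Français": ""},
    {"English": "Goodbye", "Français": ""},
]

translations = build_translation_dict(source_rows, "English", ["Français"])
merged = merge_translations(target_rows, translations, "English", ["Français"])
print(merged)
```

Rows whose source string is not in the dictionary are left unchanged, which matches the idea that only known translations are borrowed.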

The Input File

The input file can be one of two kinds:

  1. A standard ODK file.
  2. A raw translations file.

A raw translations file has the following form, using English and French as examples:

text::English text::Français ... text::<language n>
Hello! Bonjour! ... <"Hello!" in language n>

Diverse translations

A set of command-line options is available for working with diverse translations, that is, source text that has more than one distinct translation in the same language.

  • -D This option, used without an argument, means that if text has diverse translations, it is not borrowed. It only has an effect with -m.
  • -C CORRECT This option marks a file as correct. Fill in CORRECT with a path to a source file. Its translations are given precedence over others. If there is only one input file, and it is correct, then there is no need to mark it correct because nothing can override it.
  • -d DIVERSE Give a language found in the forms for DIVERSE. This option is used without -m. It creates a file with only strings that have diverse translations in the supplied language from the source files.
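
As a rough illustration of what "diverse translations" means, the sketch below collects, for one language, every source string that has more than one distinct translation. It is not pmix code; the function name `find_diverse` and the row-as-dict layout are assumptions for the example.

```python
from collections import defaultdict


def find_diverse(rows, source_lang, language):
    """Return source strings that have more than one distinct translation."""
    seen = defaultdict(set)
    for row in rows:
        source = (row.get(source_lang) or "").strip()
        text = (row.get(language) or "").strip()
        if source and text:
            seen[source].add(text)
    # Keep only the strings translated in more than one way.
    return {src: sorted(texts) for src, texts in seen.items() if len(texts) > 1}


# Hypothetical rows gathered from several source files.
rows = [
    {"English": "Yes", "Français": "Oui"},
    {"English": "Yes", "Français": "Ouais"},
    {"English": "No", "Français": "Non"},
    {"English": "No", "Français": "Non"},
]

diverse = find_diverse(rows, "English", "Français")
print(diverse)
```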

The Output File

A resultant file with merged translations has the following possible highlighting:

  • #ffd3b6 Orange if the source and the translation are the same.
  • #9acedf Blue if the new translation changes the old translation.
  • #d3d3d3 Grey if the new translation fills in a previously missing translation (blank cell).
  • #85ca5d Green if the translation is not found in the TranslationDict, but there is a pre-existing translation.
  • #ffaaa5 Red if translation is not found and there is no pre-existing translation.
  • #fffa81 Yellow if using the -D option, shows strings that have diverse translations without inserting them.
  • #ffffff No highlight if the translation is the same as the pre-existing translation.
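
The highlighting rules above can be read as a small decision function. The sketch below is one possible reading, not pmix's actual logic: the function name is hypothetical, and the order in which the rules are checked (for example, that "same as source" is tested before "fills a blank") is an assumption, since the list does not state a priority.

```python
ORANGE = "#ffd3b6"
BLUE = "#9acedf"
GREY = "#d3d3d3"
GREEN = "#85ca5d"
RED = "#ffaaa5"
YELLOW = "#fffa81"
NO_HIGHLIGHT = "#ffffff"


def pick_highlight(source, old, new, diverse=False):
    """Choose a highlight color for one translated cell.

    source:  the source-language text
    old:     the pre-existing translation ("" if the cell was blank)
    new:     the translation found in the dictionary, or None if not found
    diverse: True if -D is in use and the text has diverse translations
    """
    if diverse:
        return YELLOW          # diverse translation, shown but not inserted
    if new is None:
        return GREEN if old else RED
    if new == source:
        return ORANGE          # translation identical to the source text
    if not old:
        return GREY            # fills a previously blank cell
    if new != old:
        return BLUE            # changes the old translation
    return NO_HIGHLIGHT        # same as the pre-existing translation


print(pick_highlight("Hello!", "", "Bonjour!"))
```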

Cascade

Usage

python3 -m pmix.cascade FILE

creates a new Excel spreadsheet after converting geographic identifiers from wide format to tall format.
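
For readers unfamiliar with the terms, the sketch below shows a generic wide-to-tall conversion for nested geographic levels. It does not reproduce the layout pmix.cascade reads or writes: the level names and the output keys `list_name`, `name`, and `parent` are hypothetical and chosen only for illustration.

```python
def wide_to_tall(rows, levels):
    """Turn one-row-per-location data into one row per (level, name, parent)."""
    tall = []
    seen = set()
    for row in rows:
        parent = ""
        for level in levels:
            name = row[level]
            key = (level, name, parent)
            if key not in seen:
                # Emit each distinct location once, linked to its parent.
                seen.add(key)
                tall.append({"list_name": level, "name": name, "parent": parent})
            parent = name
    return tall


# Hypothetical wide-format input: one column per geographic level.
wide_rows = [
    {"region": "North", "district": "A"},
    {"region": "North", "district": "B"},
]

tall_rows = wide_to_tall(wide_rows, ["region", "district"])
for item in tall_rows:
    print(item)
```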

Numbering

Use the numbering mini-language to create question numbers for an ODK survey.

python3 -m pmix.numbering FILE

The program looks for a column entitled "N" in the "survey" worksheet. It creates numbers based on the directives there and adds them to the label columns.

Workbook

The following features are offered:

  1. Convert a worksheet to CSV with UTF-8 encoding and UNIX-style newlines.
python3 -m pmix.workbook FILE -c SHEET
  2. Remove all trailing and leading whitespace from all text cells.
python3 -m pmix.workbook FILE -w
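
The two operations above can be approximated with the Python standard library. This is a minimal sketch under the assumption that a worksheet has already been read into a list of rows; it does not use or describe pmix's own functions, and the helper names are hypothetical.

```python
import csv
import io


def strip_cells(rows):
    """Remove leading and trailing whitespace from every text cell."""
    return [
        [cell.strip() if isinstance(cell, str) else cell for cell in row]
        for row in rows
    ]


def to_csv_text(rows):
    """Serialize rows as CSV text with UNIX-style ("\\n") line endings."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical worksheet contents.
sheet = [["  name ", "label::Français "], ["q1", " Prénom"]]

cleaned = strip_cells(sheet)
csv_text = to_csv_text(cleaned)
csv_bytes = csv_text.encode("utf-8")  # bytes as they would be written to disk
print(csv_text)
```

Setting `lineterminator="\n"` matters because `csv.writer` defaults to `"\r\n"` line endings.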

XlsDiff

A utility for showing the differences between two Excel files.

python3 -m pmix.xlsdiff FILE1 FILE2 --excel

The above command creates a new Excel file: a copy of FILE2 with highlighting that shows the differences.

#ff0000 Red -- Rows and columns that are duplicated and so are not compared
#FFD3B9 Orange/Peach -- Rows and columns that are in the marked up file (FILE2), but not in the other
#FFF78E Light Yellow -- Cells that are different between the two files
#00ff00 Green -- Rows that are in a changed order
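
To make the idea of a cell-level difference concrete, here is a small sketch that compares two sheets position by position. It is an assumption-laden illustration, not how pmix.xlsdiff works: the function name is hypothetical, and it ignores duplicates and reordered rows, both of which the legend above says the real tool reports.

```python
def diff_cells(base, new):
    """List (row, column, old value, new value) for every differing cell in new."""
    differences = []
    for r, new_row in enumerate(new):
        base_row = base[r] if r < len(base) else []
        for c, value in enumerate(new_row):
            old = base_row[c] if c < len(base_row) else None
            if old != value:
                differences.append((r, c, old, value))
    return differences


# Hypothetical sheet contents for FILE1 and FILE2.
file1 = [["name", "label"], ["q1", "Age"]]
file2 = [["name", "label"], ["q1", "Age in years"], ["q2", "Sex"]]

changes = diff_cells(file1, file2)
for change in changes:
    print(change)
```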

Options
Short Flag   Long Flag   Description
-h           --help      Show this help message and exit.
-r           --reverse   Reverse the order of the base file and the new file for processing.
-s           --simple    Do a simple diff instead of the default ODK diff.
-e           --excel     Path to write Excel output. If flag is given with no argument then default out path is used. If flag is omitted, then write text output to STDOUT.

Viffer

Viffer is a tool that provides a tabulated report on the differences between two XlsForms. Viffer stands for "Version Diff'er".

This tool is currently under development under another fork of pmix. If interested in using it, please see: https://github.com/joeflack4/pmix/tree/feature_viffer#viffer

Example Usage

Generate a Viffer report:

python -m pmix.viffer old_form.xlsx new_form.xlsx

Bugs

Submit bug reports to James Pringle at jpringleBEAR@jhu.edu minus the bear.
