
Utilities to help build Workbench modules

Parsers for CJWorkbench modules.

Workbench modules may optionally depend on the latest version of this Python package for its cjwparse.api.parse_file() function.


This is meant to be used within a Docker container. It depends on executables /usr/bin/(csv|json|xls|xlsx)-to-arrow.

Your Dockerfile might look something like this:

FROM arrow-tools:v0.0.11 AS arrow-tools
FROM python:3.8.1-buster AS main

COPY --from=arrow-tools /usr/bin/csv-to-arrow /usr/bin/csv-to-arrow
COPY --from=arrow-tools /usr/bin/json-to-arrow /usr/bin/json-to-arrow
COPY --from=arrow-tools /usr/bin/xls-to-arrow /usr/bin/xls-to-arrow
COPY --from=arrow-tools /usr/bin/xlsx-to-arrow /usr/bin/xlsx-to-arrow

# And now that these binaries are accessible, you can install cjwparse...
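
Because the package shells out to those four binaries, a startup check can surface a missing COPY line early. The following is a minimal sketch, not part of cjwparse: the helper name missing_arrow_tools and its bin_dir parameter are assumptions made for illustration; only the tool names and the /usr/bin location come from the text above.

```python
from pathlib import Path
from typing import List

# Converter names taken from the Dockerfile above.
REQUIRED_TOOLS = ("csv-to-arrow", "json-to-arrow", "xls-to-arrow", "xlsx-to-arrow")


def missing_arrow_tools(bin_dir: Path = Path("/usr/bin")) -> List[str]:
    """Return the required converter names that are not regular files in bin_dir.

    Hypothetical helper: bin_dir is a parameter only so the check can be
    exercised outside the container.
    """
    return [name for name in REQUIRED_TOOLS if not (bin_dir / name).is_file()]
```

Calling missing_arrow_tools() inside the container and failing fast on a non-empty result gives a clearer error than a failed subprocess call later on.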


Usage

from pathlib import Path

import pyarrow

from cjwparse.api import parse_file

# Convert a CSV file 'input.csv' to Arrow file 'output.arrow'
input_path = Path("input.csv")
output_path = Path("output.arrow")
parse_file(input_path, output_path=output_path, has_headers=False)

# Read the output
with pyarrow.ipc.open_file(output_path) as reader:
    table = reader.read_all()


Developing

  1. Run tests: docker build .
  2. Write a failing unit test in tests/
  3. Make it pass by editing code in cjwparse/
  4. black cjwparse maintenance tests && isort cjwparse maintenance tests && python3 ./ extract_messages
  5. Submit a pull request

Be very, very, very careful to preserve a consistent API. Workbench will upgrade this dependency without module authors' explicit consent. Add new features; fix bugs. Never change functionality.


Marking strings for translation

Strings in cjwparse can be marked for translation using cjwparse.i18n._trans_cjwparse. Each translation message must have a (unique) ID. ICU is supported for the content. For example:

from .i18n import _trans_cjwparse

err = "Error 404"

with_arguments = _trans_cjwparse(
    "greatapi.exception.message",
    "Something is wrong: {error}",
    {"error": err},
)

without_arguments = _trans_cjwparse(
    "greatapi.exception.general",
    "Something is wrong",
)

Workbench is wired to accept the return value of _trans_cjwparse wherever an error/warning or quick fix is expected.

Creating po catalogs

Calls to _trans_cjwparse can (and must) be parsed to create cjwparse's .po files. Update the .po files with:

./ extract_messages

The first time you run this, you'll need to install dependencies: pip3 install .[maintenance]

Unit testing

In case a _trans_cjwparse invocation needs to be unit tested, you can use cjwparse.testing.i18n.cjwparse_i18n_message in a manner similar to the following:

from cjwparse.testing.i18n import cjwparse_i18n_message
import with_arguments, without_arguments

assert with_arguments == cjwparse_i18n_message("greatapi.exception.message", {"error": "Error 404"})
assert without_arguments == cjwparse_i18n_message("greatapi.exception.general")
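
The assertions above compare by message ID and arguments, not by rendered English text. A stand-in can illustrate why that is convenient. This is a sketch under stated assumptions: Message and make_message are hypothetical names, not cjwparse API, and the sketch assumes the returned value carries only the ID and the arguments.

```python
from typing import Any, Dict, NamedTuple, Optional


class Message(NamedTuple):
    """Hypothetical stand-in for a translatable message (not cjwparse's type)."""

    id: str
    arguments: Dict[str, Any]


def make_message(
    message_id: str, default: str, arguments: Optional[Dict[str, Any]] = None
) -> Message:
    """Build a Message; the default text is deliberately left out of the value.

    Assumption: equality depends on ID and arguments only, so rewording the
    English default does not break existing tests.
    """
    return Message(message_id, dict(arguments or {}))
```

Under that assumption, a test keeps passing when the default wording changes, and fails when the ID or the arguments change.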

Message deprecation

Never delete a _trans_cjwparse() call: each message ID, once assigned, must be preserved forever.

When no code path leads to a _trans_cjwparse() call any more, move it to cjwparse/i18n/ The file is only read by extraction code. It is never executed.
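
The "preserve every ID" rule can be checked mechanically by comparing the IDs extracted before and after a change. The helper below is a hypothetical sketch, not something cjwparse ships; how the two ID lists are obtained (for example, from the .po catalogs) is left as an assumption.

```python
from typing import Iterable, Set


def removed_message_ids(previous: Iterable[str], current: Iterable[str]) -> Set[str]:
    """Return IDs present in the previous extraction but absent from the current one.

    Hypothetical helper: a non-empty result means a message ID was deleted,
    which the deprecation rule forbids.
    """
    return set(previous) - set(current)
```

Adding new IDs is always allowed, so the check is intentionally one-directional.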


Publishing

  1. Write a new __version__ to cjwparse/ Adhere to semver. (As changes must be backwards-compatible, the version will always start with 1 and look like 1.x.y.)
  2. Prepend notes to about the new version
  3. git commit
  4. git tag v1.x.y
  5. git push --tags && git push
  6. Wait for Travis to push our changes to PyPI
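
The version and tag rules in the steps above can be sketched as a small pre-release check. The function names here are hypothetical and not part of the project's tooling; the sketch assumes plain MAJOR.MINOR.PATCH versions with no pre-release suffix.

```python
import re
from typing import Tuple

_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints, or raise ValueError."""
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_compatible_bump(old: str, new: str) -> bool:
    """True if new is strictly greater than old and keeps the same major version."""
    old_v, new_v = parse_version(old), parse_version(new)
    return new_v[0] == old_v[0] and new_v > old_v


def tag_for(version: str) -> str:
    """Return the git tag name for a version, following the v1.x.y pattern."""
    parse_version(version)  # validate before building the tag
    return f"v{version}"
```

Running is_compatible_bump against the previously released version before tagging catches an accidental major bump or a forgotten version change.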
