Python modules and scripts for working with Concrete

Copyright 2012-2017 Johns Hopkins University HLTCOE. All rights reserved. This software is released under the 2-clause BSD license. Please see LICENSE for more information.

Concrete-Python

Concrete-Python is the Python interface to Concrete, an HLT data specification defined using Thrift.

Concrete-Python contains generated Python classes and additional utilities. It does not contain the Thrift schema for Concrete, which can be found in the Concrete GitHub repository.

Requirements

Concrete-Python requires Python 2.7 and the Thrift Python library, among other Python libraries. These are installed automatically by setup.py or pip. The Thrift compiler is not required.

Installation

You can install Concrete-Python using the pip package manager:

pip install concrete

or by cloning the repository and running setup.py:

git clone https://github.com/hltcoe/concrete-python.git
cd concrete-python
python setup.py test
python setup.py install

Useful Scripts

The Concrete Python package comes with three scripts:

concrete_inspect.py

reads in a Concrete Communication and prints human-readable information about the Communication’s contents (such as tokens, POS and NER tags, Entities, and Situations) to stdout. This script is a command-line wrapper around the functionality in the concrete.inspect library.

concrete2json.py

reads in a Concrete Communication and prints a JSON version of the Communication to stdout. The JSON is “pretty printed” with indentation and whitespace, which makes the JSON easier to read and to use for diffs.

validate_communication.py

reads in a Concrete Communication file and prints out information about any invalid fields. This script is a command-line wrapper around the functionality in the concrete.validate library.

Use the --help flag for details about the scripts’ command line arguments.
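To illustrate why indented JSON, as printed by concrete2json.py, is convenient for diffs, here is a minimal standard-library sketch. It does not call concrete2json.py or any Concrete code: the two dictionaries are hypothetical stand-ins for the JSON form of a Communication, and the use of sort_keys is an assumption made here for stable output, not a statement about what the script does.

```python
import difflib
import json


def pretty(obj):
    """Serialize obj as indented JSON, one field per line, with sorted keys."""
    return json.dumps(obj, indent=2, sort_keys=True)


# Hypothetical stand-ins for two versions of a Communication's JSON form.
before = {"id": "doc-1", "text": "hello world"}
after = {"id": "doc-1", "text": "hello, world"}

# With one field per line, a line-based diff isolates the changed field.
diff_lines = list(difflib.unified_diff(
    pretty(before).splitlines(),
    pretty(after).splitlines(),
    fromfile="before.json",
    tofile="after.json",
    lineterm="",
))
print("\n".join(diff_lines))
```

Because each field sits on its own line, the diff reports only the text field as changed; the same data serialized on a single line would show up as one large changed line.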

Using the code in your project

Concrete types are located under the ttypes module of their respective namespace in the schema. To import and use Communication, for example:

from concrete.communication.ttypes import Communication

foo = Communication()
foo.text = 'hello world'

Validating Concrete Communications

The Python version of the Thrift libraries does not perform any validation of Thrift objects. You should call the validate_communication() function after reading and before writing a Concrete Communication:

from concrete.util import read_communication_from_file
from concrete.validate import validate_communication

comm = read_communication_from_file('tests/testdata/serif_dog-bites-man.concrete')

# Returns True|False, logs details using Python stdlib 'logging' module
validate_communication(comm)

Thrift fields have three levels of requiredness:

  • explicitly labeled as required

  • explicitly labeled as optional

  • no requiredness label given (“default required”)

Other Concrete tools will raise an exception if a required field is missing on deserialization or serialization, and will raise an exception if a “default required” field is missing on serialization. By default, Concrete-Python does not perform any validation of Thrift objects on serialization or deserialization. The Python Thrift classes do provide shallow validate() methods, but they only check for explicitly required fields (not “default required” fields) and do not validate nested objects.

The validate_communication() function recursively checks a Communication object for missing required fields and also checks for UUID mismatches.
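The difference between a shallow check and a recursive one can be sketched in plain Python. Everything in this block is hypothetical and written only for illustration: Struct, Metadata, Document, SPEC, shallow_missing() and recursive_missing() are not part of Concrete-Python or Thrift, and the sketch covers only missing-field checks, not UUID checks.

```python
REQUIRED, OPTIONAL, DEFAULT = "required", "optional", "default"


class Struct(object):
    """Hypothetical stand-in for a Thrift-generated class."""
    SPEC = ()  # tuples of (field name, requiredness)

    def __init__(self, **kwargs):
        # Unset fields are stored as None.
        for name, _ in self.SPEC:
            setattr(self, name, kwargs.get(name))


class Metadata(Struct):
    SPEC = (("tool", REQUIRED), ("timestamp", REQUIRED))


class Document(Struct):
    SPEC = (("id", REQUIRED), ("metadata", DEFAULT), ("text", OPTIONAL))


def shallow_missing(obj):
    """Names of explicitly required fields that are unset on obj itself."""
    return [n for n, r in obj.SPEC if r == REQUIRED and getattr(obj, n) is None]


def recursive_missing(obj, path=""):
    """Dotted paths of unset required or default-required fields, at any depth."""
    missing = []
    for name, req in obj.SPEC:
        value = getattr(obj, name)
        if value is None:
            if req != OPTIONAL:
                missing.append(path + name)
        elif isinstance(value, Struct):
            # Descend into nested objects, which the shallow check never does.
            missing.extend(recursive_missing(value, path + name + "."))
    return missing


doc = Document(id="doc-1", metadata=Metadata(tool="example"))
print(shallow_missing(doc))
print(recursive_missing(doc))
```

In this sketch the shallow check reports nothing for doc, because its only explicitly required field (id) is set, while the recursive check reports the unset timestamp inside the nested metadata object. A Document built without metadata at all would likewise pass the shallow check but fail the recursive one, since metadata is "default required".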

Development

Please see CONTRIBUTING.rst for information about contributing to Concrete-Python.

Contributors

  • Craig Harman

  • Low Kian Seong

  • Frank Ferraro

  • Max Thomas

  • Adrian Benton

  • Joel Coffman

  • Chandler May

  • Tom Lippincott

Please contact us if you have contributed to Concrete-Python but are not on this list.
