COD CIF parser
Project description
A COD parser for CIF v1.1 and CIF v2.0 formats
Usage
datablocks, error_count, error_messages = parse( file, options )
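The three return values can be consumed as in the minimal sketch below. It assumes that datablocks is a list of data block structures carrying the name key described under Data structure, and that error_messages is a list of strings. The summarize helper is hypothetical and not part of the package, and the import path in the closing comment is likewise an assumption.

```python
def summarize(datablocks, error_count, error_messages):
    """Build a short text report from the triple returned by parse().

    Hypothetical helper for illustration; it only relies on the
    documented "name" key of each data block.
    """
    names = [block["name"] for block in datablocks]
    lines = [
        "data blocks: " + ", ".join(names),
        "errors: %d" % error_count,
    ]
    # Messages may carry trailing newlines, so strip them for display.
    lines.extend(message.strip() for message in error_messages)
    return "\n".join(lines)


# With the package installed, the call would presumably look like this
# (the import path and the file name are assumptions):
#     from pycodcif import parse
#     print(summarize(*parse("structure.cif")))
```

Unpacking the result with * keeps the helper independent of how the parser itself is invoked.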
Options
The COD CIF parser is designed to detect and report the most common CIF syntax errors, which it recognizes by means of an extended grammar. The behaviour of the parser is controlled by the following options:
fix_all: turns on all the following options;
fix_data_header: ignores stray CIF values before the first data block and missing data_ header;
fix_datablock_names: appends stray CIF values after the data block name to the data block name;
fix_duplicate_tags_with_same_values: ignores repeated occurrences of a data item in the same data block when all occurrences have the same value;
fix_duplicate_tags_with_empty_values: when a data item occurs more than once in the same data block and all but one of its values are unknown (‘?’ or ‘.’), retains the known value;
fix_string_quotes: when more than one unquoted value follows a non-loop data item, puts those values in quotes;
allow_uqstring_brackets: puts unquoted strings starting with an opening square bracket ([) in single quotes;
fix_ctrl_z: removes ^Z symbols;
fix_non_ascii_symbols: encodes non-ASCII symbols using numeric character references;
fix_missing_closing_double_quote: inserts an appropriate quote where a missing single or double closing quote is detected.
Usage example:
parse( file, { 'fix_data_header' : 1 } )
All other options are turned on/off likewise.
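When several options are needed at once, the dictionary can be assembled programmatically. The sketch below is a hypothetical convenience layer, not part of the package: make_options and KNOWN_OPTIONS are names introduced here for illustration, and the check against the documented option names is done by this helper rather than by the parser.

```python
# Option names exactly as listed in the Options section above.
KNOWN_OPTIONS = (
    "fix_all",
    "fix_data_header",
    "fix_datablock_names",
    "fix_duplicate_tags_with_same_values",
    "fix_duplicate_tags_with_empty_values",
    "fix_string_quotes",
    "allow_uqstring_brackets",
    "fix_ctrl_z",
    "fix_non_ascii_symbols",
    "fix_missing_closing_double_quote",
)


def make_options(*enabled):
    """Return an options dictionary with each named option set to 1.

    Raises ValueError for names that are not in the documented list,
    which catches typos before the parser is called.
    """
    unknown = [name for name in enabled if name not in KNOWN_OPTIONS]
    if unknown:
        raise ValueError("unknown parser option(s): " + ", ".join(unknown))
    return {name: 1 for name in enabled}


options = make_options("fix_data_header", "fix_ctrl_z")
# The dictionary would then be passed as the second argument:
#     parse( file, options )
```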
Data structure
The data blocks of parsed CIF files are stored in associative arrays with the following keys:
name (string): name of a CIF data block;
tags (array): data names present in the CIF data block (in lowercase);
values (associative array): keys are the values of the tags array, values are arrays containing values of each data item;
types (associative array): keys are the values of the tags array, values are arrays containing lexically derived data types of each data value;
precisions (associative array): keys are the values of the tags array, values are arrays containing standard uncertainties for each data item;
loops (array of arrays): each inner array corresponds to a loop from the CIF data block and contains a list of data items present in the loop;
inloop (associative array): keys are the values of the tags array, values are indices into the outer loops array. It serves as an index that speeds up finding the loop to which a data item belongs;
save_blocks (array of associative arrays): list of CIF save frames, where every frame is represented using a data structure identical to a CIF data block;
cifversion (associative array): has keys major and minor, corresponding to the major and minor version numbers of the CIF format, currently 1.1 or 2.0.
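To illustrate how these keys fit together, the sketch below builds a small data block by hand and uses inloop and loops to reassemble the rows of a loop. The sample block is invented for illustration and is not parser output. The types and precisions entries are omitted because their exact encoding is not specified above, and the code assumes that data items outside any loop have no entry in inloop. The loop_rows helper is hypothetical.

```python
# Hand-built stand-in for one parsed data block (illustrative only).
datablock = {
    "name": "example",
    "tags": ["_cell_length_a", "_atom_site_label", "_atom_site_fract_x"],
    "values": {
        "_cell_length_a": ["5.431"],
        "_atom_site_label": ["Si1", "Si2"],
        "_atom_site_fract_x": ["0.0", "0.25"],
    },
    "loops": [["_atom_site_label", "_atom_site_fract_x"]],
    "inloop": {"_atom_site_label": 0, "_atom_site_fract_x": 0},
    "save_blocks": [],
    "cifversion": {"major": 1, "minor": 1},
}


def loop_rows(block, tag):
    """Return the rows of the loop containing `tag` as a list of dicts.

    An empty list is returned when the tag is not part of any loop.
    """
    loop_index = block["inloop"].get(tag)
    if loop_index is None:
        return []
    loop_tags = block["loops"][loop_index]
    # Each tag holds one column of values; zip turns columns into rows.
    columns = [block["values"][name] for name in loop_tags]
    return [dict(zip(loop_tags, row)) for row in zip(*columns)]


rows = loop_rows(datablock, "_atom_site_label")
```

Looking the loop up through inloop avoids scanning every inner array of loops for the tag.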
Further reading
Merkys, A., Vaitkus, A., Butkus, J., Okulič-Kazarinas, M., Kairys, V. & Gražulis, S. (2016) “COD::CIF::Parser: an error-correcting CIF parser for the Perl language”. Journal of Applied Crystallography 49. doi:10.1107/S1600576715022396
Download files
Source Distribution
File details
Details for the file pycodcif-3.0.1.tar.gz.
File metadata
- Download URL: pycodcif-3.0.1.tar.gz
- Upload date:
- Size: 79.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.6.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 743f5bbe79129b51967c6b571d3e1804e36bae587cc8034f0b31617783cc34a3
MD5 | d6e24951f854d3f986f452cb35f1ea67
BLAKE2b-256 | 577a988b414e31c3eabfaa781c1647804be9a7b33583917269a388bfe0cd516c