CML tool for filtering line-of-sight contacts

filter_los_csd

This is the manual for the script that filters line-of-sight contacts from the results of a ConQuest search over Cambridge Structural Database (CSD) entries. The script is available via GitHub. For the underlying scientific context, see the corresponding work: DOI 10.1002/cphc.201901083.

If you use the script in your work, please cite generously!

I. Yu. Chernyshov, I. V. Ananyev, E. A. Pidko, Revisiting van der Waals Radii: From Comprehensive Structural Analysis to Knowledge‐Based Classification of Interatomic Contacts. ChemPhysChem 2020, 21, 370–376, DOI: 10.1002/cphc.201901083.

Intro

This manual consists of two parts, describing the installation details and the syntax of the script. Examples of input and output files for the script are provided in the example directory.

Installation

filter_los_csd is a multiplatform Python 3 package. Use pip to install it:

> pip install filter_los_csd

filter_los_csd depends on the PyCifRW package, which requires a C/C++ compiler, so pip installation will fail if you do not have one. On Windows, the best choice is Visual Studio Build Tools; on Linux, gcc is enough. If the compiler is missing, pip reports an installation error together with platform-specific advice on fixing it.

A Permission denied error may occur when running the script in some versions of Linux:

> filter_los_csd.py input.csv input.cif
-bash: /path/to/script/filter_los_csd.py: Permission denied

To fix it, set umask to 022 before script installation (see ref for more details), or just change permissions:

> sudo chmod 755 /path/to/script/filter_los_csd.py

Syntax

Input and Output

The script takes two required parameters as input:

  • path_csv: path to the CSV file containing the parameters of the contacts in question. Please note that the CSV file must contain the labels of the atoms forming each contact and the contact distance!
  • path_cif: path to a multi-entry CIF file containing the CSD entries listed in the CSV file. Please note that the CIF file must contain information on bonds, so make sure that the corresponding checkbox was selected before downloading the CIF. Also note that the script was tested only on CIF files extracted from the CSD (ConQuest)!

Thus, the simplest command is:

> filter_los_csd.py test.csv test.cif

As output, the script creates a {csv_name}_los.csv file containing the same information as the original file plus three additional columns:

  • "LOS": has four possible values:
    • "+": the corresponding contact is line-of-sight;
    • "–": the corresponding contact is not line-of-sight;
    • "?": the cell volume of the corresponding crystal exceeds the specified cutoff (see -V), so the contact was not analyzed;
    • "!": an error occurred during the calculation. The only type of error caught during testing is the inability to find a contact with the given atomic labels and distance. The main source of these errors is disorder, so tuning the --tol parameter can solve the problem.
  • "SHIELDING": contact shielding value (the definition of the term is given in the manuscript);
  • "SHIELD_ATOM": label of the shielding atom.
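A quick way to summarize a run is to tally the values in the "LOS" column. The following is a minimal sketch, not part of the package: the function name count_los_flags is hypothetical, the output is assumed to be comma-separated with a header row, and the sample rows are invented for illustration (the other columns are omitted).

```python
import csv
from collections import Counter


def count_los_flags(lines, column="LOS"):
    """Tally the values of the LOS column.

    `lines` may be an open file or any iterable of text lines
    (assumption: comma-separated, with a header row).
    """
    reader = csv.DictReader(lines)
    return Counter(row[column] for row in reader)


# Hypothetical rows, for illustration only
sample = [
    "LAB1,LAB2,DIST1,LOS",
    "C1,O1,3.10,+",
    "C2,O1,3.25,+",
    "N1,O2,2.95,?",
]
counts = count_los_flags(sample)
for flag, number in sorted(counts.items()):
    print(flag, number)
```

For a real file, pass an open handle instead of the list, e.g. open("test_los.csv", newline=""). A large share of "!" rows may suggest that --tol needs tuning, as noted above.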

The contact and the shielding atom can be visualized in two steps:

  1. find the contact by atomic labels and distance;
  2. among the atoms labelled SHIELD_ATOM, the one nearest to the contact line is the shielding atom.
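If several symmetry copies carry the SHIELD_ATOM label, the second step amounts to a point-to-line distance comparison. The sketch below only illustrates that geometry and is not the script's own algorithm: it assumes Cartesian coordinates in Å, treats the "contact line" as the segment between the two contact atoms, and uses hypothetical function names and coordinates.

```python
import math


def distance_to_segment(point, start, end):
    """Shortest distance from `point` to the segment start-end (3D, Cartesian)."""
    seg = [e - s for s, e in zip(start, end)]
    rel = [p - s for s, p in zip(start, point)]
    seg_len_sq = sum(c * c for c in seg)
    # Fraction along the segment of the closest point, clamped to [0, 1]
    if seg_len_sq == 0.0:
        t = 0.0
    else:
        t = sum(r * c for r, c in zip(rel, seg)) / seg_len_sq
    t = max(0.0, min(1.0, t))
    closest = [s + t * c for s, c in zip(start, seg)]
    return math.sqrt(sum((p - c) ** 2 for p, c in zip(point, closest)))


def nearest_copy(copies, start, end):
    """Among positions sharing the SHIELD_ATOM label, pick the nearest one."""
    return min(copies, key=lambda xyz: distance_to_segment(xyz, start, end))


# Hypothetical coordinates, for illustration only
atom_a, atom_b = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
copies = [(0.5, 2.0, 0.0), (0.5, 0.5, 0.0)]
print(nearest_copy(copies, atom_a, atom_b))
```

In practice the same comparison can be done by eye in a structure viewer; the code only makes the selection rule explicit.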

Optional parameters

In addition, the script has several optional parameters:

  • -r or --radii: type of van der Waals radii used for shielding calculation, "csd" by default. Available values are:
    • "csd": the version usually used in CCDC products (ConQuest, Mercury, etc.). It is the same as Bondi's version but with r(H) = 1.09 Å; unknown atomic radii are set to 2.0 Å;
    • "bondi": Bondi version; unknown atomic radii are set to 2.0 Å;
    • "rt": Rowland and Taylor version; unknown atomic radii are set to the "csd" values;
    • "alv": Alvarez version;
    • "chap": Chernyshov, Ananyev and Pidko (this work) version.
  • --lab1: name of CSV file column containing labels of the contact's first atom, "LAB1" by default;
  • --lab2: name of CSV file column containing labels of the contact's second atom, "LAB2" by default;
  • --dist: name of CSV file column containing contact distances, "DIST1" by default;

Please make sure that the --lab1, --lab2 and --dist values match the column names in the CSV file.
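The column names can be checked before a long run. This is a minimal sketch under stated assumptions: the helper missing_columns is hypothetical (it is not provided by filter_los_csd), and the CSV is assumed to be comma-separated with the column names in its first row.

```python
import csv


def missing_columns(lines, lab1="LAB1", lab2="LAB2", dist="DIST1"):
    """Return the expected column names that are absent from the CSV header.

    `lines` may be an open file or any iterable of text lines.
    The defaults mirror the documented defaults of --lab1, --lab2 and --dist.
    """
    header = next(csv.reader(lines), [])
    present = {name.strip() for name in header}
    return [name for name in (lab1, lab2, dist) if name not in present]


# Hypothetical header in which the distance column is called DIST2
header_only = ["LAB1,LAB2,DIST2"]
print(missing_columns(header_only))                # default names: one is missing
print(missing_columns(header_only, dist="DIST2"))  # matching name: nothing is missing
```

For a real file, pass open("test.csv", newline="") instead of the list. Any name reported as missing should be supplied through the corresponding --lab1, --lab2 or --dist option.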

  • --norm: type of X–H bond normalization, "csd" by default. Please make sure that the same normalization scheme was used for the ConQuest search. Available values are:
    • "csd": C–H, N–H and O–H bonds are normalized to 1.089 Å, 1.015 Å and 0.993 Å, respectively;
    • "no": X–H bonds are not normalized;
    • path to a file, each line of which contains a space-separated element symbol and the length of the corresponding X–H bond in angstroms, e.g. "C 1.09".
  • --tol: minimal possible distance between different atoms, default 0.005 Å.
  • -V or --volume: maximum allowed cell volume in Å³. Available values are:
    • positive numeric value: the script does not calculate contact shielding for crystals whose crystallographic cell volume exceeds the specified value. This can be useful if there are many crystals with large cells (V > 10 000 Å³), which are processed slowly.
    • unspecified: no filtering by volume is applied.
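The custom normalization file accepted by --norm has a simple line format, which can be validated before it is passed to the script. The sketch below is only an illustration of that format: parse_norm_lines is a hypothetical helper, skipping blank lines is an assumption, and the script's own parser may behave differently.

```python
def parse_norm_lines(lines):
    """Parse 'ELEMENT LENGTH' lines into a dict of X-H bond lengths in angstroms."""
    lengths = {}
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # assumption: blank lines are ignored
        if len(parts) != 2:
            raise ValueError(f"line {number}: expected 'ELEMENT LENGTH', got {line!r}")
        element, length = parts
        lengths[element] = float(length)
    return lengths


# The "csd" scheme described above, written in the custom file format
csd_like = ["C 1.089", "N 1.015", "O 0.993"]
print(parse_norm_lines(csd_like))
```

A file with these three lines would reproduce the lengths of the "csd" scheme; for a real file, pass an open handle instead of the list.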
