Python library for high-throughput .cif analysis

cifkit

cifkit provides a set of fully tested utility functions and variables for handling large datasets of .cif files, on the order of tens of thousands of files.

The current codebase and documentation are actively being improved as of Sep 18, 2024.

Features:

cifkit exposes higher-level functions that take only a few lines of code to use.

  • Coordination geometry - cifkit provides functions for visualizing the coordination geometry of each site and for extracting physics-based features such as the volume and packing efficiency of each polyhedron.
  • Atomic mixing - cifkit extracts atomic mixing information at the bond pair level, a task that would otherwise require extensive manual effort using GUI-based tools like VESTA, Diamond, and CrystalMaker.
  • Filter - cifkit offers features for preprocessing. It systematically addresses common issues in CIF files from databases, such as incorrect loop values and missing fractional coordinates, by standardizing and filtering out ill-formatted files. It also preprocesses atomic site labels, transforming labels such as 'M1' to 'Fe1' in files with atomic mixing.
  • Sort - cifkit allows you to copy, move, and sort .cif files based on attributes such as coordination numbers, space groups, unit cells, shortest distances, elements, and more.
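
As a rough illustration of the kind of check the Filter feature describes, the sketch below flags CIF text that lacks fractional-coordinate columns. It uses only the Python standard library and is not cifkit's implementation; the helper name `has_fractional_coordinates` and the two sample strings are hypothetical. The tag names are the standard CIF data names for fractional coordinates.

```python
# Standard CIF data names for fractional atomic coordinates
REQUIRED_TAGS = (
    "_atom_site_fract_x",
    "_atom_site_fract_y",
    "_atom_site_fract_z",
)


def has_fractional_coordinates(cif_text: str) -> bool:
    """Return True if every fractional-coordinate tag starts some line of the CIF text.

    Hypothetical helper for illustration only; not part of cifkit.
    """
    tags_found = {
        line.strip().split()[0]
        for line in cif_text.splitlines()
        if line.strip().startswith("_")
    }
    return all(tag in tags_found for tag in REQUIRED_TAGS)


# Made-up CIF fragments: one with coordinates, one without
good_cif = """data_example
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
Fe1 0.0 0.0 0.0
"""

bad_cif = """data_example
loop_
_atom_site_label
_atom_site_occupancy
Fe1 1.0
"""

print(has_fractional_coordinates(good_cif))
print(has_fractional_coordinates(bad_cif))
```

A check like this only tests for the presence of the tags; it does not validate the coordinate values themselves.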

Example usage 1 - coordination geometry

The example below uses cifkit to visualize the polyhedron generated from each atomic site based on the coordination number geometry.

from cifkit import Cif

cif = Cif("your_cif_file_path")
site_labels = cif.site_labels

# Loop through each site label
for label in site_labels:
    # Display each polyhedron; a .png is saved for each label
    cif.plot_polyhedron(label, is_displayed=True)

Polyhedron generation

Example usage 2 - sort

The following example generates a histogram of the structures found in a folder of .cif files.

from cifkit import CifEnsemble

ensemble = CifEnsemble("your_folder_path_containing_cif_files")
ensemble.generate_structure_histogram()

structure distribution
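
Conceptually, a structure distribution is a count of files per structure. The sketch below shows that idea with the standard library only; the list of labels is made up for illustration and this is not how cifkit computes or plots its histogram.

```python
from collections import Counter

# Hypothetical structure labels, one per .cif file in a folder
structures = ["CoIn2", "CeAl2Ga2", "CoIn2", "Co1.75Ge", "CoIn2"]

# Count how many files share each structure
counts = Counter(structures)

# List structures from most to least frequent
for structure, count in counts.most_common():
    print(f"{structure}: {count}")
```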

Based on the histogram above, you can select .cif files by specific attributes, for example before copying or moving them:

# Return file paths matching structures either Co1.75Ge or CoIn2
ensemble.filter_by_structures(["Co1.75Ge", "CoIn2"])

# Return file paths matching CeAl2Ga2
ensemble.filter_by_structures("CeAl2Ga2")
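
One way to act on the filtered paths is with the standard library. The sketch below assumes the filter result is an iterable of file path strings (the comments above say it returns file paths, but the exact container type is an assumption here). `copy_files_to` is a hypothetical helper written for this example, not a cifkit function.

```python
import shutil
from pathlib import Path
from typing import Iterable


def copy_files_to(file_paths: Iterable[str], destination: str) -> list[Path]:
    """Copy each file into the destination folder, creating it if needed.

    Hypothetical helper for illustration only; not part of cifkit.
    Returns the paths of the copied files in sorted order.
    """
    dest_dir = Path(destination)
    dest_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(file_paths):
        # shutil.copy2 preserves file metadata and returns the new file's path
        copied.append(Path(shutil.copy2(path, dest_dir)))
    return copied
```

Under that assumption, `copy_files_to(ensemble.filter_by_structures("CeAl2Ga2"), "CeAl2Ga2_files")` would collect the matching files in one folder while leaving the originals in place.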

To learn more, please read the official documentation here: https://bobleesj.github.io/cifkit.

Quotes

Here is a quote illustrating how cifkit addresses one of the challenges mentioned above.

"I am building an X-Ray diffraction analysis (XRD) pattern visualization script for my lab using pymatgen. I feel like cifkit integrated really well into my existing stable of libraries, while surpassing some alternatives in preprocessing and parsing. For example, it was often unclear at what stage an error occurred, whether during pre-processing with CifParser, or XRD plot generation with diffraction.core in pymatgen. The pre-processing logic in cifkit was communicated clearly, both in documentation and in actual outputs, allowing me to catch errors in my data before it was used in my visualizations. I now use cifkit by default for processing CIFs before they pass through the rest of my pipeline." - Alex Vtorov

Documentation

How to contribute

Here is how you can contribute to the cifkit project if you found it helpful:

  • Star the repository on GitHub and recommend it to colleagues who might find cifkit helpful as well.
  • Create a new issue on GitHub for any bugs or feature requests.
  • Fork the repository and consider contributing changes via a pull request. Check out CONTRIBUTING.md for instructions.
  • If you have any suggestions or need further clarification on how to use cifkit, please reach out to Bob Lee (@bobleesj).

To render documentation

pip install -r requirements/docs.txt
mkdocs serve

Download files

Source distribution: cifkit-1.0.5rc0.tar.gz (4.0 MB)

  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7
  • SHA256: e6d0c64c5720aaa62570504dce32acabeef7903da7b06fcc2d8a79e5d70ec0d3
  • MD5: d5bff8d7c5e158c616e6ba1e9ddfe6b5
  • BLAKE2b-256: 1e87251831901423e1c5b3c739da7b76ea7b7384aee975a336fec389e73aa137

Built distribution: cifkit-1.0.5rc0-py3-none-any.whl (2.0 MB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7
  • SHA256: 7fc5c750100630592bf1c85e39ad7b5a2be152080366cdac2e5fbb45e3fb777c
  • MD5: 7e9a077c17a6895bc596778fcddc5787
  • BLAKE2b-256: eeabdfc0524ac7c6f6e5dec1d38c227f5eb7d2c2eef15d6d85bfffbc5774e5ca
